by Michele Laurelli
Pre-training task predicting whether sentence B follows sentence A.
Binary classification: 50% consecutive, 50% random pairs. Originally used in BERT but later found less useful than MLM alone.
BERT pre-training task
Sentence relationship
Discourse understanding