AI Blog

by Michele Laurelli

Masked Language Modeling

Definition

A pre-training task in which random tokens in the input are masked and the model must predict them from the surrounding context.

This is BERT's pre-training objective. 15% of input tokens are selected for prediction; of these, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged. Because the model sees the full sequence on both sides of each masked position, it learns bidirectional context rather than left-to-right context only.
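A minimal sketch of that 80/10/10 masking procedure (the mask_tokens helper and toy vocabulary are illustrative, not BERT's actual tokenizer-level implementation):

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15):
    """BERT-style masking: select ~15% of positions; of those, replace
    80% with [MASK], 10% with a random token, and leave 10% unchanged."""
    masked = list(tokens)
    labels = [None] * len(tokens)   # None = position excluded from the loss
    for i, token in enumerate(tokens):
        if random.random() < mask_prob:
            labels[i] = token       # model must recover the original token
            r = random.random()
            if r < 0.8:
                masked[i] = MASK_TOKEN            # 80%: replace with [MASK]
            elif r < 0.9:
                masked[i] = random.choice(vocab)  # 10%: random token
            # remaining 10%: keep the original token unchanged
    return masked, labels

# Toy usage with a tiny vocabulary.
tokens = "the cat sat on the mat".split()
masked, labels = mask_tokens(tokens, vocab=["dog", "ran", "blue", "table"])
print(masked, labels)
```

The loss is computed only at the selected positions (the non-None labels above), so the model is never penalized for the unmasked tokens it simply copies through.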

Examples

1. BERT pre-training
2. Cloze task
3. Bidirectional language understanding
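To try the cloze-style task with a pretrained model, one option is Hugging Face's fill-mask pipeline (a sketch assuming the transformers library is installed; bert-base-uncased is a standard checkpoint):

```python
from transformers import pipeline

# A pretrained BERT ranks candidate tokens for the masked position.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for pred in fill_mask("The capital of France is [MASK]."):
    print(f"{pred['token_str']:>10}  score={pred['score']:.3f}")
```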

Michele Laurelli - AI Research & Engineering