Masked Diffusions Are Multi-Step BERTs

How to build a mini masked diffusion model with BERT
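
As a rough illustration of the title's claim, here is a minimal sketch of masked-diffusion sampling as repeated mask filling. Everything in it is an assumption made for illustration: `toy_denoiser` is a hypothetical, hard-coded stand-in for a BERT-style masked language model, and `sample`, the equal-share unmasking schedule, and the confidence-based ordering are one possible choice, not necessarily the recipe this post will use.

```python
import math

MASK = "[MASK]"

def toy_denoiser(tokens):
    """Hypothetical stand-in for a BERT-style masked language model.

    Returns {position: (predicted_token, confidence)} for each masked
    position. A real model would derive these from its output logits;
    here they are hard-coded, so sequences are limited to 6 tokens.
    """
    target = ["the", "cat", "sat", "on", "the", "mat"]
    preds = {}
    for i, tok in enumerate(tokens):
        if tok == MASK:
            # Toy rule: confidence rises with each visible neighbour.
            visible = sum(
                1 for j in (i - 1, i + 1)
                if 0 <= j < len(tokens) and tokens[j] != MASK
            )
            preds[i] = (target[i], 0.5 + 0.2 * visible - 0.01 * i)
    return preds

def sample(length, num_steps, denoiser=toy_denoiser):
    """Start fully masked and fill in tokens over num_steps passes."""
    tokens = [MASK] * length
    history = [list(tokens)]
    for step in range(num_steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Assumed schedule: reveal an equal share of the remaining masks.
        k = math.ceil(len(masked) / (num_steps - step))
        preds = denoiser(tokens)  # one BERT-like forward pass
        # Commit the k most confident predictions; the rest stay masked.
        best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = preds[i][0]
        history.append(list(tokens))
    return tokens, history

if __name__ == "__main__":
    final, history = sample(length=6, num_steps=3)
    for state in history:
        print(" ".join(state))
```

With `num_steps=1` the loop reduces to a single mask-filling pass, which is ordinary BERT-style prediction; larger values spread the same operation over several passes, each conditioned on the tokens committed so far.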

TBD