Language Models

Scaling transformers for long futures (TBC)

This is an ongoing blog where I explore and improve my understanding of language models.

I’ll keep sharing interesting topics I discover about language models over time.

TBD

BPE tokenizer

Scaling laws

The FLOPs Calculus of Language Model Training

Will’s blog

Distributed Training

Prompting techniques

LLM Evaluations
