This coding workshop focuses on building large language models (LLMs) from the ground up, starting with data preparation and ending with an instruction-fine-tuned LLM. It covers tokenizing text, coding the LLM architecture, pre-training a small LLM, loading pre-trained weights, and fine-tuning LLMs to follow instructions. The workshop uses PyTorch and tiktoken, and draws heavily on the presenter’s book, "Build a Large Language Model from Scratch." The presenter demonstrates how to tokenize text, convert tokens into IDs, and prepare data for LLM training, then shows how to load pre-trained weights from OpenAI and use LitGPT to work with larger, more capable LLMs. The workshop concludes with instruction fine-tuning and model evaluation on MMLU.