This is the introductory lecture for the course "CS336: Language Models from Scratch," co-taught by Percy Liang and Tatsu Hashimoto. The lecture introduces the core staff, including the TAs, and outlines the course's goals, chief among them a foundational understanding of language models gained by building them from scratch. It addresses the growing disconnect between researchers and the underlying technology caused by reliance on proprietary models. The lecture also discusses the challenges of scale in language models, the importance of efficiency, and the three kinds of knowledge the course aims to impart: mechanics, mindset, and intuitions. The course covers tokenization, model architecture, training, systems optimization, scaling laws, data curation, and alignment, with a focus on maximizing efficiency under given hardware and data constraints. The lecture concludes with a detailed overview of tokenization, covering character-based, byte-based, word-based, and byte pair encoding (BPE) methods.
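Since the lecture closes with BPE, a minimal sketch of the BPE training loop may help fix the idea: repeatedly count adjacent token pairs in the corpus and merge the most frequent pair into a new token. This is an illustrative Python sketch, not the course's reference implementation; it starts from characters rather than raw bytes for readability, and the function and variable names are hypothetical.

```python
# A minimal sketch of BPE (byte pair encoding) training on a toy corpus.
# Not the course's reference implementation; names are illustrative.
from collections import Counter

def train_bpe(text: str, num_merges: int) -> list[tuple[str, str]]:
    """Learn up to `num_merges` merge rules, most frequent pair first."""
    # Start from a character-level sequence; real tokenizers start from bytes.
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair in the current token sequence.
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        # Replace every occurrence of the best pair with one merged token.
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == best:
                merged.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return merges

if __name__ == "__main__":
    # Frequent pairs such as ("t", "h") tend to be merged first.
    rules = train_bpe("the cat in the hat sat on the mat", num_merges=5)
    print(rules)
```

Applying the learned merge rules in order to new text reproduces the same segmentation, which is why BPE sits between the character-based and word-based extremes the lecture compares.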