Description
The Hundred-Page Language Models Book by Andriy Burkov, the follow-up to his bestselling The Hundred-Page Machine Learning Book (now in 12 languages), offers a concise yet thorough journey from language modeling fundamentals to the cutting edge of modern Large Language Models (LLMs). Within Andriy’s famous “hundred-page” format, readers will master both theoretical concepts and practical implementations, making it an invaluable resource for developers, data scientists, and machine learning engineers.
The Hundred-Page Language Models Book allows you to:
– Master the mathematical foundations of modern machine learning and neural networks
– Build and train three language model architectures in Python
– Understand and code a Transformer language model from scratch in PyTorch
– Work with LLMs, including instruction finetuning and prompt engineering
Written in a hands-on style with working Python code examples, this book progressively builds your understanding from basic machine learning concepts to advanced language model architectures. All code examples run on Google Colab, making it accessible to anyone with a modern laptop.
Language models have evolved from simple n-gram statistics to become one of the most transformative technologies in AI, rivaling only personal computers in their impact. This book spans the complete evolution, from count-based methods to modern Transformer architectures, delivering a thorough understanding of both how these models work and how to implement them.
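As a taste of the count-based methods the book starts from, here is a minimal bigram language model sketch; the toy corpus and function name are illustrative, not taken from the book:

```python
from collections import Counter, defaultdict

# Illustrative toy corpus (not from the book)
corpus = "the cat sat on the mat the cat ate".split()

# Count bigrams: how often word b follows word a
bigram_counts = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    bigram_counts[a][b] += 1

def next_word_probs(word):
    """Estimate P(next word | word) from raw bigram counts."""
    counts = bigram_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))  # "cat" is the most likely continuation
```

Modern Transformer models replace these raw counts with learned neural representations, but the underlying task, predicting the next word from context, is the same.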
The Hundred-Page Language Models Book takes a unique approach by introducing language modeling concepts gradually, starting with foundational methods before advancing to modern architectures. Each chapter builds upon the previous one, making complex concepts accessible through clear explanations, diagrams, and practical implementations.
Inside, you’ll find:
– Essential machine learning and neural network fundamentals
– Text representation techniques and basic language modeling
– Implementation of RNNs and Transformer architectures with PyTorch
– Practical guidance on finetuning language models and prompt engineering
– Important considerations on hallucinations and ways to evaluate models
– Additional resources for advanced topics through the book’s wiki
The complete code and additional resources are available through the book’s website.
Readers should have programming experience in Python. While familiarity with PyTorch and tensors is helpful, it’s not required. College-level math knowledge is beneficial, but the book presents mathematical concepts intuitively with clear examples and diagrams.
The book has been endorsed by prominent AI leaders, including an Internet pioneer and Turing Award recipient and the co-founders and CEOs of Dataiku and LlamaIndex, with more endorsements available on the book’s website.