Reasoning in LLMs
Description
In this workshop we will learn what large language ‘reasoning’ models are and how to build them. Following the releases of OpenAI’s o1 models, the Gemini Thinking models, and the recent Grok 3 models, so-called ‘thinking’ or ‘reasoning’ LLMs have become the latest buzz in the AI community. The DeepSeek family of models recently attracted a lot of attention, and both the models and their training recipes have been open sourced. During the workshop we will demystify how these models are trained and use open-source models and frameworks to replicate some of the concepts used in DeepSeek’s training.
We will start with a model that is not particularly good at math and, using publicly available models, datasets, and frameworks, apply supervised fine-tuning (SFT) and reinforcement learning methods to train a reasoning model capable of solving math word problems.
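To give a flavour of the SFT step, here is a minimal sketch using the Hugging Face TRL library; the GSM8K dataset and the Qwen base model are placeholder assumptions, not necessarily the exact choices used in the workshop.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Math word-problem dataset (GSM8K assumed here for illustration).
dataset = load_dataset("openai/gsm8k", "main", split="train")

def to_text(example):
    # Turn question + step-by-step answer into a single training text.
    return {"text": f"Question: {example['question']}\nAnswer: {example['answer']}"}

dataset = dataset.map(to_text)

config = SFTConfig(
    output_dir="sft-math-reasoner",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    report_to="wandb",  # log training curves to Weights & Biases
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # assumed small base model, easy to run on Colab
    args=config,
    train_dataset=dataset,
)
trainer.train()
```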
We will learn what reasoning models are and the theory behind them, explain how and why Chain-of-Thought works, and cover how to preprocess data, evaluate models, leverage techniques like quantization and LoRA to train big models with a very small resource footprint (for the GPU-poor), track experiments, and push models to the Hugging Face Hub, so you will ‘take home’ your own reasoning LLM!
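As a preview of the GPU-poor techniques mentioned above, the sketch below loads a model in 4-bit with bitsandbytes and attaches LoRA adapters with peft; the model name, LoRA hyperparameters, and Hub repository name are illustrative assumptions rather than the workshop’s exact settings.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model in 4-bit so it fits on a single Colab GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",  # assumed base model
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable LoRA adapters; the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# After training, only the lightweight adapter needs to be pushed to the Hub.
# model.push_to_hub("your-username/my-reasoning-llm")  # hypothetical repo name
```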
Preparations and requirements
- Activate Google Colab Pro at least a day before the workshop (costs around 10 EUR + VAT)
- Create an account at www.wandb.com - we will use this to track model training and experiments
- Create an account at www.huggingface.com - we will use this to upload the models we train
- Create a GitHub account if you do not have one yet
Speaker
