Learn Fine-Tuning — A Hands-On Course

19 lessons across 9 modules. Self-contained. CPU or GPU. From transformers to RLHF, DPO, LoRA, QLoRA, and production deployment.

Open source • MIT-spirit educational

At a glance

Format	19 lessons, 9 modules — each lesson in three formats (.md + .py + .ipynb)
Stack	PyTorch, Hugging Face Transformers, PEFT, TRL, Accelerate, Datasets, Evaluate
Hardware	Runs on a laptop CPU; GPU optional for speed
Models used	distilbert-base-uncased, bert-base-uncased, gpt2, t5-small (60M–124M parameters)
Prerequisites	Python 3.10+, basic neural network knowledge
License	MIT-spirit educational

What the course covers

The course is organised into nine modules that build on each other, progressing from foundational concepts through advanced techniques to production deployment.

Module 1 — Foundations. Builds the mental model: how transformer architectures work, what tokenisation actually does (BPE and WordPiece), and how the Hugging Face ecosystem fits together. By the end of this module the abstractions you’ll lean on for the rest of the course are no longer black boxes.

Module 2 — Transfer Learning. The two fundamental approaches sit side by side here: freeze the backbone and train a head (feature extraction), or unfreeze everything and use discriminative learning rates, warmup, and cosine schedules (full fine-tuning). Catastrophic forgetting and its mitigation are covered explicitly.
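
To make the scheduling vocabulary concrete, here is a minimal sketch of a warmup-plus-cosine learning-rate multiplier and a layer-wise (discriminative) learning-rate table. The function names, step counts, and decay factor are illustrative assumptions, not taken from the course code.

```python
import math


def lr_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Scale factor applied to the base learning rate at a given step."""
    if step < warmup_steps:
        # Linear warmup: ramp from 0 up to 1.
        return step / max(1, warmup_steps)
    # Cosine decay: fall smoothly from 1 down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def layer_lrs(top_lr: float, num_layers: int, decay: float) -> list[float]:
    """Discriminative rates: the top layer gets top_lr, lower layers get less."""
    return [top_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


# Illustrative values only: 10 warmup steps out of 100, four layers, decay 0.9.
for step in (0, 5, 10, 55, 100):
    print(step, round(lr_multiplier(step, warmup_steps=10, total_steps=100), 3))
print(layer_lrs(top_lr=2e-5, num_layers=4, decay=0.9))
```

The lower layers hold the most general features, so giving them smaller rates is one common way to limit catastrophic forgetting while the head adapts quickly.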

Module 3 — Supervised Fine-Tuning. Instruction tuning with Alpaca and ChatML formats using TRL’s SFTTrainer, response-only loss masking, and continued pretraining for domain adaptation — measured properly via perplexity before and after.
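
Response-only loss masking is easier to see than to describe. The sketch below builds labels by hand, assuming the usual convention that a label of -100 is skipped by PyTorch's cross-entropy loss. The helper name and the token ids are hypothetical, and this is not how SFTTrainer implements it internally.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def mask_prompt_labels(prompt_ids: list[int], response_ids: list[int]) -> dict[str, list[int]]:
    """Build input_ids and labels so only response tokens contribute to the loss."""
    input_ids = prompt_ids + response_ids
    # Prompt positions are masked out; response positions keep their token ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return {"input_ids": input_ids, "labels": labels}


# Hypothetical token ids standing in for a tokenised prompt and response.
example = mask_prompt_labels(prompt_ids=[101, 7592, 2129], response_ids=[2024, 2017, 102])
print(example["labels"])  # [-100, -100, -100, 2024, 2017, 102]
```

The model still reads the prompt as context, but gradients come only from the tokens it is supposed to produce.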

Module 4 — Parameter-Efficient Fine-Tuning. The PEFT family, end to end. The maths behind LoRA’s low-rank decomposition, rank vs. target-module trade-offs, adapter save/load workflows, and then QLoRA on top — 4-bit NF4 quantisation, double quantisation, and a proper memory-savings analysis.
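
The low-rank idea comes down to simple arithmetic: instead of updating a full d_out × d_in weight matrix, LoRA trains two thin matrices, B (d_out × r) and A (r × d_in), whose product is added to the frozen weight. A rough sketch of the parameter saving, assuming a 768 × 768 projection (the hidden size of the base models listed above) and a few illustrative ranks:

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # updating W directly
    lora = rank * (d_out + d_in)   # B is d_out x r, A is r x d_in
    return full, lora


# Ranks chosen for illustration; 768 is the hidden size of gpt2 and bert-base.
for r in (4, 8, 16):
    full, lora = lora_param_counts(768, 768, r)
    print(f"r={r:>2}: {lora:>6,} of {full:,} parameters ({100 * lora / full:.2f}%)")
```

At rank 8 that is 12,288 trainable parameters against 589,824 for the full matrix, roughly 2%. Doubling the rank doubles the adapter size, which is the trade-off the module explores.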

Module 5 — Prompt Tuning & Few-Shot. Soft prompts via PromptTuningConfig and virtual token embeddings, with PCA visualisation of what the model actually learns. Few-shot and in-context learning sit alongside, covering zero-shot through few-shot progression and example-selection strategies.
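
The zero-shot to few-shot progression amounts to how many worked examples are placed in front of the query. A minimal sketch, using a made-up sentiment task and a hypothetical `build_prompt` helper:

```python
def build_prompt(task: str, examples: list[tuple[str, str]], query: str, k: int) -> str:
    """Assemble a k-shot prompt; k=0 gives a zero-shot prompt."""
    lines = [task, ""]
    for text, label in examples[:k]:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    # The query uses the same template, with the answer left open.
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


# Invented demonstration data, for illustration only.
demos = [
    ("Loved every minute of it.", "positive"),
    ("A dull, lifeless film.", "negative"),
]
task = "Classify the sentiment of each review as positive or negative."

print(build_prompt(task, demos, "Surprisingly good.", k=0))
print("---")
print(build_prompt(task, demos, "Surprisingly good.", k=2))
```

Example selection then becomes a question of which items go into `examples` and in what order, with no change to the model weights.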

Module 6 — Alignment. The two techniques behind modern aligned models. RLHF reward modelling from pairwise preferences (Bradley-Terry), and DPO — the direct-preference rearrangement that bypasses the reward model — via TRL’s DPOTrainer, with beta tuning and the implicit reward signal explained.
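
Both objectives fit in a few lines of plain Python. The sketch below works on single scalar values rather than batches of tensors. The log-probabilities are invented for illustration, and beta=0.1 is just a commonly used starting value, not a recommendation from the course.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bradley_terry_prob(reward_chosen: float, reward_rejected: float) -> float:
    """Probability that the chosen response beats the rejected one."""
    return sigmoid(reward_chosen - reward_rejected)


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single preference pair."""
    # Implicit rewards: beta times the log-ratio of policy to frozen reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # The same Bradley-Terry likelihood, applied to the implicit rewards.
    return -math.log(sigmoid(chosen_reward - rejected_reward))


# Invented sequence log-probabilities for one preference pair.
print(bradley_terry_prob(reward_chosen=1.5, reward_rejected=0.5))
print(dpo_loss(-4.0, -8.0, ref_chosen_logp=-5.0, ref_rejected_logp=-7.0))
```

When the policy equals the reference, both implicit rewards are zero and the loss is log 2. It falls as the policy shifts probability towards the chosen response relative to the reference, and beta controls how sharply.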

Module 7 — Data Engineering. Building and curating instruction datasets is where most real fine-tuning projects live or die. Format conversion across Alpaca / ShareGPT / ChatML, MinHash deduplication, quality filtering, JSONL pipelines, and multi-turn handling.
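
As a rough illustration of MinHash deduplication, the standard-library sketch below compares word-trigram signatures and keeps the first copy of each near-duplicate. The shingle size, the 64 hash functions, and the 0.8 threshold are assumptions chosen for readability. A real pipeline would normally add locality-sensitive hashing to avoid comparing every pair.

```python
import hashlib


def shingles(text: str, n: int = 3) -> set[str]:
    """Word n-grams of a lowercased text."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def minhash_signature(text: str, num_perm: int = 64) -> list[int]:
    """For each salted hash function, keep the minimum hash over all shingles."""
    items = [s.encode("utf-8") for s in shingles(text)]
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")  # a different salt acts as a different hash function
        signature.append(min(
            int.from_bytes(hashlib.blake2b(item, digest_size=8, salt=salt).digest(), "big")
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching positions approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def deduplicate(records: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a record only if it is not too similar to any record already kept."""
    kept, kept_sigs = [], []
    for text in records:
        sig = minhash_signature(text)
        if all(estimated_jaccard(sig, other) < threshold for other in kept_sigs):
            kept.append(text)
            kept_sigs.append(sig)
    return kept


# Invented instruction strings for illustration.
data = [
    "Translate the sentence into French please",
    "Translate the sentence into French please",
    "Summarise the following article in one line",
]
print(deduplicate(data))
```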

Module 8 — Evaluation. Measuring what matters: accuracy, precision, recall, F1, and confusion matrices for classification; ROUGE-1/2/L, BERTScore, and perplexity for generation; plus a section on when human evaluation is the only honest answer.
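
Two of these metrics are short enough to write out by hand, which helps when reading library output later. This is a plain-Python sketch with invented labels and loss values. In practice the Evaluate and scikit-learn packages from the stack above would do this work.

```python
import math


def precision_recall_f1(y_true: list[int], y_pred: list[int], positive: int = 1) -> tuple[float, float, float]:
    """Binary precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def perplexity(token_nlls: list[float]) -> float:
    """Exponential of the mean per-token negative log-likelihood (natural log)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


# Invented labels and per-token losses for illustration.
print(precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
print(perplexity([2.1, 1.7, 2.4]))
```

A perplexity of 4 means the model is, on average, as uncertain as if it were choosing uniformly among four tokens at each position.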

Module 9 — Production. Shipping the model: save_pretrained and from_pretrained, LoRA adapter save/load, merge_and_unload, the safetensors format, Hub publishing with proper model cards, and inference optimisation through torch.compile, INT8 quantisation, batching, ONNX Runtime, and an overview of vLLM and TGI for serving.

Design principles

Self-contained. No API keys, no cloud accounts, no external datasets. Every script generates its own synthetic data, auto-detects the device (CUDA → MPS → CPU), and sets seeds for reproducibility. Clone, install, run. The friction between deciding to learn and seeing a result is as close to zero as the tooling allows.
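
The device fallback and seeding can be sketched as follows. The helper names are hypothetical and the course scripts may structure this differently. The import guard is only there so the snippet runs before the dependencies are installed.

```python
import random

try:
    import torch
except ImportError:  # lets the sketch run even without PyTorch installed
    torch = None


def pick_device() -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to CPU."""
    if torch is None:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def set_seed(seed: int = 42) -> None:
    """Seed Python's and PyTorch's random number generators."""
    random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)


set_seed(42)
print("Using device:", pick_device())
```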

Concept first, code second. Every lesson opens with theory and intuition — the maths, the trade-offs, the analogies, the ASCII diagrams — and only then introduces the code. The goal is to understand why a technique works, not just how to call its API.

Three formats per lesson. Each lesson exists as a richly annotated .md explanation, a clean .py script with WHY-focused inline comments, and a ready-to-run .ipynb notebook. Read for understanding, run cell-by-cell for exploration, edit the script when you’re ready to make it your own.

Small models, real patterns. All lessons use models in the 60M–124M parameter range — distilbert-base-uncased, bert-base-uncased, gpt2, t5-small — so every lesson runs comfortably on a laptop. The patterns transfer directly to 7B, 13B, and 70B+ models; the only thing that changes is the VRAM bill.

How each lesson works

Every lesson lives in its own folder containing three files that reinforce the same material differently: lesson_XX_topic.md for the deep theory explanation, lesson_XX_topic.py for the clean executable script, and lesson_XX_topic.ipynb for the cell-by-cell notebook. The recommended workflow is read → run → experiment: read the markdown to internalise the intuition, run the notebook to see the concepts come alive with real model outputs, then change something — a hyperparameter, the model, the synthetic data — and watch what moves.

Get started

# 1. Clone the repo
git clone https://github.com/PowerAI-Labs/Learn-Fine-Tuning-A-Hands-On-Course.git
cd Learn-Fine-Tuning-A-Hands-On-Course

# 2. Create a virtual environment
python -m venv venv
# Activate it — Windows:
venv\Scripts\activate
# Activate it — macOS / Linux:
source venv/bin/activate

# 3. Install core dependencies (all lessons)
pip install torch transformers datasets peft trl accelerate evaluate scikit-learn matplotlib rouge-score

# 4. Optional — quantisation & ONNX (lessons 10 and 19)
pip install bitsandbytes optimum onnxruntime

# 5. Open the first lesson (requires Jupyter; if it is missing, run: pip install notebook)
cd lessons/module_01_foundations/lesson_01_introduction
jupyter notebook lesson_01_introduction.ipynb

Every script generates its own synthetic data, auto-detects the device (CUDA → MPS → CPU), sets the random seed to 42, and prints step-by-step output as it runs.

Acknowledgements

This course stands on the shoulders of the Hugging Face ecosystem (Transformers, Datasets, PEFT, TRL, Accelerate, Evaluate, Hub), the PyTorch project, and roughly thirty research papers spanning Vaswani et al.’s Attention Is All You Need through Rafailov et al.’s DPO. The full reading list — papers, libraries, and the tutorials and courses that shaped the lessons — lives in the Acknowledgements section of the repo README.

Repository: github.com/PowerAI-Labs/Learn-Fine-Tuning-A-Hands-On-Course · Licensed under MIT-spirit educational use.
