Sustainable Energy Systems
Dr. Kwame Asare
Module 5 · Lesson 2
Transformer architectures
In this lesson we build a small transformer from scratch in PyTorch: token embeddings, multi-head attention, and the residual stack, followed by a training loop on a tiny Swahili-English parallel corpus.
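To make the pieces named above concrete, here is a minimal sketch of how embeddings, multi-head self-attention, and a residual stack might fit together. It is not the reference notebook's code. The class names (`MultiHeadSelfAttention`, `Block`, `TinyEncoder`), the default sizes, the learned positional embeddings, and the pre-norm residual layout are all assumptions made for illustration. It covers only an unmasked encoder; the decoder, cross-attention, and training loop are left out.

```python
import math

import torch
import torch.nn as nn


class MultiHeadSelfAttention(nn.Module):
    """Scaled dot-product self-attention split across several heads."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # joint Q, K, V projection
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(z: torch.Tensor) -> torch.Tensor:
            # (batch, time, d_model) -> (batch, heads, time, d_head)
            return z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        weights = scores.softmax(dim=-1)  # each query's weights sum to 1
        mixed = weights @ v  # (batch, heads, time, d_head)
        mixed = mixed.transpose(1, 2).contiguous().view(b, t, d)
        return self.out(mixed)


class Block(nn.Module):
    """One layer: attention and feed-forward, each wrapped in a residual."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = MultiHeadSelfAttention(d_model, n_heads)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumption: pre-norm layout (normalize, transform, then add back).
        x = x + self.attn(self.norm1(x))
        x = x + self.ff(self.norm2(x))
        return x


class TinyEncoder(nn.Module):
    """Token + position embeddings followed by a stack of residual blocks."""

    def __init__(self, vocab_size: int, d_model: int = 64, n_heads: int = 4,
                 n_layers: int = 2, max_len: int = 128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        # Assumption: learned positions rather than sinusoidal encodings.
        self.pos = nn.Embedding(max_len, d_model)
        self.blocks = nn.ModuleList(
            [Block(d_model, n_heads) for _ in range(n_layers)]
        )
        self.norm = nn.LayerNorm(d_model)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        positions = torch.arange(ids.size(1), device=ids.device)
        x = self.tok(ids) + self.pos(positions)  # positions broadcast over batch
        for block in self.blocks:
            x = block(x)
        return self.norm(x)
```

Passing a batch of token ids with shape `(batch, time)` through `TinyEncoder` returns one `d_model`-sized vector per token, so `TinyEncoder(vocab_size=100)(torch.randint(0, 100, (2, 10)))` has shape `(2, 10, 64)`. A translation model would add a causally masked decoder with cross-attention on top of this.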
Resources
- Lecture slides (PDF)
- Reference notebook (.ipynb)
- Reading: Vaswani et al. — Attention Is All You Need