Practical Notes on Transformer Implementation

A short technical note on building decoder-style transformers from scratch.