LLM internals

How does an LLM work, briefly?

LLMs explained briefly in 8 minutes. Don't worry, we'll dive deeper in the coming days. What are parameters? How is an LLM roughly structured? How are LLMs trained? What are tokens? Large Language Models explained briefly https://www.youtube.com/watch?v=LPZh9BOjkQs

What are tokens? What does a transformer do with them, and what is a transformer anyway?

27-minute explanation of transformers: what are they, what are tokens, and how do the two work together? Great insight into why tokens matter and how they function. Transformers (how LLMs work) explained visually | DL5 https://www.youtube.com/watch?v=wjZofJX0v4M
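To make "tokens" a bit more concrete before watching, here is a minimal sketch of the pipeline from text to token IDs to vectors. It is a toy under loud assumptions: real LLMs use learned subword tokenizers (for example byte-pair encoding) rather than whitespace splitting, and their embedding vectors are learned during training rather than random. The names build_vocab, encode, and make_embeddings are made up for this illustration.

```python
import random


def build_vocab(text):
    """Assign an integer ID to each distinct whitespace-separated token (toy rule)."""
    vocab = {}
    for tok in text.lower().split():
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn text into a list of token IDs using the toy vocabulary."""
    return [vocab[tok] for tok in text.lower().split()]


def make_embeddings(vocab_size, dim, seed=0):
    """One vector per token ID. Random here; learned in a real model."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]


text = "the cat sat on the mat"
vocab = build_vocab(text)
ids = encode(text, vocab)
embeddings = make_embeddings(len(vocab), dim=4)

# The transformer never sees characters, only these vectors (one per token).
vectors = [embeddings[i] for i in ids]

print(ids)  # [0, 1, 2, 3, 0, 4]
```

Both occurrences of "the" map to the same ID and therefore start out with the same vector; it is the later layers (attention in particular) that let context change what each position ends up representing.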

"Attention is all you need", but what is attention for an LLM?

26 minutes on how an LLM uses "attention." How does it differ from "classical" deep learning, how does attention help analyze language, and what are the Key, Query, and Value layers? Great insight into where an LLM's understanding of language comes from. Attention in transformers, step-by-step | DL6 https://www.youtube.com/watch?v=eMlx5fFNoYc

Source of the quote: https://en.wikipedia.org/wiki/Attention_Is_All_You_Need , the paper by Google that marked the start of the transformer & LLM era.
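As a companion to the video, here is a minimal sketch of the Query/Key/Value idea as scaled dot-product attention, written in plain Python so every step is visible. Assumptions to keep in mind: the queries, keys, and values are hand-picked toy vectors (in a real model they are produced by learned weight matrices), there is only one attention head, and no causal mask is applied. The function names are made up for this illustration.

```python
import math


def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """For each query: score every key, softmax the scores, average the values."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # How well does this query match each key? Scaled by sqrt(d_k).
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy data: 2 queries attending over 3 key/value pairs, all 2-dimensional.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

outputs, weights = attention(queries, keys, values)
for w in weights:
    print([round(x, 3) for x in w])
```

Each printed row shows how much one query "attends" to each of the three positions. The first query matches the first and third keys equally well and the second key not at all, so the second position gets the smallest weight.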

Where does an LLM store its knowledge, and why can that lead to hallucinations?

22 minutes on the layers that sit between the attention blocks inside a transformer (the multilayer perceptrons). What do they do with "knowledge"? Why and how could an LLM store knowledge there, and why can this lead to hallucinations? Great insight into how "superposition" provides efficiency but can also lead to hallucinations. Also a good explanation of why LLMs scale so well with extra dimensions (and therefore parameters). How might LLMs store facts | DL7 https://www.youtube.com/watch?v=9-Jl0dxWQs8
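The geometric intuition behind superposition can be tried out numerically. The sketch below is an illustration under assumptions, not how any particular model is implemented: it draws random directions and measures how close to perpendicular they are to each other. With more directions than dimensions they cannot all be exactly perpendicular, but in a high-dimensional space they come close, which is one way to picture a model packing in more "features" than it has dimensions. The function names, the dimensions 8 and 512, and the count of 40 vectors are arbitrary choices for this example.

```python
import math
import random


def random_unit_vector(dim, rng):
    """A random direction: Gaussian sample normalized to length 1."""
    v = [rng.gauss(0, 1) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_abs_cosine(dim, n_vectors, seed=0):
    """Average |cosine similarity| over all pairs; 0 means exactly perpendicular."""
    rng = random.Random(seed)
    vecs = [random_unit_vector(dim, rng) for _ in range(n_vectors)]
    total, pairs = 0.0, 0
    for i in range(n_vectors):
        for j in range(i + 1, n_vectors):
            # Vectors have length 1, so the dot product is the cosine.
            total += abs(sum(a * b for a, b in zip(vecs[i], vecs[j])))
            pairs += 1
    return total / pairs


# 40 directions is more than 8 dimensions can keep perpendicular.
low = mean_abs_cosine(dim=8, n_vectors=40)
high = mean_abs_cosine(dim=512, n_vectors=40)

print(f"dim=8:   mean |cos| = {low:.3f}")
print(f"dim=512: mean |cos| = {high:.3f}")
```

The average overlap shrinks as the dimension grows, so the directions interfere less with each other. The small overlap that remains is a plausible picture of why packing facts this way is efficient yet not perfectly clean.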

Interactive

What are good (and bad) ways to integrate AI into an existing interface?

Interactive article about properly integrating AI into, for example, Gmail. AI Horseless Carriages https://koomen.dev/essays/horseless-carriages/

Can I interactively follow an LLM at work?

Interactive site to follow a few tokens through a toy LLM. LLM Visualization https://bbycroft.net/llm

LLM from scratch

Bare metal LLM from scratch in Python

https://www.youtube.com/watch?v=kCc8FmEb1nY

LLM Research

Does an AI model think in English?

Great, long Dutch-language article from Tweakers about research into how a language model "thinks." Does an AI model think in English? https://tweakers.net/reviews/13118/denkt-een-ai-model-in-het-engels-hoe-een-groot-taalmodel-van-binnen-werkt.html

Prompt engineering

How was the complex prompt behind that LLM GeoGuessr player developed, and how does it work?

https://newsletter.angularventures.com/p/ai-s-geoguessr-genius-and-the-art-of-prompting-well

Misc

https://www.gptaiflow.tech/assets/files/2025-01-18-pdf-1-TechAI-Goolge-whitepaper_Prompt Engineering_v4-af36dcc7a49bb7269a58b1c9b89a8ae1.pdf

https://www.llama.com/docs/how-to-guides/prompting/