1 "Attention Mechanisms" Post

How Large Language Models (LLMs) Handle Context Windows: The Memory That Isn't Memory

When you have a long conversation with an LLM such as ChatGPT or Claude, it feels like the model remembers everything you’ve discussed. It references earlier points, maintains consistent context, and seems to “know” what you talked about pages ago.

But here’s the uncomfortable truth: the model doesn’t remember anything. It’s not storing your conversation in memory the way a database would. Instead, it’s rereading the entire conversation from the beginning every single time you send a message.
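To make that concrete, here’s a minimal sketch in Python. The `generate` function is a hypothetical stand-in for any real LLM API call; the point is that the only “memory” is an ordinary list on the client side, rebuilt into a prompt and resent in full on every turn.

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return f"(model reply to a {len(prompt)}-character prompt)"

# The entire conversation state lives here, outside the model.
history = []

def send_message(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    # Flatten the WHOLE history into one prompt -- the model sees
    # everything from the first message onward, every single time.
    prompt = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    reply = generate(prompt)
    history.append({"role": "assistant", "content": reply})
    return reply

send_message("What is a context window?")
send_message("And why does it fill up?")  # resends turn 1 as well
```

Each call resends everything that came before, which is why long conversations grow more expensive with every turn: the model isn’t recalling the past, it’s rereading it.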


“A context window isn’t memory. It’s a performance where the model rereads its lines before every response.”

