2 "Attention Mechanisms" Posts

The Discrete Mathematics Hiding Inside LLMs

A recent LinkedIn post from Michael Palmer described discrete mathematics as the foundation for how computers reason about problems. That thread got me thinking about just how many discrete math concepts show up inside systems that seem purely statistical. LLMs are usually described in terms of neural networks, gradient descent, and probability distributions. If you’ve taken discrete mathematics and wondered what it has to do with modern AI, the answer is: more than you’d expect.

Underneath the calculus and linear algebra, the same structures you learn in a discrete math course keep appearing: sets, predicate logic, Boolean operations, modular arithmetic, formal proof patterns. This post traces those connections.
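To make the connection concrete, here is a minimal sketch (hypothetical code, not drawn from any real LLM implementation) showing how a causal attention mask is just a relation on token positions, and how combining masks is a plain set intersection:

```python
# Hypothetical illustration: attention masks as discrete-math objects.

def causal_mask(n: int) -> list[list[bool]]:
    """Token i may attend to token j iff j <= i -- a relation on positions."""
    return [[j <= i for j in range(n)] for i in range(n)]

def visible_positions(mask: list[list[bool]], i: int) -> set[int]:
    """The set of positions token i can see -- a set comprehension."""
    return {j for j, allowed in enumerate(mask[i]) if allowed}

mask = causal_mask(4)
# Combining two masks is a set intersection: positions allowed by both.
non_padding = {0, 1, 2}  # suppose position 3 is a padding token
print(visible_positions(mask, 3) & non_padding)  # -> {0, 1, 2}
```

Real implementations express the same thing as Boolean tensors, but the underlying objects are the ones from a discrete math course: relations, sets, and Boolean operations.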


“Attention heads act like soft predicates over tokens. Masks are set operations. Chain-of-thought resembles proof structure.”


Read more →

How Large Language Models (LLMs) Handle Context Windows: The Memory That Isn't Memory

When you have a long conversation with a large language model (LLM) such as ChatGPT or Claude, it feels like the model remembers everything you’ve discussed. It references earlier points, maintains consistent context, and seems to “know” what you talked about pages ago.

But here’s the uncomfortable truth: the model doesn’t remember anything. It’s not storing your conversation in memory the way a database would. Instead, it’s rereading the entire conversation from the beginning every single time you send a message.
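The mechanics can be sketched in a few lines. This is a hypothetical stand-in for a chat client, not any real API, but the shape is the same: every user turn resends the entire transcript, and the model has no state between calls.

```python
# Hypothetical sketch: the model is stateless; the client resends everything.

def fake_model(messages: list[dict]) -> str:
    # Stand-in for an LLM call; real chat APIs also receive the full list.
    return f"(reply after reading all {len(messages)} messages)"

history: list[dict] = []

def send(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = fake_model(history)  # the WHOLE conversation, every single time
    history.append({"role": "assistant", "content": reply})
    return reply

send("Hi")               # the model reads 1 message
send("What did I say?")  # the model rereads all 3 messages from scratch
```

The "memory" lives entirely in the client-side `history` list; delete it, and the model has never heard of you.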


“A context window isn’t memory. It’s a performance where the model rereads its lines before every response.”


Read more →