1 "Tokenization" Post

How Large Language Models (LLMs) Read Code: Seeing Patterns Instead of Logic

Oct 6, 2025 • Categories: AI & The Mathematics of Language, • Tags: #AI, #LLMs, #probability, #tokenization

Developers are accustomed to thinking about code in terms of syntax and semantics, the how and the why. Syntax defines what is legal; semantics defines what it means. A compiler enforces syntax with ruthless precision and interprets semantics through symbol tables and execution logic. But a Large Language Model (LLM), reads code the way a seasoned engineer reads poetry, recognizing rhythm, pattern, and context more than explicit rules.

“When an AI system ‘understands’ code, it is not executing logic; it is modeling probability.”

Tom Archer is a seasoned technical lead and engineering manager, as well as a software developer and technical author, with deep roots in Python, C++, Windows programming, and AI-driven automation. He has led teams and delivered software solutions for IBM, AT&T, Peachtree Software (Sage 500), and VeriSign. At Microsoft, Tom develops AI tools for technical documentation, drawing on decades of experience in both hands-on coding and guiding developer teams.

Tom also helped build and run CodeGuru.com , once the world’s largest Windows developer communities. He has been recognized by Microsoft as a Visual C++ MVP and has written ten programming books and hundreds of online articles.

While content in his current role at Microsoft, Tom believes there is always tomorrow to plan for. If you have a cool and interesting project, especially one that blends technology, creativity, and real-world impact on a fun team, he’s always excited to hear about it.

Connect with Tom today via LinkedIn .