4 "Probability" Posts

The Birthday Paradox in Production: When Random IDs Collide

You generate a UUID. It’s 128 bits total, with 122 bits of randomness. That’s 340 undecillion possible values. Collision-proof, right? Your system generates a million IDs per second. Still safe? What about a billion?

As I like to say, common sense and intuition are the enemies of science. Common sense tells you that with 340,000,000,000,000,000,000,000,000,000,000,000,000 possible values, you’d need to generate at least trillions before worrying about duplicates. Maybe fill 1% of the space? 10%?

Math shows us the uncomfortable truth: you’ll hit a 50% collision probability after generating just \(2.7 \times 10^{18}\) IDs. That’s 0.0000000000000000008% of your total space. At a billion IDs per second, you’ve got about 86 years. Comfortable, but not infinite. Drop to 64-bit IDs? Even at a million IDs per second, you’ve got about 1.4 hours. Just enough time to duck out for a long lunch and return to a disaster. And 32-bit at a billion per second? 77 microseconds. Faster than you can blink.
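A few lines of Python make those numbers easy to check. This is my own sketch, not code from the post; it simply applies the standard birthday bound to each ID width at the rates quoted above.

```python
import math

# For an ID space of size N = 2**bits, the birthday bound puts the
# 50% collision point at roughly 1.1774 * sqrt(N) random draws.
def ids_until_even_odds(bits: int) -> float:
    return 1.1774 * math.sqrt(2 ** bits)

# Rates mirror the paragraph above: 122 random bits at a billion IDs per
# second, 64 bits at a million, 32 bits back at a billion.
for bits, rate in [(122, 1e9), (64, 1e6), (32, 1e9)]:
    n = ids_until_even_odds(bits)
    print(f"{bits}-bit space at {rate:,.0f} IDs/s: "
          f"{n:.2e} IDs, ~{n / rate:.3g} seconds to even odds")
```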

You might know the birthday paradox: among just 23 people, there is a better-than-50% chance that two of them share a birthday. What you may not know is that this isn’t just a party trick; it’s the same mathematics that determines when your “guaranteed unique” database IDs collide, why hash tables need careful sizing, and when your distributed system’s assumptions break.
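The 23-person claim itself takes only a few lines to verify (again my own sketch, assuming 365 equally likely birthdays):

```python
# Probability that at least two of n people share a birthday,
# assuming 365 equally likely days and ignoring leap years.
def p_shared_birthday(n: int) -> float:
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (365 - k) / 365
    return 1.0 - p_all_distinct

print(p_shared_birthday(23))  # ≈ 0.507, just past even odds
```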


“In a room of 23 people, there’s a greater than 50% chance two share a birthday. In your database, collisions arrive far sooner than intuition suggests.”


Read more →

How Large Language Models (LLMs) Read Code: Seeing Patterns Instead of Logic

Developers are accustomed to thinking about code in terms of syntax and semantics, the how and the why. Syntax defines what is legal; semantics defines what it means. A compiler enforces syntax with ruthless precision and interprets semantics through symbol tables and execution logic. But a Large Language Model (LLM) reads code the way a seasoned engineer reads poetry, recognizing rhythm, pattern, and context more than explicit rules.
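To make “modeling probability” concrete, here is a minimal sketch. The choice of the Hugging Face transformers library and the small gpt2 checkpoint is mine, purely for illustration; the point is that the model never executes the snippet below, it only scores which token is likely to come next.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A half-finished line of code: the model sees a token sequence, not an AST.
prompt = "for i in range(10):\n    print("
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the next token only
probs = torch.softmax(logits, dim=-1)

# The five most probable continuations: the "understanding" is a distribution.
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx)!r}: {p.item():.3f}")
```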


“When an AI system ‘understands’ code, it is not executing logic; it is modeling probability.”


Read more →

The Five-Second Rule Explored with Math & Python

You know the story: drop a cookie on the kitchen floor, swoop in before five seconds are up, and declare it safe. It is comforting. It is also wrong.


“Germs don’t wait five seconds. They start the party the instant your food hits the floor.”


The truth is much more interesting than the myth. Germs do transfer gradually, but they are especially fast at the beginning. That means if you want to know whether your floor-cookie is still edible, you need to think in curves, not in timers. And curves are something we can model.
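As a taste of what that modeling looks like, here is a toy sketch with made-up constants (illustrative only, not the post’s actual model): assume transfer follows first-order kinetics, \(N(t) = N_{\max}(1 - e^{-kt})\), which rises steepest in the very first moments of contact.

```python
import numpy as np

# Toy model: first-order transfer kinetics with hypothetical constants.
N_MAX = 1_000_000   # bacteria available on the contact patch (made up)
K = 0.8             # transfer-rate constant, per second (made up)

def transferred(t_seconds: float) -> float:
    """Bacteria picked up after t seconds of floor contact."""
    return N_MAX * (1.0 - np.exp(-K * t_seconds))

for t in (0.5, 1, 3, 5, 10):
    print(f"{t:>4} s: {transferred(t):>9,.0f} bacteria")
```

With these made-up numbers, more than half of the eventual transfer has already happened by the one-second mark; the clock was never the point.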

Read more →

Should You Walk or Run in the Rain? The Puzzle That Sparked a Passion

To walk or to run. That is the question. Early in my programming career, I came across a coding challenge that stuck with me for many years: “If it’s raining, will you stay drier by walking or running through it?” At the time, I didn’t have the skillset or tools to simulate the problem properly. It became one of the first exercises that nudged me toward a lifelong fascination with modeling the real world through code.
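For a taste of the modeling, here is the classic back-of-the-envelope version of the puzzle (a toy sketch with made-up numbers, not the post’s simulation): in vertical rain, what hits your front depends only on the distance you cover, while what lands on your head grows with the time you spend exposed.

```python
# Toy model: vertical rain, a box-shaped pedestrian, hypothetical numbers throughout.
DROPS_PER_M3 = 100.0   # raindrop density
RAIN_SPEED = 9.0       # raindrop fall speed, m/s
TOP_AREA = 0.1         # horizontal cross-section (head and shoulders), m^2
FRONT_AREA = 0.7       # vertical cross-section (chest and legs), m^2
DISTANCE = 100.0       # metres to cover

def drops_collected(speed_m_s: float) -> float:
    time_exposed = DISTANCE / speed_m_s
    from_above = DROPS_PER_M3 * RAIN_SPEED * TOP_AREA * time_exposed
    from_front = DROPS_PER_M3 * FRONT_AREA * DISTANCE  # independent of speed
    return from_above + from_front

for label, speed in (("walking", 1.4), ("running", 5.0)):
    print(f"{label:>7} at {speed} m/s: {drops_collected(speed):,.0f} drops")
```

In this simplified picture running always comes out ahead, because only the from-above term depends on how long you stay out in the rain.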

Read more →