Why does 23 people give 50% odds?
The paradox feels like a paradox because the question sounds like a different question. When someone asks "what are the odds two of us share a birthday?" the mind reaches for "what are the odds someone shares mine?" — and that really is a small number. The classic problem isn't asking that.
The classic problem counts pairs, not people. A room of 23 contains C(23, 2) = 253 possible pairs, and each pair is a fresh chance to match. Once you reframe the question as "is any one of those 253 pairs a match?" the threshold at 23 becomes intuitive: enough pairs, even with 365 possible dates, that a hit is more likely than a miss.
The right-hand panel is the "matches you" version, the one the intuition mishears. You'd need about 253 other people for a 50% chance one of them shares your specific date. Both 253s come from the same combinatorics — the binomial coefficient is the engine in both directions, just attached to different events.
The math, step by step
The cleanest path is to compute the opposite event — the probability that no two people share a birthday — and subtract from one. Four steps:
1. Why use the complement
"At least two share" is a union of overlapping pair-events: persons 1 and 2 match, persons 1 and 3 match, persons 2 and 3 match, and so on. Adding these probabilities directly double-counts the cases where more than one pair matches. "No two share" is a single, clean event — much easier to compute, and its complement is what we want.
2. Product form for "no shared"
Imagine adding people to the room one at a time. Person 1 takes some date; person 2 has 364 of 365 acceptable dates; person 3 has 363 of 365; the k-th person has 365 − k + 1 of 365 acceptable dates:
P(no shared) = (365/365) × (364/365) × (363/365) × … × ((365 − n + 1) / 365)
3. Rewrite as factorials
The numerator is a falling factorial: 365 × 364 × … × (365 − n + 1) = 365! / (365 − n)!. The denominator is 365 × 365 × … × 365 = 365ⁿ. Combine:
P(no shared) = 365! / [(365 − n)! · 365n]
P(at least two share) = 1 − 365! / [(365 − n)! · 365n]
4. Worked at n = 5
P(no shared) = (365 × 364 × 363 × 362 × 361) / 365⁵. Numerator = 6,303,659,184,720. Denominator = 365⁵ = 6,478,348,728,125. Ratio ≈ 0.9729, so P(at least two share, n = 5) ≈ 2.71% — the landmark figure for a 5-person room.
The same probability ladder is in the capsule above and the chart in the tool. The landmarks are n = 23 (first past 50%, 50.73%), n = 41 (first past 90%, 90.32%), n = 57 (first past 99%, 99.012%), and n = 70 (first past 99.9%, 99.916%). Past about n = 80 the curve flattens against the ceiling.
The full probability table, n = 1 to 100
Every value computed at request time from the formula above — no hard-coded ladder. Threshold rows (n = 23, 41, 57, 70) are bolded.
| n | P(shared) | n | P(shared) | n | P(shared) | n | P(shared) | n | P(shared) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.000% | 21 | 44.37% | 41 | 90.32% | 61 | 99.509% | 81 | 99.99331% |
| 2 | 0.274% | 22 | 47.57% | 42 | 91.40% | 62 | 99.591% | 82 | 99.99479% |
| 3 | 0.820% | 23 | 50.73% | 43 | 92.39% | 63 | 99.660% | 83 | 99.99596% |
| 4 | 1.64% | 24 | 53.83% | 44 | 93.29% | 64 | 99.719% | 84 | 99.99688% |
| 5 | 2.71% | 25 | 56.87% | 45 | 94.10% | 65 | 99.768% | 85 | 99.99759% |
| 6 | 4.05% | 26 | 59.82% | 46 | 94.83% | 66 | 99.810% | 86 | 99.99815% |
| 7 | 5.62% | 27 | 62.69% | 47 | 95.48% | 67 | 99.844% | 87 | 99.99859% |
| 8 | 7.43% | 28 | 65.45% | 48 | 96.06% | 68 | 99.873% | 88 | 99.99892% |
| 9 | 9.46% | 29 | 68.10% | 49 | 96.58% | 69 | 99.896% | 89 | 99.99918% |
| 10 | 11.69% | 30 | 70.63% | 50 | 97.04% | 70 | 99.916% | 90 | 99.99938% |
| 11 | 14.11% | 31 | 73.05% | 51 | 97.44% | 71 | 99.932% | 91 | 99.99953% |
| 12 | 16.70% | 32 | 75.33% | 52 | 97.80% | 72 | 99.945% | 92 | 99.99965% |
| 13 | 19.44% | 33 | 77.50% | 53 | 98.11% | 73 | 99.956% | 93 | 99.99973% |
| 14 | 22.31% | 34 | 79.53% | 54 | 98.39% | 74 | 99.965% | 94 | 99.99980% |
| 15 | 25.29% | 35 | 81.44% | 55 | 98.63% | 75 | 99.972% | 95 | 99.99985% |
| 16 | 28.36% | 36 | 83.22% | 56 | 98.83% | 76 | 99.978% | 96 | 99.99989% |
| 17 | 31.50% | 37 | 84.87% | 57 | 99.012% | 77 | 99.982% | 97 | 99.99992% |
| 18 | 34.69% | 38 | 86.41% | 58 | 99.166% | 78 | 99.986% | 98 | 99.99994% |
| 19 | 37.91% | 39 | 87.82% | 59 | 99.299% | 79 | 99.989% | 99 | 99.99995% |
| 20 | 41.14% | 40 | 89.12% | 60 | 99.412% | 80 | 99.99143% | 100 | 99.99996% |
Five columns × twenty rows = all 100 group sizes. Probabilities shown to the precision that distinguishes them: roughly 2 to 3 decimals through n ≈ 50, then 3 decimals near the 99% ceiling, and a truncated five-decimal form near n = 100 (where naively rounding would tip "100%" — the value is 99.99996%, not 100%).
Does the real world match?
The textbook calculation assumes 365 equally likely birth dates. Real U.S. births aren't quite uniform — September is a peak month and Christmas Day a dip, as the Birthday Rarity Calculator page documents in full. Does that clustering change the answer?
Slightly, and in one direction only. Using the actual U.S. date frequencies (1994–2014, FiveThirtyEight rarity dataset; February 29 set aside and the remaining 365 dates renormalised), the probability at n = 23 is ≈50.9% (200,000-trial seeded Monte Carlo) — about +0.17 percentage points higher than the 50.73% textbook figure.
That's not a fluke of one sample. A standard result in probability says a uniform distribution minimises the collision probability over any alphabet of the same size — Schur-convexity of the per-pair match rate, with a clean proof — so any real-world clustering can only push the figure up, never down. A September peak and a Christmas dip both make matching pairs slightly more likely than the textbook count, never less. The lift is small here because U.S. births are close to uniform; in a population with sharper clustering, the lift would be larger.
Variations on the problem
The classic problem is one event in a family. Two natural variants change the threshold dramatically — one upward (requiring three matches instead of two pushes it past 80), one downward (relaxing to "within a day" pulls it below 15). Both are computed in-house and dual-path verified at build time against the closed-form derivation.
At least three people share a birthday
The closed form sums over the number of dates that host an exact pair. With i such double-dates and n − 2i remaining singleton-dates among 365:
P(no triple) = Σi=0..⌊n/2⌋ [ C(365, i) · C(365−i, n−2i) · n! / 2i ] / 365n
| n people | P(at least 3 share) |
|---|---|
| 30 | 2.85% |
| 50 | 12.64% |
| 88 | 51.11% (first past 50%) |
| 100 | 64.59% |
The triple version needs about 3.8× more people than the pair version (88 vs 23) because the event requires a coincidence of coincidences — three matches against the same date, not just any pair. The threshold matches the published value in the Wikipedia "Birthday problem" extensions article.
Birthdays within one day of each other (near-match)
Variant definition: "within one calendar day, non-circular, 365-day year." Two people count as a near-match if their birthdays differ by 0 or 1 day, where days are integers from 1 to 365 on a linear calendar (Jan 1 and Dec 31 are NOT treated as adjacent). Other variants in the literature (within zero, circular calendar, 366 days) give different numbers — this page commits to one and labels it.
Counting ordered n-tuples of days with all pairwise gaps ≥ 2 gives n! · C(365 − n + 1, n) configurations, which simplifies to:
P(no two within 1 day) = (365 − n + 1)! / [(365 − 2n + 1)! · 365n]
| n people | P(any pair within 1 day) |
|---|---|
| 5 | 7.96% |
| 10 | 31.42% |
| 14 | 53.68% (first past 50%) |
| 20 | 80.39% |
Relaxing "exact match" to "within one day" collapses the threshold from 23 to 14 — each person effectively claims three calendar days instead of one, so the room fills up faster. Cited threshold: Wikipedia "Birthday problem — Near matches", originally Abramson and Moser (1970).
Where it matters: the birthday attack
The combinatorics shows up in cryptography under the name birthday attack. For a hash function with an n-bit output (so 2n possible values), a collision is expected after roughly 2n/2 attempts — the square root of the output space, not the full space.[1] The mechanism is the same one as the room of 23: every pair of hashes you've computed is a fresh chance to match, and C(k, 2) grows like k² / 2.
| Hash | Output bits | Naive search | Birthday bound |
|---|---|---|---|
| MD5 | 128 | ~2128 | ~264 |
| SHA-1 | 160 | ~2160 | ~280 |
| SHA-256 | 256 | ~2256 | ~2128 |
The real-world hook — SHAttered (2017). A team at Google and CWI Amsterdam (Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, Yarik Markov) published the first practical SHA-1 collision: two different PDF files that produce identical SHA-1 hashes. Reported attack cost: ~2^63.1, as the SHAttered team reported. The result ended SHA-1 for security use — browsers dropped SHA-1 certificate support, Git issued warnings on SHA-1 collisions, and TLS deprecated it.
Theoretical birthday bound is not "broken in practice"
The birthday bound is a floor on generic-attack cost — it assumes the attacker can only hash random inputs and look for collisions. Real hash functions have algebraic structure that cryptanalysts exploit, and the practical collision cost can be much lower than the birthday bound suggests:
- MD5 — birthday bound ~264. Chosen-prefix cryptanalytic collisions now found in seconds to minutes on commodity hardware — far below the 2^64 birthday floor.
- SHA-1 — birthday bound ~280. SHAttered (2017) found a collision at ~2^63.1, as the SHAttered team reported — well below the 2^80 birthday floor — using chosen-prefix cryptanalysis. Follow-up work (Leurent–Peyrin, 2020) brought the cost lower still.
- SHA-256 — birthday bound ~2128. No known practical attack beats the 2^128 birthday floor. Still considered secure for collision resistance.
The page's framing: distinguish the theoretical birthday bound (the easiest possible generic attack, the floor) from broken in practice (a specific cryptanalytic weakness in a specific hash that does much better than the birthday bound). Both numbers matter, and they answer different questions.
Standard references: the NIST Computer Security Resource Center glossary entry for "birthday attack", the Wikipedia "Birthday attack" article, and the SHAttered announcement (shattered.io).
[1] We use the standard 2n/2 approximation; a more exact treatment gives roughly 1.25× that, which doesn't change the security picture.
How many people guarantee a match?
367. The calendar has 366 possible dates (including February 29). The pigeonhole principle — n + 1 pigeons in n holes — says any group of 367 people must contain at least two who share a date. At 366 a perfect non-collision is still mathematically possible (everyone gets their own unique calendar slot); at 367 it isn't.
In practice the curve hits "near-certainty" much earlier than the guarantee — 99.9% by 70 people — which is the more practically useful number for, say, room-sizing intuitions.