Solution to fivethirtyeight Riddler’s puzzle, Shared Birthdays (December 9, 2022).
Suppose people walk into a room, one at a time. Their birthdays happen to be randomly distributed throughout the 365 days of the year (and no one was born on a leap day). The moment two people in the room have the same birthday, no more people enter the room and everyone inside celebrates by eating cake, regardless of whether that common birthday happens to be today.
On average, what is the expected number of people in the room when they eat cake?
Extra credit: Suppose everyone eats cake the moment three people in the room have the same birthday. On average, what is this expected number of people?
import random
random.seed(123)
def simulate_data(trials, days, rep_birthdays):
lengths = []
for trial in range(1, trials+1):
x = []
for k in range(1, days + 2):
z = random.sample(range(1, days+1), 1)[0]
if x.count(z) == rep_birthdays - 1:
lengths.append(k)
break
else:
x.append(z)
return lengths
Running the simulation for a number of trials:
trials = int(1e5)
n_people = simulate_data(trials, 365, 2)
We can now calculate the expectation (mean) and the 95% confidence intervals for the sample.
import numpy as np
avg = np.mean(n_people)
se = np.std(n_people) / np.sqrt(trials)
ci = (avg - 1.96 * se, avg + 1.96 * se)
ci = [round(num, 4) for num in ci]
ci = "[" + ", ".join(map(str, ci)) + "]"
The answer to the puzzle is 24.7021 with a bound of [24.6266, 24.7775].
Extra credit:
trials = int(1e4)
select_matches = list(range(2, 10))
results = [np.mean(simulate_data(trials, 365, x)) for x in select_matches]
print("answer to extra credit: ", results[1])
## answer to extra credit: 88.6022
I am curious to see how this value grows as we add more “matches”.
results <- py$results
enframe(results, name = "select_matches", value = "results") |>
unnest() |>
ggplot(aes(select_matches, results)) +
geom_point() +
scale_x_continuous(n.breaks = 10) +
geom_hline(yintercept = 365, col = "red", linetype = "dashed") +
labs(
x = "Number of People With Same Birthday",
y = "Number of People Eating Cake"
)