Solution to fivethirtyeight Riddler’s puzzle, Can You Win The Riddler Football Playoff? (December 09, 2022).

Speaking of “football,” the Riddler Football Playoff (RFP) consists of four teams. Each team is assigned a random real number between 0 and 1, representing the “quality” of the team. If team A has quality a and team B has quality b, then the probability that team A will defeat team B in a game is a/(a+b).2

In the semifinal games of the playoff, the team with the highest quality (the “1 seed”) plays the team with the lowest quality (the “4 seed”), while the other two teams play each other as well. The two teams that win their respective semifinal games then play each other in the final.

On average, what is the quality of the RFP champion?

Monte Carlo simulation

sim_trial <- function() {
  x <- sort(runif(4, 0, 1))
  a <- x[4]
  b <- x[3]
  c <- x[2]
  d <- x[1]

  if (runif(1) < a / (a + d)) {
    sf_1 <- a
  } else {
    sf_1 <- d
  }

  if (runif(1) < b / (b + c)) {
    sf_2 <- b
  } else {
    sf_2 <- c
  }

  if (runif(1) < sf_1 / (sf_1 + sf_2)) {
    return(sf_1)
  } else {
    return(sf_2)
  }
}

Running the simulation for a number of trials:

trials <- 1e5

d <- tibble(quality = replicate(trials, sim_trial()))

The average is a good measure of central tendency across all these trials. I will also calculate the standard error and 95% confidence intervals for the sample.

avg <- mean(d$quality)
se <- sd(d$quality) / sqrt(trials)
ci <- c(avg - 1.96 * se, avg + 1.96 * se)

Answer

The answer to the puzzle is 0.6726557 bounded by [0.6712392, 0.6740722].

Additional Notes

I get a negatively skewed distribution for simulation.

d1 <- d %>%
  mutate(diff_quality = c(0, diff(quality))) |>
  pivot_longer(c(quality, diff_quality))

d1 |> ggplot(aes(value)) +
  geom_histogram(binwidth = 0.01) +
  facet_wrap(~name, scales = "free") +
  geom_vline(
    data = subset(d1, name == "quality"), # Subset data for specific facet
    aes(xintercept = mean(avg)),
    col = "red",
    linetype = "dashed"
  ) +
  scale_x_continuous(breaks = breaks_pretty(n = 10))

The trials are not correlated, as expected.

A better way to capture locality is to use a loess smoother (white solid), and this hovers around the answer (red dashed).

d |>
  mutate(trial = seq_along(quality)) |>
  head(10000) |>
  ggplot(aes(trial, quality)) +
  geom_point() +
  geom_line() +
  geom_hline(yintercept = avg, col = "red", linetype = "dashed", size = 2) +
  geom_smooth(method = "loess", se = TRUE, col = "white") +
  scale_y_continuous(breaks = breaks_pretty(n = 10)) +
  labs(x = "Trial", y = "Quality", title = "Trend of Quality over 10K Trials")