Data Science · May 16, 2026

We Recomputed Our IMDb Correlation Three Ways. At the Show Level, It's Negative.

Last month we ran a quick correlation of episode-level IMDb ratings against the Humor Index (HI) across three shows and reported a pooled r = -0.005. The headline at the time was "audience ratings and comedy craft are essentially unrelated." Comments asked what that meant. We dug back in, this time across all eight scored shows (1,105 episodes), and the three-level answer is much more interesting than the one-line headline.

Three findings, going in the same direction.

Finding 1: Top-10 episode overlap is at chance

For each show, take HI's top 10 episodes and IMDb's top 10 episodes. Count how many appear in both lists.

| Show | HI top-10 ∩ IMDb top-10 | HI top-20 ∩ IMDb top-20 |
|---|---|---|
| The Office | 0 / 10 | 4 / 20 |
| Seinfeld | 0 / 10 | 1 / 20 |
| Friends | 2 / 10 | 3 / 20 |
| Parks and Recreation | 1 / 10 | 5 / 20 |
| Arrested Development | 2 / 10 | 7 / 20 |
| Schitt's Creek | 1 / 10 | 2 / 20 |
| 30 Rock | 0 / 10 | 3 / 20 |
| Taxi | 0 / 10 | 5 / 20 |
| Pooled | 6 / 80 (7.5%) | 30 / 160 (18.8%) |

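Two independent random picks of 10 episodes from a show of about 100 would share about one episode on average, or roughly 10% overlap; for picks of 20 the baseline is roughly 20%. Our shows run from 80 to 235 episodes, so the exact baseline differs by show, but the pooled 7.5% and 18.8% sit right around it. HI and IMDb agree on "the best episodes" at chance level. Four shows (The Office, Seinfeld, 30 Rock, and Taxi) share no titles at all in their top 10, and Seinfeld shares only one in its top 20.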

Arrested Development has the strongest agreement (2/10, 7/20), consistent with its also being the only show with a moderate per-episode correlation. We'll come back to AD.
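The "at chance" baseline in this finding can be made concrete. The sketch below assumes the two top-k lists behave like independent random draws from the same show, in which case the expected overlap is the hypergeometric mean k²/N. The episode counts are the per-show N values reported in this post; the function names are ours, for illustration only.

```python
# Episodes scored per show (the N values reported in this post).
EPISODES_PER_SHOW = {
    "Arrested Development": 84,
    "Parks and Recreation": 98,
    "Taxi": 114,
    "The Office": 186,
    "30 Rock": 138,
    "Friends": 235,
    "Seinfeld": 170,
    "Schitt's Creek": 80,
}


def expected_overlap(n_episodes: int, k: int) -> float:
    """Expected number of shared items between two independent random
    k-subsets of n_episodes items (mean of a hypergeometric draw)."""
    return k * k / n_episodes


def pooled_expected(k: int) -> float:
    """Sum of the per-show chance overlaps for top-k lists."""
    return sum(expected_overlap(n, k) for n in EPISODES_PER_SHOW.values())


# Observed pooled overlaps are taken from the table above.
for k, observed in ((10, 6), (20, 30)):
    slots = k * len(EPISODES_PER_SHOW)
    exp = pooled_expected(k)
    print(f"top-{k}: expected {exp:.1f} of {slots} ({exp / slots:.1%}), observed {observed}")
```

Under that independence assumption, chance alone gives about 6.6 of 80 shared top-10 titles and about 26.5 of 160 shared top-20 titles, against the observed 6 and 30.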

Finding 2: Per-show correlations are a mixed-sign cluster

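| Show | N | HI mean | IMDb mean | Pearson r | Spearman ρ |
|---|---|---|---|---|---|
| Arrested Development | 84 | 81.92 | 8.02 | +0.392 | +0.379 |
| Parks and Recreation | 98 | 78.32 | 7.99 | +0.201 | +0.184 |
| Taxi | 114 | 77.52 | 7.49 | +0.182 | +0.176 |
| The Office | 186 | 78.20 | 8.08 | +0.158 | +0.172 |
| 30 Rock | 138 | 84.93 | 7.93 | +0.007 | +0.022 |
| Friends | 235 | 73.14 | 8.29 | -0.013 | -0.059 |
| Seinfeld | 170 | 77.25 | 8.25 | -0.058 | -0.004 |
| Schitt's Creek | 80 | 77.93 | 7.97 | -0.115 | -0.077 |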

The range is -0.115 to +0.392. Five shows are positive, three are negative. The episode-weighted mean of the within-show correlations is about +0.07. Even within a single show, where show-level factors are held constant and only the episode varies, the AI and the audience are measuring almost-uncorrelated things, and the sign of the relationship flips from show to show.
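The weighted mean quoted above can be checked from the table. This is a minimal sketch that assumes a plain episode-count weighting of the Pearson values; a Fisher z-transform average would be another reasonable choice and could give a slightly different number.

```python
# (N episodes, within-show Pearson r), copied from the per-show table.
PER_SHOW = {
    "Arrested Development": (84, 0.392),
    "Parks and Recreation": (98, 0.201),
    "Taxi": (114, 0.182),
    "The Office": (186, 0.158),
    "30 Rock": (138, 0.007),
    "Friends": (235, -0.013),
    "Seinfeld": (170, -0.058),
    "Schitt's Creek": (80, -0.115),
}

total_n = sum(n for n, _ in PER_SHOW.values())

# Assumption: each show's r is weighted by its episode count.
weighted_r = sum(n * r for n, r in PER_SHOW.values()) / total_n

n_positive = sum(1 for _, r in PER_SHOW.values() if r > 0)

print(f"episodes: {total_n}")
print(f"weighted mean r: {weighted_r:+.3f}")
print(f"positive shows: {n_positive} of {len(PER_SHOW)}")
```

With these inputs the weighted mean comes out at about +0.074 over 1,105 episodes, in line with the "about +0.07" above.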

Why is AD's correlation clearly positive while Schitt's Creek's is negative? We're not entirely sure, but a working hypothesis: AD is unusually joke-dense, and its IMDb voters are self-selected fans who weight craft heavily. Schitt's Creek is more emotion-driven, and IMDb voters reward late-series finales and emotional reveals that the joke-density rubric doesn't see. That hypothesis is testable; we'll come back to it in a future post.

Finding 3: At the show level, the correlation is *negative*

This is the strangest finding and probably the headline.

Plot each show as one point: mean HI on the x-axis, mean IMDb on the y-axis. Across the 8 shows, r = -0.287 (rho = -0.476).
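The show-level figures can be approximately reproduced from the rounded means in the Finding 2 table. The sketch below uses hand-written Pearson and Spearman helpers (our own names, not the pipeline behind this post). Because the inputs are rounded to two decimals, the Pearson value may differ slightly from the -0.287 reported above; the Spearman value depends only on rank order and should match.

```python
from math import sqrt

# (mean HI, mean IMDb) per show, copied from the per-show table (rounded).
SHOW_MEANS = {
    "Arrested Development": (81.92, 8.02),
    "Parks and Recreation": (78.32, 7.99),
    "Taxi": (77.52, 7.49),
    "The Office": (78.20, 8.08),
    "30 Rock": (84.93, 7.93),
    "Friends": (73.14, 8.29),
    "Seinfeld": (77.25, 8.25),
    "Schitt's Creek": (77.93, 7.97),
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ascending ranks. Assumes no ties, which holds for these means."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


hi_means = [hi for hi, _ in SHOW_MEANS.values()]
imdb_means = [imdb for _, imdb in SHOW_MEANS.values()]

r = pearson(hi_means, imdb_means)
rho = spearman(hi_means, imdb_means)
print(f"show-level Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

With only eight points, both numbers are sensitive to any single show, which is worth keeping in mind when reading the sign.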

Shows with a higher mean HI tend to have a lower mean IMDb rating. 30 Rock is the cleanest example: highest HI in the dataset (84.9), second-lowest mean IMDb (7.93). Friends is the mirror image: lowest HI (73.1), highest mean IMDb (8.29). Seinfeld and Friends sit at the top of the IMDb axis and at the bottom of the HI axis; 30 Rock and Arrested Development sit at the top of the HI axis and in the middle to lower part of the IMDb axis. Taxi is the exception to the pattern: lowest mean IMDb (7.49) with a below-average HI (77.5).

This is consistent with a known divide in comedy: audience love is built on relationship and emotional payoff, which is a different thing from per-joke craft. The AI is measuring per-joke craft. IMDb is measuring how-much-this-show-meant-to-me. Across our catalog, those two things are mildly anti-correlated.

The disagreement is structured

The disagreement between the two rating systems isn't random — it's interpretable. Here are the 10 episodes where HI rates much higher than IMDb (within-show z-score gap):

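| Show | Season | Episode | Title | HI | IMDb |
|---|---|---|---|---|---|
| Seinfeld | 6 | 14 | The Highlights Of 100 (1) | 91.1 | 6.9 |
| Seinfeld | 1 | 4 | Male Unbonding | 91.6 | 7.2 |
| Friends | 7 | 21 | The One With The Vows | 83.0 | 7.2 |
| Friends | 4 | 21 | The One With The Invitation | 77.9 | 6.9 |
| The Office | 8 | 21 | Angry Andy | 89.5 | 6.7 |
| The Office | 4 | 13 | Dinner Party | 98.0 | 7.6 |
| Seinfeld | 1 | 1 | The Seinfeld Chronicles | 86.4 | 7.3 |
| 30 Rock | 1 | 1 | Pilot | 90.1 | 7.2 |
| The Office | 9 | 17 | The Farm | 91.9 | 7.3 |
| Schitt's Creek | 1 | 3 | Don't Worry, It's His Sister | 88.6 | 7.5 |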

Pattern: clip shows, pilots, recap episodes, and episodes that fans actively disliked. The AI is reading them as joke-dense without seeing that the audience was already over them. Note that "Dinner Party" appears here even with a 7.6 IMDb rating: A.V. Club gave it an A and HI agrees (98.0), but its IMDb is "only" 7.6 because many viewers find it too painful to enjoy.

Now the reverse — the 10 episodes where IMDb rates much higher than HI:

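| Show | Season | Episode | Title | HI | IMDb |
|---|---|---|---|---|---|
| Friends | 6 | 25 | The One With The Proposal (2) | 59.0 | 9.2 |
| Schitt's Creek | 6 | 12 | The Pitch | 66.2 | 8.7 |
| Taxi | 3 | 20 | Latka The Playboy | 67.2 | 8.5 |
| Seinfeld | 9 | 8 | The Betrayal | 62.9 | 8.9 |
| 30 Rock | 5 | 4 | Live Show | 77.9 | 8.6 |
| Schitt's Creek | 6 | 13 | Start Spreading The News | 73.2 | 9.2 |
| 30 Rock | 1 | 12 | Black Tie | 77.0 | 8.5 |
| 30 Rock | 6 | 14 | Kidnapped By Danger | 74.5 | 8.3 |
| Seinfeld | 5 | 21 | The Hamptons | 73.4 | 9.5 |
| 30 Rock | 7 | 12 | Hogcock! | 83.1 | 8.9 |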

Pattern: series finales and end-of-run episodes, emotional payoff episodes, format-experiment episodes, and stunt-character showcases. "Start Spreading The News" is the penultimate episode of Schitt's Creek, leading directly into the finale. "The One With The Proposal" is the Chandler-Monica proposal. "The Betrayal" is Seinfeld's reverse-chronology episode. "Live Show" is 30 Rock's live-broadcast experiment. "Hogcock!" is the first half of 30 Rock's two-part series finale.

The audience is rewarding what the AI cannot see: ending, weight, format risk, and the cumulative emotional debt of a long-running show paying off. These are events, not jokes. HI scores them as merely competent on craft because they're not joke-dense; IMDb scores them as transcendent because of what they represent.
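For readers who want to build similar lists, here is a minimal sketch of a within-show z-score gap. It assumes the gap is simply z(HI) minus z(IMDb), with each z-score computed against that show's own mean and population standard deviation; the exact variant used for the tables above may differ. The field names and the three sample rows are made up for illustration and are not real scores.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values against their own mean and population std dev.
    Assumes the values are not all identical (std dev > 0)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def z_gaps(episodes):
    """Return (gap, show, title) tuples sorted from 'HI much higher than
    IMDb' to 'IMDb much higher than HI'. Each episode is a dict with the
    hypothetical keys 'show', 'title', 'hi', and 'imdb'."""
    by_show = {}
    for ep in episodes:
        by_show.setdefault(ep["show"], []).append(ep)

    gaps = []
    for show, eps in by_show.items():
        z_hi = zscores([ep["hi"] for ep in eps])
        z_imdb = zscores([ep["imdb"] for ep in eps])
        for ep, zh, zi in zip(eps, z_hi, z_imdb):
            gaps.append((zh - zi, show, ep["title"]))
    return sorted(gaps, reverse=True)


# Made-up rows, for illustration only.
sample = [
    {"show": "Show A", "title": "Ep 1", "hi": 90.0, "imdb": 7.0},
    {"show": "Show A", "title": "Ep 2", "hi": 70.0, "imdb": 9.0},
    {"show": "Show A", "title": "Ep 3", "hi": 80.0, "imdb": 8.0},
]

ranked = z_gaps(sample)
for gap, show, title in ranked:
    print(f"{show} / {title}: gap = {gap:+.2f}")
```

Standardizing within each show keeps a generally high-rated show from dominating either list, so each episode is compared only with its own show's episodes.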

What this means for what we publish

The site has been positioned around the claim that HI measures "comedy craft, not popularity." This analysis gives that claim empirical backbone instead of leaving it as a hedge:

  • At the show level, HI is anti-correlated with audience popularity. Funnier-per-craft and more-loved are not the same thing.
  • At the episode level within a show, the two axes are at chance to weakly positive — and the disagreement is structured: audiences reward emotional climax; HI rewards joke density.
  • Our earlier r = -0.005 pooled result understated this. It averaged opposing show-level and episode-level effects to a single deceptively quiet number.

The right way to read the Humor Index is now this: it measures what a writers' room would mean by "this episode is well-crafted comedy." It does not measure what an audience means by "this is my favorite episode." Those two things diverge in interpretable ways, and we can show you exactly where.

Methodology caveats

  • IMDb ratings drift over time. This snapshot is whatever was in our episode JSONs at scoring time; ratings on long-running shows like Friends and Seinfeld have probably shifted slightly upward since their early days.
  • IMDb's per-episode rating mixes signal sources. Pilots and finales get massively more votes than mid-season filler. That sample-size imbalance is real but doesn't change the direction of the findings.
  • The pre-1985 era is represented by only one show (Taxi). The show-level negative correlation could partly reflect era effects we haven't measured yet. As we score Mary Tyler Moore, All in the Family, MAS*H, and Barney Miller, this comparison will get more robust.
  • This is one external source. Audience ratings on IMDb are a signal of episode quality, not the truth about it. A complementary critic-side validation (A.V. Club letter grades) is coming next. Taken together, the two sources are the right calibration.

Data

If you want to pull the underlying joined dataset — every episode with its HI, IMDb rating, and within-show z-scores — it's available on request. Email hello@thehumorindex.com and we'll send it. The full per-show breakdown is also visible on each show's page on the site.

---

This post extends the earlier [IMDb vs Humor Index](/blog/imdb-vs-humor-index) analysis from April with the full eight-show dataset. The methodology is documented at [our methodology page](/methodology). Questions or pushback: hello@thehumorindex.com.
