The State of the USA | At the Plate, a Statistical Puzzler: Understanding Simpson's Paradox

DataBasics

At the Plate, a Statistical Puzzler: Understanding Simpson's Paradox

By Arthur Smith

August 20, 2010

In 2007 Red Sox rookie phenom Jacoby Ellsbury batted .353 while teammate Mike Lowell batted .324. In 2008, in what would be his first full season in the majors, Ellsbury again outperformed Lowell, batting .280 to Lowell's .274.

Red Sox Third Baseman Mike Lowell was outperformed at the plate by teammate Jacoby Ellsbury over a two-year stretch. Or was he?

(http://www.flickr.com/photos/keithallison/ / CC BY-SA 2.0)

So Ellsbury clearly outperformed Lowell at the plate over the two-year stretch, right?

Wrong. Over the course of the two years, Lowell was superior at the plate, outhitting Ellsbury .304 to .293.

What at first glance may seem confusing comes down to a simple problem of aggregation -- and a classic example of the statistical phenomenon known as Simpson's Paradox.

Simpson's Paradox occurs when a relationship between two variables -- in this case, the two players' batting averages -- is reversed when an additional variable is taken into account. The additional variable to consider here is the number of at-bats each player had in each season. Leaving it out results in "omitted variable bias," a distorted picture of how the two batting averages compare.

And it matters -- not only for the important business of sizing up your favorite baseball players, but also for such pursuits as assessing the nation's schools or understanding state-by-state obesity rates.

How it Works

You weigh the importance of additional factors, or variables, implicitly in almost every evaluation you make. Your friend might tell you that his team is better because it has a better record, to which you might counter, "OK, but my team plays a harder schedule." You understand intuitively that an accurate comparison of the teams cannot be made from the record alone. When you condition your argument on a third variable (beyond just wins and losses), you can get a picture that differs from what you initially thought.

Simpson's Paradox goes one step further. It says not only does omitting an important variable change a relationship, but in fact, it can completely reverse how the relationship is perceived. Ellsbury may seem to have had the hotter bat across the two years, but the facts show Lowell actually had a better two-year average. Here's how it works:

Player	2007	2008	2007 and 2008 combined
Jacoby Ellsbury	41/116 (.353)	155/554 (.280)	196/670 (.293)
Mike Lowell	191/589 (.324)	115/419 (.274)	306/1008 (.304)

Yearly stats from ESPN.com

Yes, Ellsbury had the better average in both seasons. But he had far fewer at-bats than Lowell in 2007, having joined the team late in the season.

Ellsbury had only 116 at-bats in 2007, while Lowell had 589. Therefore, Ellsbury's combined average is weighted much more heavily toward his weaker second year (554 at-bats), while Lowell's average in his better season -- 2007 -- carries greater weight than his average in his worse season. The result is Lowell's superior average in aggregate, an instance of Simpson's Paradox.
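The arithmetic behind the table can be checked in a few lines of Python. This is only a sketch: the hits and at-bats are copied from the table above, while the data layout and the helper names `batting_average` and `combined_average` are illustrative choices made here, not anything taken from ESPN.com.

```python
# Hits and at-bats per season, copied from the table above.
# The dictionary layout and function names are illustrative assumptions.
stats = {
    "Jacoby Ellsbury": {2007: (41, 116), 2008: (155, 554)},
    "Mike Lowell": {2007: (191, 589), 2008: (115, 419)},
}


def batting_average(hits, at_bats):
    """Return hits divided by at-bats."""
    return hits / at_bats


def combined_average(seasons):
    """Pool hits and at-bats across all seasons, then divide once."""
    total_hits = sum(hits for hits, _ in seasons.values())
    total_at_bats = sum(at_bats for _, at_bats in seasons.values())
    return batting_average(total_hits, total_at_bats)


for player, seasons in stats.items():
    yearly = ", ".join(
        f"{year}: {batting_average(*totals):.3f}"
        for year, totals in sorted(seasons.items())
    )
    print(f"{player}: {yearly}, combined: {combined_average(seasons):.3f}")
```

Pooling hits and at-bats is the same as averaging the yearly figures with at-bats as the weights. A plain, unweighted mean of the two yearly averages would still put Ellsbury ahead, roughly .317 to .299, which is why the reversal only appears once the at-bats are taken into account.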

Beyond Baseball

Simpson's Paradox actually occurs with some frequency, and not just in baseball. Every year students across the country take the National Assessment of Educational Progress exams, with results mapped to a variety of factors, including whether students are eligible for the national school lunch program.

A comparison between school-lunch eligible eighth-graders in New York City and California for 2007 finds that a lower percentage of the New Yorkers scored below basic in math. The same was true for New Yorkers not eligible for school lunch compared to Californians not eligible.

However, in aggregate, a smaller percentage of California eighth-graders were below basic in math than of NYC eighth-graders. Why? A significantly higher percentage of NYC students are eligible for the school lunch program, and in both places eligible students are far more likely to score below basic:

Jurisdiction	School Lunch Eligible	School Lunch Ineligible	Combined
New York City (% Below Basic)	45.9	17.2	42.5
California (% Below Basic)	54.2	27.8	40.9

2007 NAEP math
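The table does not report how many students fall into each lunch category, but the combined column hints at it. The sketch below backs out the share of lunch-eligible students implied by the three published percentages. It rests on an assumption that is not stated in the source: that the combined figure is a weighted average of exactly these two groups, with no students whose eligibility went unrecorded. The function name `implied_share` is made up for this illustration, and the results should be read as rough.

```python
# Percent below basic, copied from the table above:
# (lunch eligible, lunch ineligible, combined).
naep = {
    "New York City": (45.9, 17.2, 42.5),
    "California": (54.2, 27.8, 40.9),
}


def implied_share(eligible, ineligible, combined):
    """Solve combined = w * eligible + (1 - w) * ineligible for w.

    Assumes the combined rate is a weighted average of only these
    two groups, which the source does not confirm.
    """
    return (combined - ineligible) / (eligible - ineligible)


for place, rates in naep.items():
    share = implied_share(*rates)
    print(f"{place}: roughly {share:.0%} of students implied to be lunch eligible")
```

Under that assumption, close to nine in ten NYC test takers would be lunch eligible, against about half in California. Because the eligible group scores below basic far more often in both places, NYC's combined figure is pulled toward its higher number even though NYC does better within each group.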

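In health, at least four examples of Simpson's Paradox are found in state-level data on the percentage of black, white and Hispanic adults reporting a Body Mass Index over 30. In each pair below, the first jurisdiction has the lower obesity rate in every group but the higher rate among all adults: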

State	Black Adults	White Adults	Hispanic Adults	All Adults
Mississippi	40	28	26	32
Alabama	42	29	28	31

Louisiana	36	25	20	28
Kansas	39	26	29	27

Louisiana	36	25	20	28
Iowa	38	26	25	26

District of Columbia	35	10	17	24
Massachusetts	36	20	29	21

BRFSS 2006-2008
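Each pair in the table can be tested mechanically for the reversal. The following sketch uses the percentages exactly as printed above; the function name `is_reversal` and the pairing list are illustrative assumptions, and the check looks only at the three groups shown in the table.

```python
# Percent of adults reporting a BMI over 30, copied from the table above:
# (black, white, Hispanic, all adults).
obesity = {
    "Mississippi": (40, 28, 26, 32),
    "Alabama": (42, 29, 28, 31),
    "Louisiana": (36, 25, 20, 28),
    "Kansas": (39, 26, 29, 27),
    "Iowa": (38, 26, 25, 26),
    "District of Columbia": (35, 10, 17, 24),
    "Massachusetts": (36, 20, 29, 21),
}

# The four comparisons shown in the table, in the same order.
pairs = [
    ("Mississippi", "Alabama"),
    ("Louisiana", "Kansas"),
    ("Louisiana", "Iowa"),
    ("District of Columbia", "Massachusetts"),
]


def is_reversal(first, second):
    """True if `first` is lower in every listed group but higher overall."""
    *groups_first, all_first = first
    *groups_second, all_second = second
    lower_in_every_group = all(a < b for a, b in zip(groups_first, groups_second))
    return lower_in_every_group and all_first > all_second


for a, b in pairs:
    print(f"{a} vs. {b}: reversal = {is_reversal(obesity[a], obesity[b])}")
```

As with at-bats and lunch eligibility, the reversal has to come from the mix of groups within each jurisdiction's adult population, which the table itself does not show.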

Deciding What's Relevant

One of the most important aspects of reading these or any such statistics is deciding which ones matter.

Simpson's Paradox might be used to demonstrate that, in fact, New York City's schools were outperforming California's, insofar as New York City has a smaller percentage of children performing below basic in both categories of lunch eligibility. In the baseball case, by contrast, it is the aggregate that matters: contrary to what the season-by-season averages suggest, Lowell actually had a better two years than Ellsbury.

Recognizing Simpson's Paradox, and omitted variable bias in general, has very important policy implications. In the case of New York's schools, one could argue that rather than investing more in education, policymakers need to address the underlying social conditions that are leaving such a high percentage of students eligible for the national school lunch program. Suddenly a policy question about education can become a policy question about poverty when looking at the system as a whole.

It is an important aspect of research to determine what other variables matter for a question and which ones don't, and there is yet another layer for policy in determining which ones can be most effectively addressed to produce desired outcomes. Not only is it important to be aware of the pitfalls of Simpson's Paradox, but it is also necessary to determine whether the paradox is relevant to the question being asked. Simpson's Paradox, then, serves as a pertinent reminder that there is often more to a statistic than meets the eye.

Arthur Smith is a recent graduate of Georgetown University, with a degree in international economics. He spent his junior year studying at the London School of Economics. At the State of the USA, he has worked on education and economy projects, including the data selection, collection and presentation processes. He has also worked as a research assistant on projects analyzing returns on education and perceptions of AIDS in Kenya.

Posted by Arthur Smith at 2:43 PM on August 20, 2010