Simpson’s Paradox: A Maximum Likelihood Solution


Kläy, M., & Wesley, L. P. (1991). Simpson’s Paradox: A Maximum Likelihood Solution. SRI International.


Simpson’s paradox exemplifies a class of problems that can arise when the logic used to reason about the semantics of propositional sentences does not adequately capture certain dependencies between sentences of interest. The paradox has been known since at least 1903 [YUL03] and has been discussed extensively in the statistical literature [SIM51, DAW79, BLY73, CHU42]. The phenomena that typically give rise to it can occur in settings such as destructive testing (e.g., determining the breaking strength of materials in orthogonal directions) and identifying the composition of complex alloys. It has also been reported several times in real-life data since its discovery [KNA85, WAG82]. One such occurrence received wide attention in 1973 over the apparent sex bias in graduate admissions at the University of California, Berkeley [BIC75]. Because automated systems will be expected to recognize and cope with the phenomena underlying this paradox, it is important to develop effective methods for dealing with them, particularly since they affect the choice of logics that such systems must use to reason about real-world problems. Only recently, however, has there been any significant indication that Simpson’s paradox merits serious attention from the AI community [PEA88].
