A Review of the Bell Curve: Bad Science Makes for Bad Conclusions
by William J. Matthews, Ph.D.
Herrnstein and Murray’s Bell Curve (1994) received a fair amount of publicity upon its release. It has been seized upon by those predisposed to believe that heredity is the central and determining factor in explaining race and class differences. As Stephen Jay Gould points out in the November 1994 New Yorker (drawing on his book The Mismeasure of Man), “the Bell Curve holds no new arguments or compelling data but cashes in on the depressing temper of our time.” I would suggest that E. Miller’s (1997) review, and his extension of the arguments offered in the Bell Curve, reflect quite deeply this “depressing temper of our time”.
Discussing social policy prematurely
Miller, a professor of finance and economics, shows no understanding of the logical and statistical flaws or the questionable assumptions on which the Bell Curve is based, and yet is willing to engage in a discussion of eugenics and the like. I think it premature to discuss such social policy without a more careful consideration of the assumptions driving it.
Rather than critique Miller’s deeply flawed review, I will direct my criticism towards a few of the main flaws in the Bell Curve itself and invite readers to review the primary source material themselves. Herrnstein and Murray’s assessment of race and class differences, while not inherently illogical, rests on four very questionable premises, which they do not discuss, much less defend.
They assume that intelligence must be:
(1) depictable as a single number;
(2) capable of rank ordering people in a linear order;
(3) primarily genetically based; and
(4) essentially immutable.
If any one of these premises is false, their entire argument disintegrates (Gould, 1994). Thus, even if one were to accept premises 1-3 as true, if premise 4 is false then programs designed to boost IQ would, contrary to Herrnstein and Murray’s view, be quite reasonable, much in the way that a genetically based hearing defect can be offset by a hearing aid.
Generalizing within a group
All researchers in this area (including Herrnstein and Murray) acknowledge the problems of generalizing from within-group differences in intelligence (e.g., within a white population) to between-group differences (e.g., differences between whites and blacks). These authors do it anyway.
Let us consider body height (a much more heritable trait than IQ). As Gould suggests, there is no question that the average height of Indian males from a nutritionally deprived village would increase significantly within a few generations given improved nutrition. By analogy, the well documented 15 point IQ difference between American whites and blacks permits no automatic conclusion that truly equal opportunities would not raise the black average enough to equal or surpass the white average.
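The logic of the height analogy can be illustrated with a small simulation. The Python sketch below is a toy model only: the group sizes, the standard deviations, the 170 cm baseline, and the 8 cm environmental deficit are all invented for illustration and are not empirical estimates. Both hypothetical groups draw their genetic values from the same distribution, and one group is given a uniform environmental handicap.

```python
import random


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def simulate_group(rng, n, env_shift, gene_sd=6.0, noise_sd=2.0, base=170.0):
    """Return (genes, heights) for one hypothetical group.

    All parameters are invented for illustration, not empirical estimates.
    """
    genes = [rng.gauss(0.0, gene_sd) for _ in range(n)]
    heights = [base + g + env_shift + rng.gauss(0.0, noise_sd) for g in genes]
    return genes, heights


rng = random.Random(0)  # fixed seed so the run is repeatable
genes_a, heights_a = simulate_group(rng, 20000, env_shift=0.0)
genes_b, heights_b = simulate_group(rng, 20000, env_shift=-8.0)  # deprived group

# Share of within-group height variance that comes from the genetic term.
within_a = variance(genes_a) / variance(heights_a)
within_b = variance(genes_b) / variance(heights_b)

# Between-group gap in height versus between-group gap in genetic values.
height_gap = mean(heights_a) - mean(heights_b)
genetic_gap = mean(genes_a) - mean(genes_b)

print(f"within-group genetic share: A={within_a:.2f}, B={within_b:.2f}")
print(f"height gap between groups: {height_gap:.1f} cm")
print(f"genetic gap between groups: {genetic_gap:.1f} cm")
```

By construction, the genetic share of variance within each group comes out at roughly 0.9, while the gap of about 8 cm between the groups is entirely environmental. High heritability within groups therefore says nothing, by itself, about the source of a difference between groups.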
Herrnstein and Murray acknowledge the empirical fact that one cannot prejudge any individual black person, because so many blacks score above the white average. However, these authors downplay the strong circumstantial evidence for the malleability, as opposed to immutability, of IQ scores: for example, the IQ scores of poor black children adopted into affluent and intellectual homes, and the well documented observation that IQ scores have risen steadily (in the US and other technologically advanced countries) at a rate of about 3 points per decade since 1940 (i.e., a cumulative gain of more than a full standard deviation). This is referred to as the “Flynn effect” and is not well understood.
Existence of g
Is the existence of g, upon which IQ is based, a given reality? Herrnstein and Murray do not even attempt to justify their assumption that Spearman’s g has construct validity, i.e., that it measures what it purports to measure. They simply assert that g exists and that it is accepted by experts in the field. By experts, they mean psychometricians, who themselves raise serious criticisms of g (cf. Neisser et al., 1996).
However, there is general agreement among psychometricians that intelligence test scores taken alone ignore other important aspects of mental ability. There are other quite legitimate notions of intelligence for which there is empirical research.
Howard Gardner proposes the notion of ‘multiple intelligences’, while Robert Sternberg suggests a triarchic theory of intelligence (i.e., analytic, creative, practical). In a different line of inquiry on intelligence, Piaget proposed the notion of a developmentally based intelligence, while the Russian psychologist Lev Vygotsky argued in favor of a socially developed intelligence.
From a biological perspective, some researchers have suggested brain anatomy and physiology as relevant factors in intelligence. Psychometrics, while the oldest of these approaches, is not the only approach to intelligence and of course has its limitations.
With regard to the validity of Spearman’s g, Herrnstein and Murray base their 800 page book on the reality of g and essentially do not discuss the theoretical basis for their claim. Readers interested in the issue of intelligence should familiarize themselves with the questions about g and factor analysis raised by L.L. Thurstone in the 1930s.
Simply stated, Spearman used factor analysis to find a common factor underlying the positive correlations among various mental tests. As any first year graduate student knows, such positive correlations would be expected but say nothing about causality. Thurstone later demonstrated with factor analysis that g could be made to disappear simply by rotating the factor axes to different positions (in some instances yielding separate and multiple intelligences).
Even in the absence of a deep understanding of factor analysis, it should be immediately obvious that g cannot be an inherent reality (as assumed by Herrnstein and Murray) if it emerges in one mathematical formulation and disappears or is greatly reduced in another.
One should note that each mathematical formulation is no less legitimate than the other. Herrnstein and Murray do not discuss this central point. They simply accept the reality of g without question or discussion.
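The rotation argument can be made concrete with a toy example. The Python sketch below uses four hypothetical tests and two factors; the loadings are invented for illustration and are not Spearman’s or Thurstone’s data. It rotates the factor axes by 45 degrees and checks that both orientations imply exactly the same correlations among the tests.

```python
import math

# Hypothetical loadings of four tests on two factors ("simple structure"):
# two verbal tests load mainly on factor 1, two math tests mainly on factor 2.
simple_structure = [
    (0.8, 0.2),  # verbal test 1
    (0.7, 0.3),  # verbal test 2
    (0.2, 0.8),  # math test 1
    (0.3, 0.7),  # math test 2
]


def rotate(loadings, degrees):
    """Rotate the two factor axes by the given angle (orthogonal rotation)."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]


def reproduced_correlations(loadings):
    """Correlations implied by the common factors (loadings times their transpose).

    Diagonal entries are communalities rather than 1.
    """
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in loadings]
            for r1 in loadings]


general_factor = rotate(simple_structure, 45)

before = reproduced_correlations(simple_structure)
after = reproduced_correlations(general_factor)
max_difference = max(abs(p - q)
                     for row_p, row_q in zip(before, after)
                     for p, q in zip(row_p, row_q))

for name, loadings in [("simple structure", simple_structure),
                       ("rotated 45 degrees", general_factor)]:
    print(name)
    for a, b in loadings:
        print(f"  {a:+.3f}  {b:+.3f}")
print(f"largest change in implied correlations: {max_difference:.1e}")
```

In the rotated orientation every test loads about 0.71 on the first factor, which looks like a general factor, while the second factor contrasts the verbal and math tests. In the original orientation there are two group factors and no general one. Since the rotation is reversible and both sets of loadings reproduce the same correlations, the data alone cannot favor one description over the other.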
I would hypothesize that they simply seek to fit the data to their theory rather than adjust their theory according to the data.
Bias in IQ tests
What about bias in the IQ test? Herrnstein and Murray properly discuss the issue of “statistical bias” (S-bias), which simply asks whether the IQ test score predicts performance on criterion variables (school achievement, college GPA, etc.) equally reliably for different groups. The IQ score does reliably predict performance on these criterion variables, and therefore one can conclude that there is no statistical bias, in this case, against blacks.
This says nothing, however, about cultural bias, which is often confused with S-bias and which Herrnstein and Murray do not discuss. Fischer et al. (1996) note that the Bell Curve analysis is based on the Armed Forces Qualification Test (AFQT), which is not an IQ test but is designed to predict performance on certain criterion variables. The math section requires high school algebra.
Furthermore, they note that the original plot of the AFQT data is not in the shape of the required bell curve. Since Herrnstein and Murray require a bell curve for their theory, they reshaped the original data to fit their theory. Here we have an example of theory driving the data.
Even with these problems, Fischer et al. (1996), for the sake of argument, accept Herrnstein and Murray’s evidence, measure of intelligence, and basic methodology and then reexamine the results. However, they control for factors (ignored by Herrnstein and Murray) known to have significant effects on a person’s life outcome (e.g., parental income, number of siblings, local unemployment rate, geographic region).
In their reanalysis, Fischer et al. conclude that a person’s life chances depend on their social surroundings at least as much as on their own intelligence. They conclude that the key finding of the Bell Curve (i.e., IQ as a predictor of SES) is an artifact of its own method.
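The general mechanism behind such a reanalysis, an apparent effect shrinking once a correlated background factor is controlled, can be sketched in a few lines of Python. This is not Fischer et al.’s model or data: the variable names (`environment`, `score`, `outcome`) and every coefficient are assumptions made up for illustration. The test score is constructed to be correlated with social environment, and the outcome depends on both.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def slope_one_predictor(y, x):
    """OLS slope of y on x alone."""
    return covariance(x, y) / covariance(x, x)


def slopes_two_predictors(y, x1, x2):
    """OLS slopes of y on x1 and x2 jointly (solves the 2x2 normal equations)."""
    s11, s22, s12 = covariance(x1, x1), covariance(x2, x2), covariance(x1, x2)
    s1y, s2y = covariance(x1, y), covariance(x2, y)
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000

# Invented data-generating process: the score partly reflects environment,
# and the outcome depends weakly on the score and strongly on environment.
environment = [rng.gauss(0, 1) for _ in range(n)]
score = [0.6 * e + 0.8 * rng.gauss(0, 1) for e in environment]
outcome = [0.2 * t + 0.5 * e + rng.gauss(0, 1)
           for t, e in zip(score, environment)]

naive = slope_one_predictor(outcome, score)
adjusted, env_effect = slopes_two_predictors(outcome, score, environment)

print(f"score effect, environment ignored:    {naive:.2f}")
print(f"score effect, environment controlled: {adjusted:.2f}")
print(f"environment effect:                   {env_effect:.2f}")
```

With these made-up coefficients, the score appears to have an effect of about 0.5 when environment is left out of the regression, and only about 0.2 once it is included. The extra 0.3 was environment being credited to the score.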
Fischer et al. go on to analyze the effects of poverty. For example, despite the fact that the IQ scores of men and women are about equal, women are much more likely to be poor. Fischer et al. observe, based on the Herrnstein and Murray AFQT data, that a “young woman would have to score 41 points higher than her equivalently matched male counterpart (schooling etc.) to have the same risk for being poor as he.”
Herrnstein and Murray do not discuss gender in any great depth. I am not suggesting that therefore SES is the major or only determinant of IQ but that Herrnstein and Murray have failed to adequately refute or even deal fairly with this hypothesis.
Both Fischer et al. and Neisser et al. (1996) discuss the issue of caste-like minority status within society. To be born into a caste-like minority is to grow up firmly convinced that one’s life will be restricted to low status social roles.
Recent research on self-esteem and academic achievement in black adolescent males suggests that self-esteem in this group is not connected to academic achievement (which is viewed as acting ‘white’). Cross-cultural comparisons suggest that minorities incorporated through conquest (Irish in Britain, Koreans in Japan, Maori in New Zealand, Blacks in the U.S.) have more difficulty assimilating than those who choose to immigrate.
Interestingly, the arbitrary designation of caste for the Indian untouchables and the burakumin in Japan suggests that race matters to the extent people want it to matter. Such data are at least suggestive of the caste hypothesis (which is much more complicated than the more simply defined SES), and the hypothesis is worth further empirical investigation.
Fitting the data to the theory?
The presentation of statistical analyses by Herrnstein and Murray suggests that they are disingenuous and deliberately manipulative with regard to the data, and that they seek to fit the data to the theory rather than vice versa. Specifically, in their multiple regression equations for the SES data, they hold IQ constant and then consider the relationship of social behaviors to parental SES.
They then hold SES constant and consider the relationship of the same behaviors to IQ. In general they find a stronger relationship with IQ than with SES. This is acceptable as far as it goes, but as Gould (1994) astutely observes, Herrnstein and Murray plot only the regression curve and not the scatter around the curve (i.e., the variation in outcomes around it).
When more thoroughly considered, the relationships that they propose, based on their own data, are weak. Specifically, they report R² (the goodness of fit) in an appendix, where few will ever see it.
Their entire argument is based on a correlation of R = .4, which, simply stated, corresponds to R² = .16 and thus accounts for approximately 16% of the variance. This is hardly a compelling statistic on which to base significant social policy, as Herrnstein and Murray suggest, or the much nastier eugenics that Miller proposes. The vast majority of the R² measures excluded from the main body of the text are less than 0.1 and hidden in appendix 4. Their own data make the conclusions of the Bell Curve simply indefensible.
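The relationship between a correlation of .4 and 16% of the variance is simple arithmetic, and a short simulation shows what it means for the scatter around a regression line. The Python sketch below uses synthetic data with an invented correlation of 0.4; none of it comes from Herrnstein and Murray’s tables, and the helper names are assumptions for this illustration.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pearson_r(xs, ys):
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def share_of_variance_explained(xs, ys):
    """1 - (residual variance / total variance) for the least-squares line."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return 1.0 - covariance(residuals, residuals) / covariance(ys, ys)


rng = random.Random(2)  # fixed seed so the run is repeatable
n = 20000

# Synthetic predictor and outcome built to correlate at about 0.4.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.4 * xi + math.sqrt(1 - 0.4 ** 2) * rng.gauss(0, 1) for xi in x]

r = pearson_r(x, y)
explained = share_of_variance_explained(x, y)

# Going the other way: the correlation that corresponds to R-squared = 0.1.
r_for_r2_of_point_one = math.sqrt(0.1)

print(f"correlation r:               {r:.3f}")
print(f"r squared:                   {r * r:.3f}")
print(f"share of variance explained: {explained:.3f}")
print(f"share left unexplained:      {1 - explained:.3f}")
print(f"r when R-squared is 0.1:     {r_for_r2_of_point_one:.3f}")
```

For a single predictor, the share of variance explained by the fitted line equals the square of the correlation, so a correlation near 0.4 leaves roughly 84% of the variation in the outcome as scatter around the line. An R² below 0.1 corresponds to a correlation below about 0.32.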
The Bell Curve, like Miller’s review, is a simple treatise of conservative ideology, and its biased treatment of data reveals its purpose as simple advocacy of a particular position. Nothing new about that. The problem is that Herrnstein and Murray, and those who blindly accept their data in the absence of scientific skepticism, misuse or misunderstand science to further their agenda.
There may be inherited group differences in intelligence. However, Herrnstein and Murray have failed to provide adequate refutation of the alternative hypotheses.
Miller, E. (1997). Race, Socioeconomic Variables, and Intelligence: A Review of the Bell Curve.
Fischer, C.S., Hout, M., Jankowski, M., Lucas, S.R., Swidler, A., & Voss, K. (1996). Inequality by Design. Princeton: Princeton University Press.
Gould, S.J. (1981). The Mismeasure of Man. New York: Norton.
Gould, S.J. (1994). Curveball. New Yorker, November 28.
Herrnstein, R.J., & Murray, C. (1994). The Bell Curve: Intelligence and Class Structure in American Life. New York: Free Press.
Neisser, U. (chair) et al. (1996). Intelligence: Knowns and Unknowns. American Psychologist, 51, 77-102.