The Bell Curve: Intelligence and Class Structure in American Life, by the late Richard J. Herrnstein and Charles Murray (1994), is about the implications of individual and group differences in intelligence for educational and public policy. According to Herrnstein and Murray:
To try to come to grips with the nation's problems without understanding the role of intelligence is to see through a glass darkly indeed, to grope with symptoms instead of causes, to stumble into supposed remedies that have no chance of working. (pp. xxii-xxiii)
Briefly, they argued that, because intelligence is largely inherited and substantially immutable, social programs based on environmentalist ideals (e.g., Affirmative Action) are unlikely to succeed and should be drastically altered or eliminated. Herrnstein and Murray's recommendations for educational policy include: increasing the funding of programs for the intellectually gifted (at the expense of programs for those with learning difficulties); allowing all parents to choose the schools their children attend; and emphasizing the classical notion of the "educated person" as the overarching goal of public schools.
The Bell Curve received an incredible amount of almost entirely negative attention in the media, mainly because it addressed, in part, the controversial issues of the heritability of intelligence and racial-ethnic group differences in cognitive ability. According to Herrnstein and Murray, the possibility that group differences in intelligence are partly responsible for many of the social and economic inequalities that exist across racial-ethnic groups has long been considered too sensitive to discuss in public, yet badly needs airing. Many have disagreed, asserting that the book will only fan the flames of racism and exacerbate ethnic balkanization. Criticism of The Bell Curve has been harsh. In The New Republic (October 31, 1994), for example, respondents referred to it as "pseudo-scientific racism" and "errant nonsense." One commentary was even entitled, "Neo-Nazis!" Stephen Jay Gould (1994), the Harvard paleontologist who wrote The Mismeasure of Man, characterized Herrnstein and Murray's policy recommendations as "anachronistic social Darwinism" (p. 138) and called the book a "manifesto of conservative ideology" (p. 141).
For better or worse, the arguments presented in The Bell Curve are being widely contemplated, if book sales and media coverage are any indication. My intention in this commentary is not to review The Bell Curve in its entirety. Space limitations preclude a thorough discussion of all the complex issues addressed by Herrnstein and Murray. Instead, two important components of The Bell Curve will be examined. The first is the original empirical evidence presented by Herrnstein and Murray to demonstrate the central role of intelligence in American life. The second is Spearman's g. Although not discussed in depth in The Bell Curve, the g factor is, as Gould (1994) noted, "the sine qua non of their entire argument" (p. 143).
Herrnstein and Murray provided considerable empirical support for their detailed arguments in The Bell Curve. A significant portion of their 845-page book presents the results of original analyses of data from the National Longitudinal Survey of Youth (NLSY). The NLSY began in 1979 as a nationally representative sample of 12,686 individuals between 14 and 22 years of age, each of whom has been resurveyed annually for the past 15 years. The demographic information gathered for the NLSY includes the youths' childhood environment and parental socioeconomic status (SES), as well as their subsequent educational and occupational attainment, work history, and family formation. Participants in the NLSY also have been administered a number of intelligence and cognitive ability measures. As stated in The Bell Curve, the NLSY is "the mother lode for scholars who wish to understand the relationship of cognitive ability to social and economic outcomes" (p. 119).
Herrnstein and Murray analyzed the NLSY data with multiple regression. In most of their analyses, they used intelligence and SES to predict important social and economic outcomes. Herrnstein and Murray's main objective was to determine the relationship between intelligence and these outcomes after controlling for socioeconomic background, and vice versa. They asserted that, "if the role of IQ remains largely independent of SES, then it is worth thinking about, for it may cast social behavior and public policy in a new light" (p. 123). To measure cognitive ability, they used the Armed Forces Qualification Test (AFQT), a highly g-loaded instrument with excellent psychometric properties. To assess socioeconomic background, Herrnstein and Murray created a new index based on family income, education, and occupation. They reported a Cronbach's alpha reliability coefficient of .76 for this measure, but no evidence of validity.
According to Herrnstein and Murray, results of these analyses underscore the predominant influence of intelligence in determining wealth, poverty, and social status. They asserted that their findings also demonstrate the integral relationship between intelligence and a host of important social behaviors, such as parenting, welfare dependency, crime, child neglect, and illegitimacy, among others. Herrnstein and Murray concluded that intelligence is a much better predictor of these behaviors than socioeconomic background. In fact, after controlling for intelligence, they found that the relationship between socioeconomic background and many social, educational, occupational, and economic outcomes was negligible. Despite the fact that the substantive conclusions reached by Herrnstein and Murray in The Bell Curve rest largely on the analysis of one set of data with one statistical technique, their findings cannot be easily dismissed. Not only did they analyze an excellent set of data with appropriate statistical techniques, but their findings are generally consistent with the results of past studies (see Brody, 1992; Jensen, 1980, 1993a, for reviews).
Nevertheless, Herrnstein and Murray's interpretation of these statistically significant results has been subject to criticism. Judis (1994), for example, argued that "Murray and Herrnstein acknowledge the difference between demonstrating correlation and causation, but consistently use the language of causation when they merely have demonstrated correlation" (p. 18). As a result, he maintained, "a distorted picture of social change emerges" (p. 18) in their recommendations for public policy. Gould (1994) raised the issue of practical versus statistical significance of empirical results. He asserted that:
In violation of all statistical norms that I've ever learned, they plot only [emphasis in the original] the regression curve and do not show the scatter of variation around the curve, so their graphs do not show anything about the size of the relationship -- that is, the amount of variation in social factors explained by IQ and socio-economic status. (pp. 145-146)
After examining the analysis of variance tables in Appendix 4 of The Bell Curve, Gould concluded that "their own data indicate that IQ is not a major factor in determining variation in nearly all the social behaviors they study" (p. 146). He stated that "although low figures are not atypical for large social-science surveys involving many variables, most of Herrnstein and Murray's correlations are very weak -- often in the 0.2 to 0.4 range" (p. 147). Gould also accused Herrnstein and Murray of "pervasive disingenuousness" (p. 140), partly because information on the variance explained in their regression analyses is "tucked away" in an appendix, instead of discussed in the text along with graphic presentation of their results.
Although Herrnstein and Murray did consider the fact that substantial dispersion is bound to exist around regression lines for moderately correlated variables, they argued that "the exceptions [to the general relationship] do not invalidate the importance of a statistically significant correlation" (p. 68). They also provided a rationale for deemphasizing the variance accounted for by the independent variables in their analyses. As Herrnstein and Murray stated:
A crucial point to keep in mind about correlation coefficients... is that correlations in the social sciences are seldom much higher than .5 (or lower than -.5) and often much weaker -- because social events are imprecisely measured and are usually affected by variables besides the ones that happened to be included in any particular body of data. A correlation of .2 can nevertheless be "big" for many social science topics. In terms of social phenomena, modest correlations can produce large aggregate effects. (p. 67)
Given Herrnstein and Murray's focus on the implications of their results for educational and public policy, emphasizing the interpretation of statistically significant regression slopes rather than the variance explained seems reasonable. In addition, many researchers in the social and behavioral sciences would not agree with Gould's (1994) description of correlations in the 0.2 to 0.4 range, which explain between 4% and 16% of the variance, as "very weak." According to Cohen (1977), a "large" effect explains more than 15% of the variance, a "medium" effect about 6%, and a "small" effect about 1%. On this "scale," Herrnstein and Murray's results are best described as medium to large. Regardless of the label one attaches to the amount of variance explained in Herrnstein and Murray's analyses, it is important to note that these are still correlational results. Causality is not established by the existence of a correlation between variables.
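The arithmetic behind these percentages is simply the squared correlation. The short sketch below is illustrative only; the function name is my own, and the three correlations are the endpoints and midpoint of the range Gould cited, not values taken from any particular analysis in The Bell Curve.

```python
def variance_explained(r):
    """Percent of variance shared by two variables whose correlation is r."""
    return 100.0 * r ** 2


# Endpoints and midpoint of the 0.2 to 0.4 range discussed above.
for r in (0.2, 0.3, 0.4):
    print(f"r = {r:.1f} explains about {variance_explained(r):.0f}% of the variance")
```

Set against the benchmarks attributed to Cohen above (about 1%, about 6%, and more than 15%), a correlation of 0.2 falls between "small" and "medium," and a correlation of 0.4 just exceeds the threshold for "large."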
One potential problem with the multiple regression analyses reported in The Bell Curve that has not been mentioned in prior reviews concerns measurement error. Reliability of measurement affects correlations. The maximum correlation that can be obtained between any two variables is the square root of the product of their reliabilities. When the reliability of either or both is less than perfect, the correlation will be attenuated. Because some degree of measurement error is the rule rather than the exception, correlations usually underestimate the true degree of association between variables. Fortunately, they can be corrected for measurement error to estimate the true population value. Correlations corrected for attenuation are larger than uncorrected correlations, sometimes substantially so.
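As a minimal sketch of the two relationships just described, the following applies the standard correction for attenuation. The reliability of .76 is the value Herrnstein and Murray reported for their SES index; the observed correlation of .30 and the reliability of .90 for the second variable are hypothetical values chosen only for illustration.

```python
import math


def max_observed_r(rel_x, rel_y):
    """Upper bound on an observed correlation given the two reliabilities."""
    return math.sqrt(rel_x * rel_y)


def disattenuate(r_xy, rel_x, rel_y):
    """Estimate the correlation between true scores from an observed one."""
    return r_xy / math.sqrt(rel_x * rel_y)


observed_r = 0.30   # hypothetical observed correlation
rel_ses = 0.76      # reliability reported for the SES index
rel_other = 0.90    # hypothetical reliability of the second variable

print("ceiling on observed r:", round(max_observed_r(rel_ses, rel_other), 3))
print("corrected r:", round(disattenuate(observed_r, rel_ses, rel_other), 3))
```

With these inputs the corrected value is larger than the observed .30, as the text states it must be whenever either reliability is below 1.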
In contrast to the predictable effect of measurement error on zero-order correlations, in multiple regression the effect of measurement error is unpredictable. True regression parameters will be either overestimated or underestimated as a function of measurement error, depending upon the pattern of correlations among the variables and their reliabilities (Pedhazur, 1982). The reliability of a variable partialed out in multiple regression is especially important. Regarding the independent variables used by Herrnstein and Murray, the reliability of the AFQT is quite high, but the reliability of their index of socioeconomic background is only moderate (i.e., less than 0.80). Nevertheless, Herrnstein and Murray apparently did not correct for attenuation while conducting their multiple regression analyses. Results of their analyses of the NLSY data, therefore, must be viewed with caution until the unpredictable effect of measurement error has been examined, particularly when controlling for socioeconomic background.
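To make the concern concrete, the sketch below computes standardized regression weights for two predictors from their correlations, first with hypothetical "true" correlations and then with the correlations attenuated by an unreliable second predictor. All three true correlations are invented for illustration, and only the .76 reliability comes from The Bell Curve. This is one pattern among many; other patterns of correlations and reliabilities can shift the weights in other directions.

```python
import math


def standardized_betas(r_y1, r_y2, r_12):
    """Standardized weights for two predictors, from the three correlations."""
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Hypothetical true-score correlations: outcome with predictor 1 (ability),
# outcome with predictor 2 (SES), and the two predictors with each other.
true_r_y1, true_r_y2, true_r_12 = 0.40, 0.40, 0.50

# Assume only predictor 2 is measured with error (reliability .76), so every
# correlation involving it shrinks by the square root of that reliability.
shrink = math.sqrt(0.76)
obs_r_y1 = true_r_y1
obs_r_y2 = true_r_y2 * shrink
obs_r_12 = true_r_12 * shrink

true_b1, true_b2 = standardized_betas(true_r_y1, true_r_y2, true_r_12)
obs_b1, obs_b2 = standardized_betas(obs_r_y1, obs_r_y2, obs_r_12)

print(f"true weights:     ability {true_b1:.3f}, SES {true_b2:.3f}")
print(f"observed weights: ability {obs_b1:.3f}, SES {obs_b2:.3f}")
```

In this particular configuration the two predictors are equally important in the true scores, yet the observed weights favor the more reliably measured one. Whether anything similar occurs in the NLSY analyses is an empirical question that this sketch cannot answer.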
The Bell Curve is based on six conclusions about tests of cognitive ability that, according to Herrnstein and Murray (1994), are beyond significant dispute in the scientific literature:
1. There is such a thing as a general factor of cognitive ability on which human beings differ.
2. All standardized tests of academic aptitude or achievement measure this general factor to some degree, but IQ tests expressly designed for that purpose measure it most accurately.
3. IQ scores match, to a first degree, whatever it is that people mean when they use the word intelligent or smart [emphases in the original] in ordinary language.
4. IQ scores are stable, although not perfectly so, over much of a person's life.
5. Properly administered IQ tests are not demonstrably biased against social, economic, ethnic, or racial groups.
6. Cognitive ability is substantially heritable, apparently no less than 40 percent and no more than 80 percent. (pp. 22-23)
Herrnstein and Murray derived these six conclusions from what they referred to as the classical paradigm of intelligence research and theory. Classicist researchers maintain that the structure of mental ability is best described as hierarchical, with Spearman's g at the apex (e.g., see Carroll, 1993). Although g is recognized as but one of many factors of cognitive ability, they contend that it is the most important one. As Jensen (1992a) stated:
In recent years, the study of general mental ability [emphasis in the original], or g, has begun to look as a science should. Along with the increasing realization of the tremendous importance of this subject, there has been an unusually rapid growth of theoretical and empirical research, both psychometric and experimental. (p. 271)
Herrnstein and Murray agreed with researchers in the classical paradigm on the importance of g, yet did not discuss this central concept in sufficient detail. My aim here is to sketch some of the most fundamental things known about g, so that readers of The Bell Curve may better evaluate its theoretical foundation.
Spearman's g is a scientific construct, not a real thing that resides in the brain. It is used to explain an observable phenomenon known as the "positive manifold," which is the empirical fact that correlations between tests of cognitive ability are almost always positive. The positive manifold indicates that a common source of variance underlies individual differences in all tests and performances involving cognitive ability. Spearman (1904) invented factor analysis to measure g empirically. For any particular battery of mental tests, the g factor is estimated equally well by the first (unrotated) principal factor, the first principal component, the single highest-order factor in a Schmid-Leiman hierarchical factor analysis, and LISREL methods of factor analysis (Jensen & Weng, 1994; Ree & Earles, 1991). Estimates of g also are quite stable. According to Thorndike (1987), the g factor extracted from any large and varied battery of mental tests will be essentially the same g.
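Of the estimation methods listed, the first principal component is the simplest to illustrate. The sketch below extracts it by power iteration from a small correlation matrix; the four-test matrix is hypothetical and was chosen only because it shows a positive manifold, and the function name is my own. It is not a reanalysis of any battery cited here.

```python
import math


def first_principal_component(corr, iterations=500):
    """Return the largest eigenvalue of corr and each test's loading on it."""
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n  # start from equal weights
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    loadings = [math.sqrt(eigenvalue) * x for x in v]
    return eigenvalue, loadings


# Hypothetical correlations among four cognitive tests (all positive).
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
]

eigenvalue, loadings = first_principal_component(corr)
share = eigenvalue / len(corr)
print("loadings on the first component:", [round(x, 2) for x in loadings])
print(f"share of total variance: {share:.0%}")
```

Because every correlation in the matrix is positive, every test loads positively on the first component, which is the sense in which a single general factor summarizes the positive manifold.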
No test or performance involving cognitive ability measures only g, however. The nonerror variance of most tests also reflects specificity and group factors. Nevertheless, g accounts for more variance than any other independent component of individual differences in tests of cognitive ability (see Jensen, 1980). In fact, the g factor often accounts for more variance than all other group factors combined, even when tests are designed without the express intent of measuring g. For example, Kranzler and Weng (in press) recently found that a battery of tests developed by Naglieri, Das, Stevens, and Ledbetter (1991) to measure the constructs of the planning, attention, and simultaneous-successive (PASS) processes theory of human cognition primarily measured g. The second-order g accounted for 59.3% of the common factor variance among the PASS tests, which is considerably more than that accounted for by the g of conventional IQ tests, such as the Wechsler Intelligence Scale for Children-Third Edition (see Sattler, 1992).
Another well-established fact about g is that it is related to processing complexity. Cognitive tasks' g-loadings are an increasing monotonic function of the complexity of information processing required for successful task completion. Loadings on g are not related to the task's surface characteristics. As Jensen (1992a) asserted, "The knowledge and skill content of performance on mental ability tests is merely a vehicle for g, which reflects the overall capacity and efficiency of information processes by which the knowledge and skill are acquired and used" (p. 275, emphases in the original).
The importance of g lies in the fact that it correlates substantially with phenomena outside the domain of psychometric tests and factor analysis. Psychometric g is therefore not simply a mathematical artifact. For example, g is more predictive of outcomes in educational achievement, job training, and job success than any other factor derived from tests of cognitive ability (Jensen, 1992b, 1993a; Ree & Earles, 1992). In addition, Jensen (1993b) argued that the size of the black-white group difference on cognitive tasks is a direct function of the tasks' loading on g. Psychometric g also is related to the heritability (i.e., the proportion of genetic variance) of cognitive tests, which indicates that individual differences in g are determined in part by genetics and therefore influenced by biological functioning (e.g., Bouchard, Lykken, McGue, Segal, & Tellegen, 1990).
A great deal of contemporary research is now aimed at determining the neurophysiological and psychological mechanisms that underlie g (for review, see Jensen, 1992a; Vernon, 1993). The significant correlates of g identified by researchers thus far include: averaged evoked potentials (e.g., Barrett & Eysenck, 1992), nerve conduction velocity (e.g., Vernon & Mori, 1992), speed of neural and synaptic transmission in the visual brain as measured by the positron emission tomography scanning technique (e.g., Haier, Siegel, Crinella, & Buchsbaum, 1993), and the speed and efficiency of elementary cognitive processes (see Vernon, 1990a, for a review). The consistency and coherence of these data reflect real scientific progress that must be explained by any viable theory of intelligence. Many now believe that the results of this research substantiate a neural efficiency model of g (see Vernon, 1993). According to Vernon (1990b), this model postulates that "persons who perform well on intelligence tests (who have high 'IQs') have brains that can operate faster and more efficiently than those of persons who perform less well" (p. 295).
In The Bell Curve, Herrnstein and Murray have gone further than most in discussing the implications of individual and group differences in g for educational and public policy. Some would say they have gone too far. Because human abilities obviously comprise more than g, it is misleading, or possibly even hazardous, to hold g as the sine qua non of American life. Policy decisions must be based on more than psychometrics or statistics. Science should inform policy as best it can, but policy also should be based on political, social, and moral considerations.
The Bell Curve should be discussed seriously and rationally by school psychologists, not suppressed. Whether we like it or not, Herrnstein and Murray's data, and the conclusions they draw from them, will not fade away. Although they may not be correct in all their assertions, there are lessons to be learned from The Bell Curve and from how the media reacted to it. The worst possible outcome would be for this discussion not to take place.
Barrett, P. T., & Eysenck, H. J. (1992). Brain evoked potentials and intelligence: The Hendrickson paradigm. Intelligence, 16, 361-382.
Bouchard, T. J., Jr., Lykken, D. T., McGue, M., Segal, N. L., & Tellegen, A. (1990). Sources of human psychological differences: The Minnesota study of twins reared apart. Science, 250, 223-228.
Brody, N. (1992). Intelligence (2nd ed.). New York: Academic Press.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Cohen, J. (1977). Statistical power analysis for the behavioral sciences. New York: Academic Press.
Gould, S. J. (1994, November 28). Curveball. The New Yorker, pp. 139-149.
Haier, R. J., Siegel, B. V., Crinella, F. M., & Buchsbaum, M. S. (1993). Biological and psychometric intelligence: Testing an animal model in humans with positron emission tomography. In D. K. Detterman (Ed.), Current topics in human intelligence: Individual differences and cognition (Vol. 3; pp. 157-170). Norwood, NJ: Ablex Publishing Corporation.
Herrnstein, R. J., & Murray, C. (1994). The bell curve: Intelligence and class structure in American life. New York: Free Press.
Jensen, A. R. (1980). Bias in mental testing. New York: The Free Press.
Jensen, A. R. (1992a). Understanding g in terms of information processing. Educational Psychology Review, 4, 271-308.
Jensen, A. R. (1992b). Commentary: Vehicles of g. Psychological Science, 3, 275-278.
Jensen, A. R. (1993a). Psychometric g and achievement. In B. R. Gifford (Ed.), Policy perspectives on educational testing. Norwell, MA: Kluwer Academic Publishers.
Jensen, A. R. (1993b). Spearman's hypothesis tested with chronometric information-processing tasks. Intelligence, 17, 47-78.
Jensen, A. R., & Weng, L. (1994). What is a good g? Intelligence, 18, 231-258.
Judis, J. J. (1994, October 31). Taboo you. The New Republic, p. 18.
Kranzler, J. H., & Weng, L. (in press). The factor structure of the PASS cognitive tasks: A reexamination of Naglieri et al. (1991). Journal of School Psychology.
Naglieri, J. A., Das, J. P., Stevens, J. J., & Ledbetter, M. F. (1991). Confirmatory factor analysis of planning, attention, simultaneous, and successive cognitive processing tasks. Journal of School Psychology, 29, 1-17.
Pedhazur, E. J. (1982). Multiple regression in behavioral research (2nd ed.). New York: Holt, Rinehart, and Winston.
Ree, M., & Earles, J. A. (1991). The stability of convergent estimates of g. Intelligence, 15, 86-89.
Ree, M., & Earles, J. A. (1992). Intelligence is the best predictor of job performance. Psychological Science, 3, 86-89.
Reed, T. E., & Jensen, A. R. (1992). Conduction velocity in a brain neural pathway of normal adults correlates with intelligence level. Intelligence, 16, 259-272.
Sattler, J. M. (1992). Assessment of Children (rev. 3rd ed.). San Diego: J. M. Sattler, Publisher.
Spearman, C. E. (1904). "General intelligence" objectively determined and measured. American Journal of Psychology, 15, 201-293.
Thorndike, R. L. (1987). The stability of factor loadings. Personality and Individual Differences, 8, 585-586.
Vernon, P. A. (1990a). An overview of chronometric measures of intelligence. School Psychology Review, 19, 399-410.
Vernon, P. A. (1990b). The use of biological measures to estimate behavioral intelligence. Educational Psychologist, 25, 293-304.
Vernon, P. A. (1993). Intelligence and neural efficiency. In D. K. Detterman (Ed.), Current topics in human intelligence: Individual differences and cognition (Vol. 3; pp. 171-187). Norwood, NJ: Ablex Publishing Corporation.
Vernon, P. A., & Mori, M. (1992). Intelligence, reaction times, and peripheral nerve conduction velocity. Intelligence, 16, 273-288.