"…the ultimate aspirational goal of human intelligence research is to increase general intelligence (g) in individuals " (Gargus & Haier, 2025).
No, let us stop you there: for us, the aim of the research is to understand the mechanisms whereby individual differences in intelligence arise and, relatedly, to give an account of why some people’s brains and thinking skills develop/grow, are maintained, and age better than others’ do, because this has relevance to people’s well-being and longevity. So, in brief, we’re interested in keeping our marbles rather than increasing their number (though there are probably benefits to doing both). Once we understand the mechanisms whereby individual differences arise we might be able to begin preventing the decline caused by ageing and disease. This quibble notwithstanding, theirs is a welcome article. We say that partly-narcissistically, because we wrote an article covering this field a few years ago (Deary et al., 2022). We shall not repeat it here. It merits reading, though—even if not for its own sake—because it provides a complement to Gargus’s and Haier’s (Gargus & Haier, 2025). We think ours is less optimistic. Instead of focusing on evidence that might support one of many possible evolutionary psychology hypotheses, we adopted an integrated, systems-biology view to reflect the vast and difficult-to-synthesise literature.
Gargus’s and Haier’s paper has positives. It collates results that were not available until recently. It is extensive, which underlines the expanding depth and breadth of relevant studies and the complexity of intelligence’s biological underpinnings. The study of the origins of intelligence differences has made progress, moving on from the evanescent/maybe-ultimately-imaginary ‘cognitive processes/components’ of the second half of the 20th century (Deary, 2000; Sternberg, 2022). The reductionist effort in intelligence-differences research now tends to emphasise individual differences in measurable and replicable aspects of brain structure and function and—and it’s been a long time coming—variance in DNA and RNA sequences and structure, and expression levels of RNA and proteins.
We remember when that wasn’t so. It was not long ago that any non-null association between brain measures (now mostly collected from magnetic resonance neuroimaging) and cognitive test scores was contested. It was not until a decade into the present century that associations between DNA variation [we mean genome-wide association studies and not candidate gene studies (Chabris et al., 2012)] and mental test results were emerging (Davies et al., 2011). Both areas have now matured, as the authors describe, though not necessarily as we might articulate them. Gargus and Haier collate evidence from the current high-tide contour of published work for their partly-evolutionary psychology approach. They try to make sense of complex findings from evolutionary biology, genetics, and neuroscience by nominating a view related to the metabolic demands of the brain. Competing/complementary views may be similarly worth looking at; Geary (Geary, 2018) placed his chips on the mitochondrion, though Schubert’s and Hagemann’s (Schubert & Hagemann, 2020) opinion was that there were other important biological loci with respect to intelligence differences. We are not sure that any of these chosen biological loci of interest has enough evidential weight; if their loci are relevant to intelligence differences, we don’t know how much of that variance they could claim; we don’t know how this field will change [new techniques and ways of looking will offer biological constructs whose variance we can’t yet conceive; one can’t make explanatory progress until the relevant constructs are available (Craver, 2007)]; and we don’t know whether we should write ‘progress’ instead of ‘change’. First, let’s look back.
There are different ways in which people discuss the state of the present results that try to explain human intelligence differences. When we wrote our article (Deary et al., 2022) a well-known critic of the field seemed to praise its competence/professionalism (something like that) but wrote that what we summarised had been known in essence for a couple of decades. Odd, that, in so far as it was at odds with our memories of the field.
Genetics and mechanistic explanation(s)
Take the genetics part, for example. We’d gone through the discovery-then-non-replication (with the genetic variants of the gene for Apolipoprotein E providing a glaring exception to light up the failures-to-replicate) blight of the candidate gene studies in intelligence and cognitive ageing (Chabris et al., 2012; Payton, 2009). Confronted with a replication crisis that revealed most candidate gene studies to be misleading—and this affected, for example, depression as well as intelligence (Border et al., 2019), and was a problem more generally in biomedical research (Colhoun et al., 2003)—we learned to proceed cautiously even in the presence of statistically-significant candidate findings. Some might still be looking for candidate genes of relatively large effect, but the discussions became about oligogenic or polygenic (leaving aside for now the ultimate omnigenic possibility) outcomes (Boyle et al., 2017). And we genuinely did not know how many participants with good phenotyping and genotyping would be needed to find some positive results in a GWAS of intelligence: we assumed it would be thousands, but we did not know if it might be tens or hundreds of thousands, or even into seven figures. First, though, we went with thousands (Davies et al., 2011).
Doing the first apparently-decently-sized GWASs on intelligence test scores was like waiting for a picture to develop in a darkroom, or waiting for the sun to come up in the pre-dawn. It was exciting: que sera, sera—this was hypothesis-free work. An image developed/light shone: there were no large effects, and at least tens/hundreds of thousands of participants would be needed. Intelligence differences were highly polygenic (Davies et al., 2011). Also relying on genome-wide data, there was the interim advance of applying genome-wide complex trait analysis with genomic-relatedness-matrix restricted maximum likelihood (GCTA-GREML) (Yang et al., 2010) to find a heritability estimate for intelligence (Davies et al., 2011) that was based on single-nucleotide polymorphism (SNP) variation and not twins/adoptees/pedigrees. Compared to GWAS, a GCTA estimate of heritability summarises the joint effect of all SNPs (rather than SNPs individually), giving a broader genome-wide view from which to investigate the underlying biology of cognitive ability. From several studies that have included hundreds of thousands of participants (Davies et al., 2018; Savage et al., 2018) (note that these are not independent—they share some participants, quite a few actually) there are four bottom lines: there are hundreds of very-small-effect-sized genetic variants that associate significantly with intelligence differences; genome-wide variants collectively account for some trait variance [‘SNP-heritability’; it’s lower than the estimate from twins and people argue about why (Alexander, 2021), and within-family GWAS estimates sometimes make it lower still (Young et al., 2022)]; we have the capability to predict a modest amount of intelligence variance in a hold-out sample just using DNA information (3–4%, or sometimes a bit more, via polygenic risk scores); and there’s the fun of finding out the other variables with which intelligence has genetic correlations, revealing complex between-trait pleiotropy.
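For readers less familiar with the mechanics of polygenic prediction: a polygenic score is simply a weighted sum of a person’s allele counts, with the weights (per-variant effect sizes) estimated in a separate discovery GWAS. The following is a deliberately toy sketch—entirely invented genotypes and effect sizes, `numpy` assumed, not any published pipeline—of why a thousand minuscule effects can still sum to a score that predicts a small slice of phenotype variance:

```python
import numpy as np

# Toy illustration only: invented genotypes and effect sizes, not real data.
rng = np.random.default_rng(42)

n_people, n_snps = 500, 1000
# Allele counts per SNP are 0, 1, or 2.
genotypes = rng.integers(0, 3, size=(n_people, n_snps))
# Per-SNP effect sizes: each one tiny, as GWAS of intelligence finds.
betas = rng.normal(loc=0.0, scale=0.01, size=n_snps)

# A polygenic score is the weighted sum of allele counts.
pgs = genotypes @ betas

# A toy phenotype in which the score explains only a few percent of the
# variance, echoing the modest out-of-sample prediction noted above.
phenotype = pgs + rng.normal(0.0, pgs.std() * 5, size=n_people)

r = np.corrcoef(pgs, phenotype)[0, 1]
print(f"variance explained by the toy score: {r**2:.3f}")
```

The point of the sketch is arithmetic rather than biology: no single weight matters on its own, yet the sum carries signal—which is the polygenic picture the GWAS literature describes.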
These four genetic findings—GWAS hits, SNP-based heritability, polygenic prediction, and SNP-based genetic correlations—are likely to be found, discussed, and cited in most genetic studies of intelligence and other complex traits. Do they tell us about the mechanisms of intelligence differences? Gargus and Haier think the GWAS hits do, and we shall return to that in the next paragraph. Before that, we take a roadside break to set down a curiosity. That is, in most GWAS reports—including of intelligence—one will see many more results than the GWAS hits themselves; without these additional analyses a paper would likely not be published, because they deliver grounds for biological interpretation. Here, though, things get less confident and muddier. The replication crisis has taught us that GWAS hits may be false positives, so we look for clusters of nearby variants, loci, and genes that collectively show strong associations with cognitive ability (i.e., towers in a Manhattan plot). Techniques like SuSiE (Cui et al., 2024; Wang et al., 2020) or FINEMAP (Benner et al., 2016) account for genomic interdependence (‘linkage disequilibrium’) to identify the most likely causal variant within a locus. That most GWAS hits are in non-coding genomic regions complicates interpreting findings (Farh et al., 2015). Selection bias (Schoeler et al., 2023), population stratification (Howe et al., 2022), indirect genetic effects, and many analytical decisions [even adjusting GWAS analyses for heritable confounders (Aschard et al., 2015)] can induce false positives. Sophisticated and continuously-developing approaches attempt to triangulate results by accumulating evidence across analysis techniques with different assumptions and by integrating external biological data sources [e.g., colocalization (Giambartolomei et al., 2014), transcriptome-wide association studies (Mai et al., 2023), and cell-type and tissue-enrichment analyses (de Leeuw et al., 2015)].
Examples of integrated workflows are FUMA (Watanabe et al., 2017) and, more recently, FLAMES (Schipper et al., 2025). These results are all, it would appear, mechanistic—they bore into the biological explanations more than do the GWAS hits alone—and so we might look through them in papers’ results sections for our understanding of intelligence. What’s odd is that, although these results usually take up a lot of space, they often go uncommented-on beyond the papers and from paper to paper, receiving far less attention than the four more-emphasised, broad-view results recounted above. It’s almost as if what they told us was not incremental, perhaps being obvious, or too crude to be of further use, or thought not to be replicable/robust. Maybe the collective still remembers congratulating one another for finding another candidate gene. When every new GWAS that is published with a larger sample size finds more associated variants, we look forward to developments that will cast the biology into clearer relief than a stubborn desire to focus on SNP/gene ‘hits’ alone. When might it be claimed that understanding a fuller picture of the complex genetic architecture of intelligence has been achieved? Well, there is perhaps one trait—height—for which scientists are boldly claiming to have uncovered a saturated map of independent, common genetic variants (12,111 of them—imagine the complexity of the mechanistic narrative of individual differences in height that this might imply) at a sample size of 5.4 million (Yengo et al., 2022).
OK, back to the main narrative. A sun—one genetic sun, perhaps just the first of a few or many suns—has shone, albeit weakly. The genetic variants related to intelligence differences will be—we don’t know them all yet—huge in number (we are deliberately vague here), and each of them has a minuscule effect, and we often don’t know the causal variants, or what the functional consequences of the variants are, or how they work with the other variants and environmental/lifestyle influences. Nor do we know how their epigenetic architectures (and their interactions) affect those genetic functions and their transcription to RNA, translation to protein, and so on, onward somehow to brain-performance differences. The big question at the moment is what we think all this tells us and what we do next (yes, that’s two questions). There are responses to the ‘what does it tell us’ that range from optimism to our local version of the gloomy prospect (Davey Smith, 2011; Plomin & Daniels, 1987). There’s derision—in addition to the fair-enough scientific critique—waiting for anyone who is optimistic, and perhaps many are too timid to poke their heads above the parapet. We waited hopefully for the sun to rise on the DNA-based studies of intelligence (and some other traits). We’ll clunkily change metaphor and say that we felt like we had been following the course of a river and at the sun-up point we found it divided into innumerable distributaries, too many ever to follow (the present authors disagree about the “ever”, some thinking that AI might come to the rescue). Nevertheless, through the gloaming of our scepticism, we keep in mind Gargus’s and Haier’s stated aim here: to outline and point to the sorts of things that might excite debate and promote collaborative work between psychologists/psychometricians and molecular folks; we hope here we are putting our backs into that endeavour.
We wondered whether the opener to the first paragraph of their Molecular Genetic Research and Intelligence section was a mis-phrasing, since it accidentally appears to beg the question (concerning the difference between twin-based and DNA-based estimates of the heritability of intelligence). That is, we hope that they will not consider results conclusive only when GWAS results show strong evidence of heritability. Gargus and Haier made the decision to list some of the many variants that associate with cognitive function test scores and to describe their functions. We do not think those few selected GWAS hits—of which there are hundreds more—give definitive mechanistic insights, or that their significance can be summarised through a narrow top-down evolutionary psychology lens. Intelligence is so polygenic, with each genetic variant contributing such a small amount of variance, that we doubt it is helpful to discuss specific genes at this stage in the science. The polygenicity of intelligence differences thus seems rather to fly in the face of their final statement of this section, where they contend that even small changes across billions/trillions of cerebral machineries should translate into large differences. This is not what the science tells us so far.
The first gene discussed is CADM2; Gargus and Haier open this section with the statement: “CADM2 is one of the most significant common-variant loci in large GWAS of educational attainment (EA) and intelligence (Savage et al., 2018)”. Savage et al., appreciating the high polygenicity and biological complexity of intelligence, chose not to discuss specific genes in their manuscript, focussing instead on the results of gene-set analyses and genetic correlations with other human behaviours and diseases. In Gargus’s and Haier’s review, it would perhaps have been more insightful to discuss the converging evidence for broader biological pathways and mechanisms, as identified by the more integrated workflows that interrogate the full genome-wide results of specific publications.
GWAS results have produced no fingerpost, no causal account of the part of intelligence differences played by genes. The polygenic outlook has the shock of the sublime, but we should keep our heads. We have learned that more confident conclusions require huge sample sizes, stable replication, triangulation, and a broad, integrated, system-level view. We have made the discovery that we are faced with a multi-multi-variate cat’s cradle of tiny influences. Something is happening at the genetic level and downstream to influence how well the brain works; can we ever, in principle, find out what’s going on?
We know more than we did; we have some highly-complex adumbration of what is the case. Think about how long it took to sequence the first human genomes, and at what cost. Think how hard it was and how long it took to find out how proteins folded before AlphaFold. And how much of our terror of the sublime problem of the functional consequences of many tiny variants will be mitigated by tools such as AlphaGenome or its, perhaps multivariate, successors (Avsec et al., 2025). We shall not be able to condense the complexity of the problem into one simplistic explanation. What we have argued about the genetic evidence to date, and how mechanisms are elusive, is summed up by Ota et al. (Ota et al., 2025) as follows, “But despite these successes, interpreting the vast majority of [genetic] associations remains challenging. Aside from coarse-grained analyses such as identifying trait-relevant cell types and enriched gene sets, we lack genome-scale approaches for interpreting the molecular pathways and mechanisms through which hundreds, if not thousands, of genes affect a given phenotype.” Using some blood traits as examples, Ota et al. use, “loss-of-function burden tests with gene-regulatory connections inferred from Perturb-seq experiments in relevant cell types… to build causal graphs… of the gene-regulatory hierarchy that jointly controls three partially co-regulated blood traits”. To us, this looks like a start.
Other suns might rise. Our now-faded hope over expectation was that, like Rilke’s storm in Der Schauende (The Seer), GWAS would work away…
“Till the landscape like the Psalter’s page
Opens out luminous and sublime.”
Although we have less expertise in evolutionary biology, we think that a couple of things regarding Gargus’s and Haier’s discussion of hominid brain expansion and human accelerated regions (HARs) are worth mentioning. In Figure 1 they state that, “70k years ago, Ice Age conditions exposed a unique ecological niche as one of the final hominid refuges at the tip of South Africa, where massive global glaciation exposed rich mussel beds, providing abundant, dependable nutrition and PUFA lipids required for brain growth. This set the stage to permit the HAR element mutations, defining the unique modern human developmental program, to successfully expand the hominid brain without causing a lethal ‘blackout’ of the metabolic power grid.” First, it is well established that the increase in hominin brain size occurred much earlier than this and that humans had near-modern brain sizes by 300k years ago. Changes in brain shape, but not size, continued up to 100k–35k years ago; there is evidence for an increase in cognitive flexibility rather than raw brain size at this time (Neubauer et al., 2018). Second, GWASs have not identified polymorphisms within HARs with large effect sizes on human intelligence, suggesting that they are not a major influence on variation in modern human intelligence (Davies et al., 2018; Savage et al., 2018).
Brain imaging
The authors also turn to the organ of thinking to identify other biological reasons for cognitive differences. They are proponents (and one of them is a co-originator) of the Parieto-Frontal Integration Theory (P-FIT) of intelligence (Jung & Haier, 2007), which highlights a distributed network of regions that is hypothesised mutually to scaffold our more complex cognitive functions. In a mirror image of the genetics sections, imagine our surprise to see one specific brain area (the precuneus) receiving so many column inches. In fairness, an easily-missed brief sentence in this section reminds us that their slight sidelining of the P-FIT and the default mode network (DMN) and foregrounding of the precuneus is all with the aim of stimulating directions for new research (seeding precuneus connectivity to see if we can expand or improve upon existing accounts, for example). Nevertheless, it never quite landed for us why the precuneus was the focus of attention here. Specific patterns of deficits following damage to any given brain region do not, in isolation, preclude other regions from also yielding similar, or even slightly different-but-still-g-related, cognitive profiles/deficits. There are other frontal and temporal meta-modal integrative hubs (think frontal and temporal poles). An argument for the special position of one region should have evidence, to the exclusion of others, from various neuropsychological methods (lesion studies, functional and structural imaging, tDCS, etc.). In keeping with the spirit in which the authors write this article, we suggest that a more extensive multi-modal canon of evidence would be required to elevate the precuneus above other such regions.
The final part of the section that deals with brains seems to end abruptly with a list of brain volume–g correlations. We reason that the point might have been to observe that we could use brain-based prediction to validate our hypotheses (as above), and to augment our understanding of why brain volume, and other neurobiological characteristics, might be related to cognitive differences. For what it’s worth, there is a lot about the brain that each modality alone does not capture (volumetry doesn’t directly tell us about firing, receptor functioning or densities, oligodendrocytic coverage, and so on). Given all of the foregoing (and that includes the ground that the authors have covered), it’s quite impressive that just gross brain volumetry performs so well [and yes, we have done some of this work (Cox et al., 2019)]. If we want to use prediction as a check on our ability to understand the sorts of things that are related to differences in g, we need to assemble neurobiological information from as many sources and modes as possible. We then need to apply various advanced statistical methods to understand the common (mediation) pathways through which the direction of biological action progresses (genetics, epigenetics, transcriptomics, and so on, all of which will have within- and between-mode interactions and non-linearities) and also the unique additive contributions that these layers of information add. The additional issue here, of course, is that these layers and the ways they interact are likely to be age-moderated across the life course, increasing the complexity of this undertaking. In that context, progress thus far isn’t entirely gloomy, but it has a long way yet to go. We would have appreciated this sort of forward-facing commentary to round off this section.
Summing up
In Gargus’s and Haier’s (Gargus & Haier, 2025) piece there is little discussion of the growing fields of proteomics and methylomics, both of which combine genetic and environmental effects on intelligence. Many indicators of DNA methylation and proteins are implicated in intelligence differences (Argentieri et al., 2024; Conole et al., 2025; Harris et al., 2020; Smith et al., 2024), and their interrelations with the brain, DNA, and the environment are complex. We and others are using methods that focus more on brain-wide patterning (Hansen et al., 2021; Moodie et al., 2025; Writing Committee for the Attention-Deficit/Hyperactivity Disorder et al., 2021) than on p-value-centric views of region-only replication. Similarly, in this genetic context, genomic structural equation modelling (Grotzinger et al., 2023) and polygenic score approaches potentially hold more promise for understanding the multi-omic milieu that gives rise to cognitive differences than does focussing on single genes or single brain ROIs/voxels. These might offer a similar, apparently-unresearchable complexity (the gloomy prospect again) but, as Rilke’s The Seer ends,
“Triumph no more spreads beckoning wings.
His growth is: Profoundly conquered to be
By ever greater things.”
We have summarised our appreciation of, and reservations about, Gargus’s and Haier’s paper. We judge that they have done what they set out to do—‘to boldly go’ and interpret an increasingly-complex literature with the aim of exciting further discussion about how we extract a coherent picture that can be put to the test. We also hope we have held up our end of the bargain. If we were gloomy-prospectors we’d say that standing in our research field felt like T. S. Eliot did when he wrote The Waste Land, i.e., trying to find order from apparent chaos and not give in to despondency. We explained our reservations and tendency to gloominess—by contrast with Gargus’s and Haier’s upbeat, helpful article (they have a story; we don’t)—to ChatGPT, which replied with something we have placed in the Appendix; we hope you and G&H enjoy it.
Author contributions
All authors contributed equally to the writing and editing of the paper.
Funding sources
No funding directly supported the writing of the paper. I.J.D. is supported by grants from the National Institutes of Health (NIH; R01AG054628 and U01AG083829) and by BBSRC and ESRC (BB/W008793/1). S.E.H. is supported by grants from NIH (U01AG083829), the BBSRC and ESRC (BB/W008793/1), and the BBSRC (UKRI1941). A.E.F. is supported by a grant from NIH (R01AG073593). S.R.C. is supported by a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (221890/Z/20/Z).
Data availability statement
Not applicable.
Conflicts of interest
The authors have confirmed that no conflict of interest exists.