Beate St Pourcain

Publications

  • Doust, C., Fontanillas, P., Eising, E., Gordon, S. D., Wang, Z., Alagöz, G., Molz, B., 23andMe Research Team, Quantitative Trait Working Group of the GenLang Consortium, St Pourcain, B., Francks, C., Marioni, R. E., Zhao, J., Paracchini, S., Talcott, J. B., Monaco, A. P., Stein, J. F., Gruen, J. R., Olson, R. K., Willcutt, E. G., DeFries, J. C., Pennington, B. F., Smith, S. D., Wright, M. J., Martin, N. G., Auton, A., Bates, T. C., Fisher, S. E., & Luciano, M. (2022). Discovery of 42 genome-wide significant loci associated with dyslexia. Nature Genetics. doi:10.1038/s41588-022-01192-y.

    Abstract

    Reading and writing are crucial life skills but roughly one in ten children are affected by dyslexia, which can persist into adulthood. Family studies of dyslexia suggest heritability up to 70%, yet few convincing genetic markers have been found. Here we performed a genome-wide association study of 51,800 adults self-reporting a dyslexia diagnosis and 1,087,070 controls and identified 42 independent genome-wide significant loci: 15 in genes linked to cognitive ability/educational attainment, and 27 new and potentially more specific to dyslexia. We validated 23 loci (13 new) in independent cohorts of Chinese and European ancestry. Genetic etiology of dyslexia was similar between sexes, and genetic covariance with many traits was found, including ambidexterity, but not neuroanatomical measures of language-related circuitry. Dyslexia polygenic scores explained up to 6% of variance in reading traits, and might in future contribute to earlier identification and remediation of dyslexia.
  • Eising, E., Mirza-Schreiber, N., De Zeeuw, E. L., Wang, C. A., Truong, D. T., Allegrini, A. G., Shapland, C. Y., Zhu, G., Wigg, K. G., Gerritse, M., Molz, B., Alagöz, G., Gialluisi, A., Abbondanza, F., Rimfeld, K., Van Donkelaar, M. M. J., Liao, Z., Jansen, P. R., Andlauer, T. F. M., Bates, T. C., Bernard, M., Blokland, K., Børglum, A. D., Bourgeron, T., Brandeis, D., Ceroni, F., Dale, P. S., Landerl, K., Lyytinen, H., De Jong, P. F., DeFries, J. C., Demontis, D., Feng, Y., Gordon, S. D., Guger, S. L., Hayiou-Thomas, M. E., Hernández-Cabrera, J. A., Hottenga, J.-J., Hulme, C., Kerr, E. N., Koomar, T., Lovett, M. W., Martin, N. G., Martinelli, A., Maurer, U., Michaelson, J. J., Moll, K., Monaco, A. P., Morgan, A. T., Nöthen, M. M., Pausova, Z., Pennell, C. E., Pennington, B. F., Price, K. M., Rajagopal, V. M., Ramus, F., Richer, L., Simpson, N. H., Smith, S., Snowling, M. J., Stein, J., Strug, L. J., Talcott, J. B., Tiemeier, H., Van de Schroeff, M. M. P., Verhoef, E., Watkins, K. E., Wilkinson, M., Wright, M. J., Barr, C. L., Boomsma, D. I., Carreiras, M., Franken, M.-C.-J., Gruen, J. R., Luciano, M., Müller-Myhsok, B., Newbury, D. F., Olson, R. K., Paracchini, S., Paus, T., Plomin, R., Schulte-Körne, G., Reilly, S., Tomblin, J. B., Van Bergen, E., Whitehouse, A. J., Willcutt, E. G., St Pourcain, B., Francks, C., & Fisher, S. E. (2022). Genome-wide analyses of individual differences in quantitatively assessed reading- and language-related skills in up to 34,000 people. Proceedings of the National Academy of Sciences of the United States of America, 119(35): e2202764119. doi:10.1073/pnas.2202764119.

    Abstract

    The use of spoken and written language is a fundamental human capacity. Individual differences in reading- and language-related skills are influenced by genetic variation, with twin-based heritability estimates of 30 to 80% depending on the trait. The genetic architecture is complex, heterogeneous, and multifactorial, but investigations of contributions of single-nucleotide polymorphisms (SNPs) were thus far underpowered. We present a multicohort genome-wide association study (GWAS) of five traits assessed individually using psychometric measures (word reading, nonword reading, spelling, phoneme awareness, and nonword repetition) in samples of 13,633 to 33,959 participants aged 5 to 26 y. We identified genome-wide significant association with word reading (rs11208009, P = 1.098 × 10⁻⁸) at a locus that has not been associated with intelligence or educational attainment. All five reading-/language-related traits showed robust SNP heritability, accounting for 13 to 26% of trait variability. Genomic structural equation modeling revealed a shared genetic factor explaining most of the variation in word/nonword reading, spelling, and phoneme awareness, which only partially overlapped with genetic variation contributing to nonword repetition, intelligence, and educational attainment. A multivariate GWAS of word/nonword reading, spelling, and phoneme awareness maximized power for follow-up investigation. Genetic correlation analysis with neuroimaging traits identified an association with the surface area of the banks of the left superior temporal sulcus, a brain region linked to the processing of spoken and written language. Heritability was enriched for genomic elements regulating gene expression in the fetal brain and in chromosomal regions that are depleted of Neanderthal variants. Together, these results provide avenues for deciphering the biological underpinnings of uniquely human traits.
  • Neumann, A., Nolte, I. M., Pappa, I., Ahluwalia, T. S., Pettersson, E., Rodriguez, A., Whitehouse, A., Van Beijsterveldt, C. E. M., Benyamin, B., Hammerschlag, A. R., Helmer, Q., Karhunen, V., Krapohl, E., Lu, Y., Van der Most, P. J., Palviainen, T., St Pourcain, B., Seppälä, I., Suarez, A., Vilor-Tejedor, N., Tiesler, C. M. T., Wang, C., Wills, A., Zhou, A., Alemany, S., Bisgaard, H., Bønnelykke, K., Davies, G. E., Hakulinen, C., Henders, A. K., Hyppönen, E., Stokholm, J., Bartels, M., Hottenga, J.-J., Heinrich, J., Hewitt, J., Keltikangas-Järvinen, L., Korhonen, T., Kaprio, J., Lahti, J., Lahti-Pulkkinen, M., Lehtimäki, T., Middeldorp, C. M., Najman, J. M., Pennell, C., Power, C., Oldehinkel, A. J., Plomin, R., Räikkönen, K., Raitakari, O. T., Rimfeld, K., Sass, L., Snieder, H., Standl, M., Sunyer, J., Williams, G. M., Bakermans-Kranenburg, M. J., Boomsma, D. I., Van IJzendoorn, M. H., Hartman, C. A., & Tiemeier, H. (2022). A genome-wide association study of total child psychiatric problems scores. PLOS ONE, 17(8): e0273116. doi:10.1371/journal.pone.0273116.

    Abstract

    Substantial genetic correlations have been reported across psychiatric disorders and numerous cross-disorder genetic variants have been detected. To identify the genetic variants underlying general psychopathology in childhood, we performed a genome-wide association study using a total psychiatric problem score. We analyzed 6,844,199 common SNPs in 38,418 school-aged children from 20 population-based cohorts participating in the EAGLE consortium. The SNP heritability of total psychiatric problems was 5.4% (SE = 0.01) and two loci reached genome-wide significance: rs10767094 and rs202005905. We also observed an association of SBF2, a gene associated with neuroticism in previous GWAS, with total psychiatric problems. The genetic effects underlying the total score were shared with common psychiatric disorders only (attention-deficit/hyperactivity disorder, anxiety, depression, insomnia) (rG > 0.49), but not with autism or the less common adult disorders (schizophrenia, bipolar disorder, or eating disorders) (rG < 0.01). Importantly, the total psychiatric problem score also showed at least a moderate genetic correlation with intelligence, educational attainment, wellbeing, smoking, and body fat (rG > 0.29). The results suggest that many common genetic variants are associated with childhood psychiatric symptoms and related phenotypes in general instead of with specific symptoms. Further research is needed to establish causality and pleiotropic mechanisms between related traits.

    Additional information

    Full summary results
  • Okbay, A., Wu, Y., Wang, N., Jayashankar, H., Bennett, M., Nehzati, S. M., Sidorenko, J., Kweon, H., Goldman, G., Gjorgjieva, T., Jiang, Y., Hicks, B., Tian, C., Hinds, D. A., Ahlskog, R., Magnusson, P. K. E., Oskarsson, S., Hayward, C., Campbell, A., Porteous, D. J., Freese, J., Herd, P., 23andMe Research Team, Social Science Genetic Association Consortium, Watson, C., Jala, J., Conley, D., Koellinger, P. D., Johannesson, M., Laibson, D., Meyer, M. N., Lee, J. J., Kong, A., Yengo, L., Cesarini, D., Turley, P., Visscher, P. M., Beauchamp, J. P., Benjamin, D. J., & Young, A. I. (2022). Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. Nature Genetics, 54, 437-449. doi:10.1038/s41588-022-01016-z.

    Abstract

    We conduct a genome-wide association study (GWAS) of educational attainment (EA) in a sample of ~3 million individuals and identify 3,952 approximately uncorrelated genome-wide-significant single-nucleotide polymorphisms (SNPs). A genome-wide polygenic predictor, or polygenic index (PGI), explains 12–16% of EA variance and contributes to risk prediction for ten diseases. Direct effects (i.e., controlling for parental PGIs) explain roughly half the PGI’s magnitude of association with EA and other phenotypes. The correlation between mate-pair PGIs is far too large to be consistent with phenotypic assortment alone, implying additional assortment on PGI-associated factors. In an additional GWAS of dominance deviations from the additive model, we identify no genome-wide-significant SNPs, and a separate X-chromosome additive GWAS identifies 57.

    Additional information

    supplementary information
  • Price, K. M., Wigg, K. G., Eising, E., Feng, Y., Blokland, K., Wilkinson, M., Kerr, E. N., Guger, S. L., Quantitative Trait Working Group of the GenLang Consortium, Fisher, S. E., Lovett, M. W., Strug, L. J., & Barr, C. L. (2022). Hypothesis-driven genome-wide association studies provide novel insights into genetics of reading disabilities. Translational Psychiatry, 12: 495. doi:10.1038/s41398-022-02250-z.

    Abstract

    Reading Disability (RD) is often characterized by difficulties in the phonology of the language. While the molecular mechanisms underlying it are largely undetermined, loci are being revealed by genome-wide association studies (GWAS). In a previous GWAS for word reading (Price, 2020), we observed that top single-nucleotide polymorphisms (SNPs) were located near to or in genes involved in neuronal migration/axon guidance (NM/AG) or loci implicated in autism spectrum disorder (ASD). A prominent theory of RD etiology posits that it involves disturbed neuronal migration, while potential links between RD-ASD have not been extensively investigated. To improve power to identify associated loci, we up-weighted variants involved in NM/AG or ASD, separately, and performed a new Hypothesis-Driven (HD)–GWAS. The approach was applied to a Toronto RD sample and a meta-analysis of the GenLang Consortium. For the Toronto sample (n = 624), no SNPs reached significance; however, by gene-set analysis, the joint contribution of ASD-related genes passed the threshold (p~1.45 × 10⁻², threshold = 2.5 × 10⁻²). For the GenLang Cohort (n = 26,558), SNPs in DOCK7 and CDH4 showed significant association for the NM/AG hypothesis (sFDR q = 1.02 × 10⁻²). To make the GenLang dataset more similar to Toronto, we repeated the analysis restricting to samples selected for reading/language deficits (n = 4152). In this GenLang selected subset, we found significant association for a locus intergenic between BTG3-C21orf91 for both hypotheses (sFDR q < 9.00 × 10⁻⁴). This study contributes candidate loci to the genetics of word reading. Data also suggest that, although different variants may be involved, alleles implicated in ASD risk may be found in the same genes as those implicated in word reading. This finding is limited to the Toronto sample suggesting that ascertainment influences genetic associations.
  • Schlag, F., Allegrini, A. G., Buitelaar, J., Verhoef, E., Van Donkelaar, M. M. J., Plomin, R., Rimfeld, K., Fisher, S. E., & St Pourcain, B. (2022). Polygenic risk for mental disorder reveals distinct association profiles across social behaviour in the general population. Molecular Psychiatry, 27, 1588-1598. doi:10.1038/s41380-021-01419-0.

    Abstract

    Many mental health conditions present a spectrum of social difficulties that overlaps with social behaviour in the general population including shared but little characterised genetic links. Here, we systematically investigate heterogeneity in shared genetic liabilities with attention-deficit/hyperactivity disorder (ADHD), autism spectrum disorders (ASD), bipolar disorder (BP), major depression (MD) and schizophrenia across a spectrum of different social symptoms. Longitudinally assessed low-prosociality and peer-problem scores in two UK population-based cohorts (4–17 years; parent- and teacher-reports; Avon Longitudinal Study of Parents and Children (ALSPAC): N ≤ 6,174; Twins Early Development Study (TEDS): N ≤ 7,112) were regressed on polygenic risk scores for disorder, as informed by genome-wide summary statistics from large consortia, using negative binomial regression models. Across ALSPAC and TEDS, we replicated univariate polygenic associations between social behaviour and risk for ADHD, MD and schizophrenia. Modelling variation in univariate genetic effects jointly using random-effect meta-regression revealed evidence for polygenic links between social behaviour and ADHD, ASD, MD, and schizophrenia risk, but not BP. Differences in age, reporter and social trait captured 45–88% in univariate effect variation. Cross-disorder adjusted analyses demonstrated that age-related heterogeneity in univariate effects is shared across mental health conditions, while reporter- and social trait-specific heterogeneity captures disorder-specific profiles. In particular, ADHD, MD, and ASD polygenic risk were more strongly linked to peer problems than low prosociality, while schizophrenia was associated with low prosociality only. The identified association profiles suggest differences in the social genetic architecture across mental disorders when investigating polygenic overlap with population-based social symptoms spanning 13 years of child and adolescent development.
  • Vogelezang, S., Bradfield, J. P., the Early Growth Genetics Consortium, Grant, S. F. A., Felix, J. F., & Jaddoe, V. W. V. (2022). Genetics of early-life head circumference and genetic correlations with neurological, psychiatric and cognitive outcomes. BMC Medical Genomics, 15: 124. doi:10.1186/s12920-022-01281-1.

    Abstract

    Background

    Head circumference is associated with intelligence and tracks from childhood into adulthood.
    Methods

    We performed a genome-wide association study meta-analysis and follow-up of head circumference in a total of 29,192 participants between 6 and 30 months of age.
    Results

    Seven loci reached genome-wide significance in the combined discovery and replication analysis of which three loci near ARFGEF2, MYCL1, and TOP1, were novel. We observed positive genetic correlations for early-life head circumference with adult intracranial volume, years of schooling, childhood and adult intelligence, but not with adult psychiatric, neurological, or personality-related phenotypes.
    Conclusions

    The results of this study indicate that the biological processes underlying early-life head circumference overlap largely with those of adult head circumference. The associations of early-life head circumference with cognitive outcomes across the life course are partly explained by genetics.
  • Glaser, B., Gunnell, D., Timpson, N. J., Joinson, C., Zammit, S., Smith, G. D., & Lewis, G. (2011). Age- and puberty-dependent association between IQ score in early childhood and depressive symptoms in adolescence. Psychological Medicine, 41(2), 333-343. doi:10.1017/S0033291710000814.

    Abstract

    BACKGROUND: Lower cognitive functioning in early childhood has been proposed as a risk factor for depression in later life but its association with depressive symptoms during adolescence has rarely been investigated. Our study examines the relationship between total intelligence quotient (IQ) score at age 8 years, and depressive symptoms at 11, 13, 14 and 17 years. METHOD: Study participants were 5250 children and adolescents from the Avon Longitudinal Study of Parents and their Children (ALSPAC), UK, for whom longitudinal data on depressive symptoms were available. IQ was assessed with the Wechsler Intelligence Scale for Children III, and self-reported depressive symptoms were measured with the Short Mood and Feelings Questionnaire (SMFQ). RESULTS: Multi-level analysis on continuous SMFQ scores showed that IQ at age 8 years was inversely associated with depressive symptoms at age 11 years, but the association changed direction by age 13 and 14 years (age-IQ interaction, p < 0.0001; age squared-IQ interaction, p < 0.0001) when a higher IQ score was associated with a higher risk of depressive symptoms. This change in IQ effect was also found in relation to pubertal stage (pubertal stage-IQ interaction, 0.00049 […]

    Additional information

    S0033291710000814sup001.doc
  • Munafò, M. R., Freathy, R. M., Ring, S. M., St Pourcain, B., & Smith, G. D. (2011). Association of COMT Val108/158Met Genotype and Cigarette Smoking in Pregnant Women. Nicotine & Tobacco Research, 13(2), 55-63. doi:10.1093/ntr/ntq209.

    Abstract

    INTRODUCTION: Smoking behaviors, including heaviness of smoking and smoking cessation, are known to be under a degree of genetic influence. The enzyme catechol O-methyltransferase (COMT) is of relevance in studies of smoking behavior and smoking cessation due to its presence in dopaminergic brain regions. While the COMT gene is therefore one of the more promising candidate genes for smoking behavior, some inconsistencies have begun to emerge. METHODS: We explored whether the rs4680 A (Met) allele of the COMT gene predicts increased heaviness of smoking and reduced likelihood of smoking cessation in a large population-based cohort of pregnant women. We further conducted a meta-analysis of published data from community samples investigating the association of this polymorphism with heaviness of smoking and smoking status. RESULTS: In our primary sample, the A (Met) allele was associated with increased heaviness of smoking before pregnancy but not with the odds of continuing to smoke in pregnancy either in the first trimester or in the third trimester. Meta-analysis also indicated modest evidence of association of the A (Met) allele with increased heaviness of smoking but not with persistent smoking. CONCLUSIONS: Our data suggest a weak association between COMT genotype and heaviness of smoking, which is supported by our meta-analysis. However, it should be noted that the strength of evidence for this association was modest. Neither our primary data nor our meta-analysis support an association between COMT genotype and smoking cessation. Therefore, COMT remains a plausible candidate gene for smoking behavior phenotypes, in particular, heaviness of smoking.
  • Paternoster, L., Evans, D. M., Aagaard Nohr, E., Holst, C., Gaborieau, V., Brennan, P., Prior Gjesing, A., Grarup, N., Witte, D. R., Jørgensen, T., Linneberg, A., Lauritzen, T., Sandbaek, A., Hansen, T., Pedersen, O., Elliott, K. S., Kemp, J. P., St Pourcain, B., McMahon, G., Zelenika, D., Hager, J., Lathrop, M., Timpson, N. J., Davey Smith, G., & Sørensen, T. I. A. (2011). Genome-Wide Population-Based Association Study of Extremely Overweight Young Adults – The GOYA Study. PLoS ONE, 6(9): e24303. doi:10.1371/journal.pone.0024303.

    Abstract

    Background: Thirty-two common variants associated with body mass index (BMI) have been identified in genome-wide association studies, explaining ∼1.45% of BMI variation in general population cohorts. We performed a genome-wide association study in a sample of young adults enriched for extremely overweight individuals. We aimed to identify new loci associated with BMI and to ascertain whether using an extreme sampling design would identify the variants known to be associated with BMI in general populations. Methodology/Principal Findings: From two large Danish cohorts we selected all extremely overweight young men and women (n = 2,633), and equal numbers of population-based controls (n = 2,740, drawn randomly from the same populations as the extremes, representing ∼212,000 individuals). We followed up novel (at the time of the study) association signals (p < 0.001) from the discovery cohort in a genome-wide study of 5,846 Europeans, before attempting to replicate the most strongly associated 28 SNPs in an independent sample of Danish individuals (n = 20,917) and a population-based cohort of 15-year-old British adolescents (n = 2,418). Our discovery analysis identified SNPs at three loci known to be associated with BMI with genome-wide confidence (P < 5 × 10⁻⁸; FTO, MC4R and FAIM2). We also found strong evidence of association at the known TMEM18, GNPDA2, SEC16B, TFAP2B, SH2B1 and KCTD15 loci (p < 0.001), and nominal association (p < 0.05) at a further 8 loci known to be associated with BMI. However, meta-analyses of our discovery and replication cohorts identified no novel associations. Significance: Our results indicate that the detectable genetic variation associated with extreme overweight is very similar to that previously found for general BMI. This suggests that population-based study designs with enriched sampling of individuals with the extreme phenotype may be an efficient method for identifying common variants that influence quantitative traits and a valid alternative to genotyping all individuals in large population-based studies, which may require tens of thousands of subjects to achieve similar power.
  • St Pourcain, B., Mandy, W. P., Heron, J., Golding, J., Davey Smith, G., & Skuse, D. H. (2011). Links between co-occurring social-communication and hyperactive-inattentive trait trajectories. Journal of the American Academy of Child & Adolescent Psychiatry, 50(9), 892-902.e5. doi:10.1016/j.jaac.2011.05.015.

    Abstract

    OBJECTIVE: There is overlap between an autistic and hyperactive-inattentive symptomatology when studied cross-sectionally. This study is the first to examine the longitudinal pattern of association between social-communication deficits and hyperactive-inattentive symptoms in the general population, from childhood through adolescence. We explored the interrelationship between trajectories of co-occurring symptoms, and sought evidence for shared prenatal/perinatal risk factors. METHOD: Study participants were 5,383 singletons of white ethnicity from the Avon Longitudinal Study of Parents and Children (ALSPAC). Multiple measurements of hyperactive-inattentive traits (Strengths and Difficulties Questionnaire) and autistic social-communication impairment (Social Communication Disorder Checklist) were obtained between 4 and 17 years. Both traits and their trajectories were modeled in parallel using latent class growth analysis (LCGA). Trajectory membership was subsequently investigated with respect to prenatal/perinatal risk factors. RESULTS: LCGA analysis revealed two distinct social-communication trajectories (persistently impaired versus low-risk) and four hyperactive-inattentive trait trajectories (persistently impaired, intermediate, childhood-limited and low-risk). Autistic symptoms were more stable than those of attention-deficit/hyperactivity disorder (ADHD) behaviors, which showed greater variability. Trajectories for both traits were strongly but not reciprocally interlinked, such that the majority of children with a persistent hyperactive-inattentive symptomatology also showed persistent social-communication deficits but not vice versa. Shared predictors, especially for trajectories of persistent impairment, were maternal smoking during the first trimester, which included familial effects, and a teenage pregnancy. CONCLUSIONS: Our longitudinal study reveals that a complex relationship exists between social-communication and hyperactive-inattentive traits. Patterns of association change over time, with corresponding implications for removing exclusivity criteria for ASD and ADHD, as proposed for DSM-5.
  • Glaser, B., Nikolov, I., Chubb, D., Hamshere, M. L., Segurado, R., Moskvina, V., & Holmans, P. (2007). Analyses of single marker and pairwise effects of candidate loci for rheumatoid arthritis using logistic regression and random forests. BMC Proceedings, 1(Suppl 1): S54.

    Abstract

    Using parametric and nonparametric techniques, our study investigated the presence of single locus and pairwise effects between 20 markers of the Genetic Analysis Workshop 15 (GAW15) North American Rheumatoid Arthritis Consortium (NARAC) candidate gene data set (Problem 2), analyzing 463 independent patients and 855 controls. Specifically, our work examined the correspondence between logistic regression (LR) analysis of single-locus and pairwise interaction effects, and random forest (RF) single and joint importance measures. For this comparison, we selected small but stable RFs (500 trees), which showed strong correlations (r~0.98) between their importance measures and those by RFs grown on 5000 trees. Both RF importance measures captured most of the LR single-locus and pairwise interaction effects, while joint importance measures also corresponded to full LR models containing main and interaction effects. We furthermore showed that RF measures were particularly sensitive to data imputation. The most consistent pairwise effect on rheumatoid arthritis was found between two markers within MAP3K7IP2/SUMO4 on 6q25.1, although LR and RFs assigned different significance levels. Within a hypothetical two-stage design, pairwise LR analysis of all markers with significant RF single importance would have reduced the number of possible combinations in our small data set by 61%, whereas joint importance measures would have been less efficient for marker pair reduction. This suggests that RF single importance measures, which are able to detect a wide range of interaction effects and are computationally very efficient, might be exploited as a pre-screening tool for larger association studies. Follow-up analysis, such as by LR, is required since RFs do not indicate high-risk genotype combinations.
  • Hamshere, M. L., Segurado, R., Moskvina, V., Nikolov, I., Glaser, B., & Holmans, P. A. (2007). Large-scale linkage analysis of 1302 affected relative pairs with rheumatoid arthritis. BMC Proceedings, 1(Suppl 1): S100.

    Abstract

    Rheumatoid arthritis is the most common systemic autoimmune disease and its etiology is believed to have both strong genetic and environmental components. We demonstrate the utility of including genetic and clinical phenotypes as covariates within a linkage analysis framework to search for rheumatoid arthritis susceptibility loci. The raw genotypes of 1302 affected relative pairs were combined from four large family-based samples (North American Rheumatoid Arthritis Consortium, United Kingdom, European Consortium on Rheumatoid Arthritis Families, and Canada). The familiality of the clinical phenotypes was assessed. The affected relative pairs were subjected to autosomal multipoint affected relative-pair linkage analysis. Covariates were included in the linkage analysis to take account of heterogeneity within the sample. Evidence of familiality was observed with age at onset (p < 0.001) and rheumatoid factor (RF) IgM (p < 0.001), but not definite erosions (p = 0.21). Genome-wide significant evidence for linkage was observed on chromosome 6. Genome-wide suggestive evidence for linkage was observed on chromosomes 13 and 20 when conditioning on age at onset, chromosome 15 conditional on gender, and chromosome 19 conditional on RF IgM after allowing for multiple testing of covariates.
  • Segurado, R., Hamshere, M. L., Glaser, B., Nikolov, I., Moskvina, V., & Holmans, P. A. (2007). Combining linkage data sets for meta-analysis and mega-analysis: the GAW15 rheumatoid arthritis data set. BMC Proceedings, 1(Suppl 1): S104.

    Abstract

    We have used the genome-wide marker genotypes from Genetic Analysis Workshop 15 Problem 2 to explore joint evidence for genetic linkage to rheumatoid arthritis across several samples. The data consisted of four high-density genome scans on samples selected for rheumatoid arthritis. We cleaned the data, removed intermarker linkage disequilibrium, and assembled the samples onto a common genetic map using genome sequence positions as a reference for map interpolation. The individual studies were combined first at the genotype level (mega-analysis) prior to a multipoint linkage analysis on the combined sample, and second using the genome scan meta-analysis method after linkage analysis of each sample. The two approaches were compared, and give strong support to the HLA locus on chromosome 6 as a susceptibility locus. Other regions of interest include loci on chromosomes 11, 2, and 12.
  • Ziegler, A., DeStefano, A. L., König, I. R., Bardel, C., Brinza, D., Bull, S., Cai, Z., Glaser, B., Jiang, W., Lee, K. E., Li, C. X., Li, J., Li, X., Majoram, P., Meng, Y., Nicodemus, K. K., Platt, A., Schwarz, D. F., Shi, W., Shugart, Y. Y., Stassen, H. H., Sun, Y. V., Won, S., Wang, W., Wahba, G., Zagaar, U. A., & Zhao, Z. (2007). Data mining, neural nets, trees–problems 2 and 3 of Genetic Analysis Workshop 15. Genetic Epidemiology, 31(Suppl 1), S51-S60. doi:10.1002/gepi.20280.

    Abstract

    Genome-wide association studies using thousands to hundreds of thousands of single nucleotide polymorphism (SNP) markers and region-wide association studies using a dense panel of SNPs are already in use to identify disease susceptibility genes and to predict disease risk in individuals. Because these tasks become increasingly important, three different data sets were provided for the Genetic Analysis Workshop 15, thus allowing examination of various novel and existing data mining methods for both classification and identification of disease susceptibility genes, gene by gene or gene by environment interaction. The approach most often applied in this presentation group was random forests because of its simplicity, elegance, and robustness. It was used for prediction and for screening for interesting SNPs in a first step. The logistic tree with unbiased selection approach appeared to be an interesting alternative to efficiently select interesting SNPs. Machine learning, specifically ensemble methods, might be useful as pre-screening tools for large-scale association studies because they can be less prone to overfitting, can be less computer processor time intensive, can easily include pair-wise and higher-order interactions compared with standard statistical approaches and can also have a high capability for classification. However, improved implementations that are able to deal with hundreds of thousands of SNPs at a time are required.
