Beate St Pourcain

Publications

  • Black, M. H., Buitelaar, J., Charman, T., Ecker, C., Gallagher, L., Hens, K., Jones, E., Murphy, D., Sadaka, Y., Schaer, M., St Pourcain, B., Wolke, D., Bonnot-Briey, S., Bougeron, T., & Bölte, S. (2024). A conceptual framework for data harmonization in mental health using the International Classification of Functioning Disability and Health (ICF): An example with the R2D2-MH Consortium. BMJ Mental Health, 27(1): e301283. doi:10.1136/bmjment-2024-301283.

    Abstract

    Introduction Advancing research and support for neurologically diverse populations requires novel data harmonisation methods that are capable of aligning with contemporary approaches to understanding health and disability.

    Objectives We present the International Classification of Functioning, Disability and Health (ICF) as a conceptual framework to support harmonisation of mental health data and present a proof of principle within the Risk and Resilience in Developmental Diversity and Mental Health (R2D2-MH) consortium.

    Method 138 measures from various mental health datasets were linked to the ICF following the WHO’s established linking rules.

    Findings Findings support the notion that the ICF can assist in the harmonisation of mental health data. The high level of shared ICF codes provides indications of where items may be readily harmonised to develop datasets that may align more readily with contemporary approaches to understanding health and disability. Although the linking process necessarily entails an element of subjectivity, the application of established rules can increase rigour and transparency of the harmonisation process.

    Conclusions We present the first steps towards data harmonisation in mental health that is compatible with contemporary approaches in psychiatry, being more capable of capturing diversity and aligning with more transdiagnostic and neurodiversity-affirmative ways of understanding data.

    Clinical implications Our findings show promise, but future work is needed to address quantitative harmonisation. Similarly, issues related to the traditionally ‘pathophysiological’ frameworks that existing datasets are often embedded in can hinder the full potential of harmonisation based on the ICF.

    Additional information

    data supplement
  • Hegemann, L., Corfield, E. C., Askelund, A. D., Allegrini, A. G., Askeland, R. B., Ronald, A., Ask, H., St Pourcain, B., Andreassen, O. A., Hannigan, L. J., & Havdahl, A. (2024). Genetic and phenotypic heterogeneity in early neurodevelopmental traits in the Norwegian Mother, Father and Child Cohort Study. Molecular Autism, 15: 25. doi:10.1186/s13229-024-00599-0.

    Abstract

    Background
    Autism and different neurodevelopmental conditions frequently co-occur, as do their symptoms at sub-diagnostic threshold levels. Overlapping traits and shared genetic liability are potential explanations.

    Methods
    In the population-based Norwegian Mother, Father, and Child Cohort study (MoBa), we leverage item-level data to explore the phenotypic factor structure and genetic architecture underlying neurodevelopmental traits at age 3 years (N = 41,708–58,630) using maternal reports on 76 items assessing children’s motor and language development, social functioning, communication, attention, activity regulation, and flexibility of behaviors and interests.

    Results
    We identified 11 latent factors at the phenotypic level. These factors showed associations with diagnoses of autism and other neurodevelopmental conditions. Most shared genetic liabilities with autism, ADHD, and/or schizophrenia. Item-level GWAS revealed trait-specific genetic correlations with autism (items rg range = −0.27–0.78), ADHD (items rg range = −0.40–1), and schizophrenia (items rg range = −0.24–0.34). We find little evidence of common genetic liability across all neurodevelopmental traits but more so for several genetic factors across more specific areas of neurodevelopment, particularly social and communication traits. Some of these factors, such as one capturing prosocial behavior, overlap with factors found in the phenotypic analyses. Other areas, such as motor development, seemed to have more heterogeneous etiology, with specific traits showing a less consistent pattern of genetic correlations with each other.

    Conclusions
    These exploratory findings emphasize the etiological complexity of neurodevelopmental traits at this early age. In particular, diverse associations with neurodevelopmental conditions and genetic heterogeneity could inform follow-up work to identify shared and differentiating factors in the early manifestations of neurodevelopmental traits and their relation to autism and other neurodevelopmental conditions. This in turn could have implications for clinical screening tools and programs.
  • De Hoyos, L., Barendse, M. T., Schlag, F., Van Donkelaar, M. M. J., Verhoef, E., Shapland, C. Y., Klassmann, A., Buitelaar, J., Verhulst, B., Fisher, S. E., Rai, D., & St Pourcain, B. (2024). Structural models of genome-wide covariance identify multiple common dimensions in autism. Nature Communications, 15: 1770. doi:10.1038/s41467-024-46128-8.

    Abstract

    Common genetic variation has been associated with multiple symptoms in Autism Spectrum Disorder (ASD). However, our knowledge of shared genetic factor structures contributing to this highly heterogeneous neurodevelopmental condition is limited. Here, we developed a structural equation modelling framework to directly model genome-wide covariance across core and non-core ASD phenotypes, studying autistic individuals of European descent using a case-only design. We identified three independent genetic factors most strongly linked to language/cognition, behaviour and motor development, respectively, when studying a population-representative sample (N=5,331). These analyses revealed novel associations. For example, developmental delay in acquiring personal-social skills was inversely related to language, while developmental motor delay was linked to self-injurious behaviour. We largely confirmed the three-factorial structure in independent ASD-simplex families (N=1,946), but uncovered simplex-specific genetic overlap between behaviour and language phenotypes. Thus, the common genetic architecture in ASD is multi-dimensional and contributes, in combination with ascertainment-specific patterns, to phenotypic heterogeneity.
  • Knol, M. J., Poot, R. A., Evans, T. E., Satizabal, C. L., Mishra, A., Sargurupremraj, M., Van der Auwera, S., Duperron, M.-G., Jian, X., Hostettler, I. C., Van Dam-Nolen, D. H. K., Lamballais, S., Pawlak, M. A., Lewis, C. E., Carrion Castillo, A., Van Erp, T. G. M., Reinbold, C. S., Shin, J., Sholz, M., Håberg, A. K., Kämpe, A., Li, G. H. Y., Avinun, R., Atkins, J. R., Hsu, F.-C., Amod, A. R., Lam, M., Tsuchida, A., Teunissen, M. W. A., Aygün, N., Patel, Y., Liang, D., Beiser, A. S., Beyer, F., Bis, J. C., Bos, D., Bryan, R. N., Bülow, R., Caspers, S., Catheline, G., Cecil, C. A. M., Dalvie, S., Dartigues, J.-F., DeCarli, C., Enlund-Cerullo, M., Ford, J. M., Franke, B., Freedman, B. I., Friedrich, N., Green, M. J., Haworth, S., Helmer, C., Hoffmann, P., Homuth, G., Ikram, M. K., Jack, C. R., Jahanshad, N., Jockwitz, C., Kamatani, Y., Knodt, A. R., Li, S., Lim, K., Longstreth, W. T., Macciardi, F., The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, The Enhancing Neuroimaging Genetics through Meta-Analysis (ENIGMA) Consortium, Mäkitie, O., Mazoyer, B., Medland, S. E., Miyamoto, S., Moebus, S., Mosley, T. H., Muetzel, R., Mühleisen, T. W., Nagata, M., Nakahara, S., Palmer, N. D., Pausova, Z., Preda, A., Quidé, Y., Reay, W. R., Roshchupkin, G. V., Schmidt, R., Schreiner, P. J., Setoh, K., Shapland, C. Y., Sidney, S., St Pourcain, B., Stein, J. L., Tabara, Y., Teumer, A., Uhlmann, A., Van de Lught, A., Vernooij, M. W., Werring, D. J., Windham, B. G., Witte, A. V., Wittfeld, K., Yang, Q., Yoshida, K., Brunner, H. G., Le Grand, Q., Sim, K., Stein, D. J., Bowden, D. W., Cairns, M. J., Hariri, A. R., Cheung, C.-L., Andersson, S., Villringer, A., Paus, T., Chichon, S., Calhoun, V. D., Crivello, F., Launer, L. J., White, T., Koudstaal, P. J., Houlden, H., Fornage, M., Matsuda, F., Grabe, H. J., Ikram, M. A., Debette, S., Thompson, P. M., Seshadri, S., & Adams, H. H. H. (2024). Genetic variants for head size share genes and pathways with cancer. Cell Reports Medicine, 5(5): 101529. doi:10.1016/j.xcrm.2024.101529.

    Abstract

    The size of the human head is highly heritable, but genetic drivers of its variation within the general population remain unmapped. We perform a genome-wide association study on head size (N = 80,890) and identify 67 genetic loci, of which 50 are novel. Neuroimaging studies show that 17 variants affect specific brain areas, but most have widespread effects. Gene set enrichment is observed for various cancers and the p53, Wnt, and ErbB signaling pathways. Genes harboring lead variants are enriched for macrocephaly syndrome genes (37-fold) and high-fidelity cancer genes (9-fold), which is not seen for human height variants. Head size variants are also near genes preferentially expressed in intermediate progenitor cells, neural cells linked to evolutionary brain expansion. Our results indicate that genes regulating early brain and cranial growth incline to neoplasia later in life, irrespective of height. This warrants investigation of clinical implications of the link between head size and cancer.

    Additional information

    link to supplemental information
  • Verhoef, E., Allegrini, A. G., Jansen, P. R., Lange, K., Wang, C. A., Morgan, A. T., Ahluwalia, T. S., Symeonides, C., EAGLE-Working Group, Eising, E., Franken, M.-C., Hypponen, E., Mansell, T., Olislagers, M., Omerovic, E., Rimfeld, K., Schlag, F., Selzam, S., Shapland, C. Y., Tiemeier, H., Whitehouse, A. J. O., Saffery, R., Bønnelykke, K., Reilly, S., Pennell, C. E., Wake, M., Cecil, C. A., Plomin, R., Fisher, S. E., & St Pourcain, B. (2024). Genome-wide analyses of vocabulary size in infancy and toddlerhood: Associations with Attention-Deficit/Hyperactivity Disorder and cognition-related traits. Biological Psychiatry, 95(1), 859-869. doi:10.1016/j.biopsych.2023.11.025.

    Abstract

    Background

    The number of words children produce (expressive vocabulary) and understand (receptive vocabulary) changes rapidly during early development, partially due to genetic factors. Here, we performed a meta–genome-wide association study of vocabulary acquisition and investigated polygenic overlap with literacy, cognition, developmental phenotypes, and neurodevelopmental conditions, including attention-deficit/hyperactivity disorder (ADHD).

    Methods

    We studied 37,913 parent-reported vocabulary size measures (English, Dutch, Danish) for 17,298 children of European descent. Meta-analyses were performed for early-phase expressive (infancy, 15–18 months), late-phase expressive (toddlerhood, 24–38 months), and late-phase receptive (toddlerhood, 24–38 months) vocabulary. Subsequently, we estimated single nucleotide polymorphism–based heritability (SNP-h2) and genetic correlations (rg) and modeled underlying factor structures with multivariate models.

    Results

    Early-life vocabulary size was modestly heritable (SNP-h2 = 0.08–0.24). Genetic overlap between infant expressive and toddler receptive vocabulary was negligible (rg = 0.07), although each measure was moderately related to toddler expressive vocabulary (rg = 0.69 and rg = 0.67, respectively), suggesting a multifactorial genetic architecture. Both infant and toddler expressive vocabulary were genetically linked to literacy (e.g., spelling: rg = 0.58 and rg = 0.79, respectively), underlining genetic similarity. However, a genetic association of early-life vocabulary with educational attainment and intelligence emerged only during toddlerhood (e.g., receptive vocabulary and intelligence: rg = 0.36). Increased ADHD risk was genetically associated with larger infant expressive vocabulary (rg = 0.23). Multivariate genetic models in the ALSPAC (Avon Longitudinal Study of Parents and Children) cohort confirmed this finding for ADHD symptoms (e.g., at age 13; rg = 0.54) but showed that the association effect reversed for toddler receptive vocabulary (rg = −0.74), highlighting developmental heterogeneity.

    Conclusions

    The genetic architecture of early-life vocabulary changes during development, shaping polygenic association patterns with later-life ADHD, literacy, and cognition-related traits.
  • Nivard, M. G., Gage, S. H., Hottenga, J. J., van Beijsterveldt, C. E. M., Abdellaoui, A., Bartels, M., Baselmans, B. M. L., Ligthart, L., St Pourcain, B., Boomsma, D. I., Munafò, M. R., & Middeldorp, C. M. (2017). Genetic overlap between schizophrenia and developmental psychopathology: Longitudinal and multivariate polygenic risk prediction of common psychiatric traits during development. Schizophrenia Bulletin, 43(6), 1197-1207. doi:10.1093/schbul/sbx031.

    Abstract

    Background: Several nonpsychotic psychiatric disorders in childhood and adolescence can precede the onset of schizophrenia, but the etiology of this relationship remains unclear. We investigated to what extent the association between schizophrenia and psychiatric disorders in childhood is explained by correlated genetic risk factors. Methods: Polygenic risk scores (PRS), reflecting an individual’s genetic risk for schizophrenia, were constructed for 2588 children from the Netherlands Twin Register (NTR) and 6127 from the Avon Longitudinal Study of Parents And Children (ALSPAC). The associations between schizophrenia PRS and measures of anxiety, depression, attention deficit hyperactivity disorder (ADHD), and oppositional defiant disorder/conduct disorder (ODD/CD) were estimated at age 7, 10, 12/13, and 15 years in the 2 cohorts. Results were then meta-analyzed, and a meta-regression analysis was performed to test differences in effect sizes over age and disorders. Results: Schizophrenia PRS were associated with childhood and adolescent psychopathology. Meta-regression analysis showed differences in the associations over disorders, with the strongest association with childhood and adolescent depression and a weaker association for ODD/CD at age 7. The associations increased with age and this increase was steepest for ADHD and ODD/CD. Genetic correlations varied between 0.10 and 0.25. Conclusion: By optimally using longitudinal data across diagnoses in a multivariate meta-analysis this study sheds light on the development of childhood disorders into severe adult psychiatric disorders. The results are consistent with a common genetic etiology of schizophrenia and developmental psychopathology as well as with a stronger shared genetic etiology between schizophrenia and adolescent onset psychopathology.
  • Nivard, M. G., Lubke, G. H., Dolan, C. V., Evans, D. M., St Pourcain, B., Munafo, M. R., & Middeldorp, C. M. (2017). Joint developmental trajectories of internalizing and externalizing disorders between childhood and adolescence. Development and Psychopathology, 29(3), 919-928. doi:10.1017/S0954579416000572.

    Abstract

    This study sought to identify trajectories of DSM-IV based internalizing (INT) and externalizing (EXT) problem scores across childhood and adolescence and to provide insight into the comorbidity by modeling the co-occurrence of INT and EXT trajectories. INT and EXT were measured repeatedly between age 7 and age 15 years in over 7,000 children and analyzed using growth mixture models. Five trajectories were identified for both INT and EXT, including very low, low, decreasing, and increasing trajectories. In addition, an adolescent onset trajectory was identified for INT and a stable high trajectory was identified for EXT. Multinomial regression showed that similar EXT and INT trajectories were associated. However, the adolescent onset INT trajectory was independent of high EXT trajectories, and persisting EXT was mainly associated with decreasing INT. Sex and early life environmental risk factors predicted EXT and, to a lesser extent, INT trajectories. The association between trajectories indicates the need to consider comorbidity when a child presents with INT or EXT disorders, particularly when symptoms start early. This is less necessary when INT symptoms start at adolescence. Future studies should investigate the etiology of co-occurring INT and EXT and the specific treatment needs of these severely affected children.
  • Stergiakouli, E., Martin, J., Hamshere, M. L., Heron, J., St Pourcain, B., Timpson, N. J., Thapar, A., & Smith, G. D. (2017). Association between polygenic risk scores for attention-deficit hyperactivity disorder and educational and cognitive outcomes in the general population. International Journal of Epidemiology, 46(2), 421-428. doi:10.1093/ije/dyw216.

    Abstract

    Background: Children with a diagnosis of attention-deficit hyperactivity disorder (ADHD) have lower cognitive ability and are at risk of adverse educational outcomes; ADHD genetic risks have been found to predict childhood cognitive ability and other neurodevelopmental traits in the general population; thus genetic risks might plausibly also contribute to cognitive ability later in development and to educational underachievement.

    Methods: We generated ADHD polygenic risk scores in the Avon Longitudinal Study of Parents and Children participants (maximum N: 6928 children and 7280 mothers) based on the results of a discovery clinical sample, a genome-wide association study of 727 cases with ADHD diagnosis and 5081 controls. We tested if ADHD polygenic risk scores were associated with educational outcomes and IQ in adolescents and their mothers.

    Results: High ADHD polygenic scores in adolescents were associated with worse educational outcomes at Key Stage 3 [national tests conducted at age 13–14 years; β = −1.4 (−2.0 to −0.8), P = 2.3 × 10⁻⁶], at General Certificate of Secondary Education exams at age 15–16 years [β = −4.0 (−6.1 to −1.9), P = 1.8 × 10⁻⁴], reduced odds of sitting Key Stage 5 examinations at age 16–18 years [odds ratio (OR) = 0.90 (0.88 to 0.97), P = 0.001] and lower IQ scores at age 15.5 [β = −0.8 (−1.2 to −0.4), P = 2.4 × 10⁻⁴]. Moreover, maternal ADHD polygenic scores were associated with lower maternal educational achievement [β = −0.09 (−0.10 to −0.06), P = 0.005] and lower maternal IQ [β = −0.6 (−1.2 to −0.1), P = 0.03].

    Conclusions: ADHD diagnosis risk alleles impact on functional outcomes in two generations (mother and child) and likely have intergenerational environmental effects.
  • Stergiakouli, E., Smith, G. D., Martin, J., Skuse, D. H., Viechtbauer, W., Ring, S. M., Ronald, A., Evans, D. E., Fisher, S. E., Thapar, A., & St Pourcain, B. (2017). Shared genetic influences between dimensional ASD and ADHD symptoms during child and adolescent development. Molecular Autism, 8: 18. doi:10.1186/s13229-017-0131-2.

    Abstract

    Background: Shared genetic influences between attention-deficit/hyperactivity disorder (ADHD) symptoms and autism spectrum disorder (ASD) symptoms have been reported. Cross-trait genetic relationships are, however, subject to dynamic changes during development. We investigated the continuity of genetic overlap between ASD and ADHD symptoms in a general population sample during childhood and adolescence. We also studied uni- and cross-dimensional trait-disorder links with respect to genetic ADHD and ASD risk.

    Methods: Social-communication difficulties (N ≤ 5551, Social and Communication Disorders Checklist, SCDC) and combined hyperactive-impulsive/inattentive ADHD symptoms (N ≤ 5678, Strengths and Difficulties Questionnaire, SDQ-ADHD) were repeatedly measured in a UK birth cohort (ALSPAC, age 7 to 17 years). Genome-wide summary statistics on clinical ASD (5305 cases; 5305 pseudo-controls) and ADHD (4163 cases; 12,040 controls/pseudo-controls) were available from the Psychiatric Genomics Consortium. Genetic trait variances and genetic overlap between phenotypes were estimated using genome-wide data.

    Results: In the general population, genetic influences for SCDC and SDQ-ADHD scores were shared throughout development. Genetic correlations across traits reached a similar strength and magnitude (cross-trait rg ≤ 1, pmin = 3 × 10⁻⁴) as those between repeated measures of the same trait (within-trait rg ≤ 0.94, pmin = 7 × 10⁻⁴). Shared genetic influences between traits, especially during later adolescence, may implicate variants in K-RAS signalling upregulated genes (p-meta = 6.4 × 10⁻⁴). Uni-dimensionally, each population-based trait mapped to the expected behavioural continuum: risk-increasing alleles for clinical ADHD were persistently associated with SDQ-ADHD scores throughout development (marginal regression R² = 0.084%). An age-specific genetic overlap between clinical ASD and social-communication difficulties during childhood was also shown, as per previous reports. Cross-dimensionally, however, neither SCDC nor SDQ-ADHD scores were linked to genetic risk for disorder.

    Conclusions: In the general population, genetic aetiologies between social-communication difficulties and ADHD symptoms are shared throughout child and adolescent development and may implicate similar biological pathways that co-vary during development. Within both the ASD and the ADHD dimension, population-based traits are also linked to clinical disorder, although much larger clinical discovery samples are required to reliably detect cross-dimensional trait-disorder relationships.
  • Tachmazidou, I., Süveges, D., Min, J. L., Ritchie, G. R. S., Steinberg, J., Walter, K., Iotchkova, V., Schwartzentruber, J., Huang, J., Memari, Y., McCarthy, S., Crawford, A. A., Bombieri, C., Cocca, M., Farmaki, A.-E., Gaunt, T. R., Jousilahti, P., Kooijman, M. N., Lehne, B., Malerba, G., Männistö, S., Matchan, A., Medina-Gomez, C., Metrustry, S. J., Nag, A., Ntalla, I., Paternoster, L., Rayner, N. W., Sala, C., Scott, W. R., Shihab, H. A., Southam, L., St Pourcain, B., Traglia, M., Trajanoska, K., Zaza, G., Zhang, W., Artigas, M. S., Bansal, N., Benn, M., Chen, Z., Danecek, P., Lin, W.-Y., Locke, A., Luan, J., Manning, A. K., Mulas, A., Sidore, C., Tybjaerg-Hansen, A., Varbo, A., Zoledziewska, M., Finan, C., Hatzikotoulas, K., Hendricks, A. E., Kemp, J. P., Moayyeri, A., Panoutsopoulou, K., Szpak, M., Wilson, S. G., Boehnke, M., Cucca, F., Di Angelantonio, E., Langenberg, C., Lindgren, C., McCarthy, M. I., Morris, A. P., Nordestgaard, B. G., Scott, R. A., Tobin, M. D., Wareham, N. J., Burton, P., Chambers, J. C., Smith, G. D., Dedoussis, G., Felix, J. F., Franco, O. H., Gambaro, G., Gasparini, P., Hammond, C. J., Hofman, A., Jaddoe, V. W. V., Kleber, M., Kooner, J. S., Perola, M., Relton, C., Ring, S. M., Rivadeneira, F., Salomaa, V., Spector, T. D., Stegle, O., Toniolo, D., Uitterlinden, A. G., Barroso, I., Greenwood, C. M. T., Perry, J. R. B., Walker, B. R., Butterworth, A. S., Xue, Y., Durbin, R., Small, K. S., Soranzo, N., Timpson, N. J., & Zeggini, E. (2017). Whole-Genome Sequencing coupled to imputation discovers genetic signals for anthropometric traits. The American Journal of Human Genetics, 100(6), 865-884. doi:10.1016/j.ajhg.2017.04.014.

    Abstract

    Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader allelic architecture of 12 anthropometric traits associated with height, body mass, and fat distribution in up to 267,616 individuals. We report 106 genome-wide significant signals that have not been previously identified, including 9 low-frequency variants pointing to functional candidates. Of the 106 signals, 6 are in genomic regions that have not been implicated with related traits before, 28 are independent signals at previously reported regions, and 72 represent previously reported signals for a different anthropometric trait. 71% of signals reside within genes and fine mapping resolves 23 signals to one or two likely causal variants. We confirm genetic overlap between human monogenic and polygenic anthropometric traits and find signal enrichment in cis expression QTLs in relevant tissues. Our results highlight the potential of WGS strategies to enhance biologically relevant discoveries across the frequency spectrum.
  • Glaser, B., Nikolov, I., Chubb, D., Hamshere, M. L., Segurado, R., Moskvina, V., & Holmans, P. (2007). Analyses of single marker and pairwise effects of candidate loci for rheumatoid arthritis using logistic regression and random forests. BMC Proceedings, 1(Suppl 1): 54.

    Abstract

    Using parametric and nonparametric techniques, our study investigated the presence of single locus and pairwise effects between 20 markers of the Genetic Analysis Workshop 15 (GAW15) North American Rheumatoid Arthritis Consortium (NARAC) candidate gene data set (Problem 2), analyzing 463 independent patients and 855 controls. Specifically, our work examined the correspondence between logistic regression (LR) analysis of single-locus and pairwise interaction effects, and random forest (RF) single and joint importance measures. For this comparison, we selected small but stable RFs (500 trees), which showed strong correlations (r~0.98) between their importance measures and those by RFs grown on 5000 trees. Both RF importance measures captured most of the LR single-locus and pairwise interaction effects, while joint importance measures also corresponded to full LR models containing main and interaction effects. We furthermore showed that RF measures were particularly sensitive to data imputation. The most consistent pairwise effect on rheumatoid arthritis was found between two markers within MAP3K7IP2/SUMO4 on 6q25.1, although LR and RFs assigned different significance levels. Within a hypothetical two-stage design, pairwise LR analysis of all markers with significant RF single importance would have reduced the number of possible combinations in our small data set by 61%, whereas joint importance measures would have been less efficient for marker pair reduction. This suggests that RF single importance measures, which are able to detect a wide range of interaction effects and are computationally very efficient, might be exploited as a pre-screening tool for larger association studies. Follow-up analysis, such as by LR, is required since RFs do not indicate high-risk genotype combinations.
  • Hamshere, M. L., Segurado, R., Moskvina, V., Nikolov, I., Glaser, B., & Holmans, P. A. (2007). Large-scale linkage analysis of 1302 affected relative pairs with rheumatoid arthritis. BMC Proceedings, 1(Suppl 1): S100.

    Abstract

    Rheumatoid arthritis is the most common systematic autoimmune disease and its etiology is believed to have both strong genetic and environmental components. We demonstrate the utility of including genetic and clinical phenotypes as covariates within a linkage analysis framework to search for rheumatoid arthritis susceptibility loci. The raw genotypes of 1302 affected relative pairs were combined from four large family-based samples (North American Rheumatoid Arthritis Consortium, United Kingdom, European Consortium on Rheumatoid Arthritis Families, and Canada). The familiality of the clinical phenotypes was assessed. The affected relative pairs were subjected to autosomal multipoint affected relative-pair linkage analysis. Covariates were included in the linkage analysis to take account of heterogeneity within the sample. Evidence of familiality was observed with age at onset (p < 0.001) and rheumatoid factor (RF) IgM (p < 0.001), but not definite erosions (p = 0.21). Genome-wide significant evidence for linkage was observed on chromosome 6. Genome-wide suggestive evidence for linkage was observed on chromosomes 13 and 20 when conditioning on age at onset, chromosome 15 conditional on gender, and chromosome 19 conditional on RF IgM after allowing for multiple testing of covariates.
  • Segurado, R., Hamshere, M. L., Glaser, B., Nikolov, I., Moskvina, V., & Holmans, P. A. (2007). Combining linkage data sets for meta-analysis and mega-analysis: the GAW15 rheumatoid arthritis data set. BMC Proceedings, 1(Suppl 1): S104.

    Abstract

    We have used the genome-wide marker genotypes from Genetic Analysis Workshop 15 Problem 2 to explore joint evidence for genetic linkage to rheumatoid arthritis across several samples. The data consisted of four high-density genome scans on samples selected for rheumatoid arthritis. We cleaned the data, removed intermarker linkage disequilibrium, and assembled the samples onto a common genetic map using genome sequence positions as a reference for map interpolation. The individual studies were combined first at the genotype level (mega-analysis) prior to a multipoint linkage analysis on the combined sample, and second using the genome scan meta-analysis method after linkage analysis of each sample. The two approaches were compared, and give strong support to the HLA locus on chromosome 6 as a susceptibility locus. Other regions of interest include loci on chromosomes 11, 2, and 12.
  • Ziegler, A., DeStefano, A. L., König, I. R., Bardel, C., Brinza, D., Bull, S., Cai, Z., Glaser, B., Jiang, W., Lee, K. E., Li, C. X., Li, J., Li, X., Majoram, P., Meng, Y., Nicodemus, K. K., Platt, A., Schwarz, D. F., Shi, W., Shugart, Y. Y., Stassen, H. H., Sun, Y. V., Won, S., Wang, W., Wahba, G., Zagaar, U. A., & Zhao, Z. (2007). Data mining, neural nets, trees–problems 2 and 3 of Genetic Analysis Workshop 15. Genetic Epidemiology, 31(Suppl 1), S51-S60. doi:10.1002/gepi.20280.

    Abstract

    Genome-wide association studies using thousands to hundreds of thousands of single nucleotide polymorphism (SNP) markers and region-wide association studies using a dense panel of SNPs are already in use to identify disease susceptibility genes and to predict disease risk in individuals. Because these tasks become increasingly important, three different data sets were provided for the Genetic Analysis Workshop 15, thus allowing examination of various novel and existing data mining methods for both classification and identification of disease susceptibility genes, gene by gene or gene by environment interaction. The approach most often applied in this presentation group was random forests because of its simplicity, elegance, and robustness. It was used for prediction and for screening for interesting SNPs in a first step. The logistic tree with unbiased selection approach appeared to be an interesting alternative to efficiently select interesting SNPs. Machine learning, specifically ensemble methods, might be useful as pre-screening tools for large-scale association studies because they can be less prone to overfitting, can be less computer processor time intensive, can easily include pair-wise and higher-order interactions compared with standard statistical approaches and can also have a high capability for classification. However, improved implementations that are able to deal with hundreds of thousands of SNPs at a time are required.
