Beate St Pourcain

Publications

  • Guggenheim, J. A., Northstone, K., McMahon, G., Ness, A. R., Deere, K., Mattocks, C., St Pourcain, B., & Williams, C. (2012). Time outdoors and physical activity as predictors of incident myopia in childhood: a prospective cohort study. Investigative Ophthalmology and Visual Science, 53(6), 2856-2865. doi:10.1167/iovs.11-9091.

    Abstract

    PURPOSE: Time spent in "sports/outdoor activity" has shown a negative association with incident myopia during childhood. We investigated the association of incident myopia with time spent outdoors and physical activity separately. METHODS: Participants in the Avon Longitudinal Study of Parents and Children (ALSPAC) were assessed by noncycloplegic autorefraction at ages 7, 10, 11, 12, and 15 years, and classified as myopic (≤-1 diopters) or as emmetropic/hyperopic (≥-0.25 diopters) at each visit (N = 4,837-7,747). Physical activity at age 11 years was measured objectively using an accelerometer, worn for 1 week. Time spent outdoors was assessed via a parental questionnaire administered when children were aged 8-9 years. Variables associated with incident myopia were examined using Cox regression. RESULTS: In analyses using all available data, both time spent outdoors and physical activity were associated with incident myopia, with time outdoors having the larger effect. The results were similar for analyses restricted to children classified as either nonmyopic or emmetropic/hyperopic at age 11 years. Thus, for children nonmyopic at age 11, the hazard ratio (95% confidence interval, CI) for incident myopia was 0.66 (0.47-0.93) for a high versus low amount of time spent outdoors, and 0.87 (0.76-0.99) per unit standard deviation above average increase in moderate/vigorous physical activity. CONCLUSION: Time spent outdoors was predictive of incident myopia independently of physical activity level. The greater association observed for time outdoors suggests that the previously reported link between "sports/outdoor activity" and incident myopia is due mainly to its capture of information relating to time outdoors rather than physical activity.
  • Ikram, M. A., Fornage, M., Smith, A. V., Seshadri, S., Schmidt, R., Debette, S., Vrooman, H. A., Sigurdsson, S., Ropele, S., Taal, H. R., Mook-Kanamori, D. O., Coker, L. H., Longstreth, W. T., Niessen, W. J., DeStefano, A. L., Beiser, A., Zijdenbos, A. P., Struchalin, M., Jack, C. R., Rivadeneira, F., Uitterlinden, A. G., Knopman, D. S., Hartikainen, A.-L., Pennell, C. E., Thiering, E., Steegers, E. A. P., Hakonarson, H., Heinrich, J., Palmer, L. J., Jarvelin, M.-R., McCarthy, M. I., Grant, S. F. A., St Pourcain, B., Timpson, N. J., Smith, G. D., Sovio, U., Nalls, M. A., Au, R., Hofman, A., Gudnason, H., van der Lugt, A., Harris, T. B., Meeks, W. M., Vernooij, M. W., van Buchem, M. A., Catellier, D., Jaddoe, V. W. V., Gudnason, V., Windham, B. G., Wolf, P. A., van Duijn, C. M., Mosley, T. H., Schmidt, H., Launer, L. J., Breteler, M. M. B., DeCarli, C., the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, & Early Growth Genetics (EGG) Consortium (2012). Common variants at 6q22 and 17q21 are associated with intracranial volume. Nature Genetics, 44(5), 539-544. doi:10.1038/ng.2245.

    Abstract

    During aging, intracranial volume remains unchanged and represents maximally attained brain size, while various interacting biological phenomena lead to brain volume loss. Consequently, intracranial volume and brain volume in late life reflect different genetic influences. Our genome-wide association study (GWAS) in 8,175 community-dwelling elderly persons did not reveal any associations at genome-wide significance (P < 5 × 10(-8)) for brain volume. In contrast, intracranial volume was significantly associated with two loci: rs4273712 (P = 3.4 × 10(-11)), a known height-associated locus on chromosome 6q22, and rs9915547 (P = 1.5 × 10(-12)), localized to the inversion on chromosome 17q21. We replicated the associations of these loci with intracranial volume in a separate sample of 1,752 elderly persons (P = 1.1 × 10(-3) for 6q22 and 1.2 × 10(-3) for 17q21). Furthermore, we also found suggestive associations of the 17q21 locus with head circumference in 10,768 children (mean age of 14.5 months). Our data identify two loci associated with head size, with the inversion at 17q21 also likely to be involved in attaining maximal brain size.
  • Paternoster, L., Zhurov, A., Toma, A., Kemp, J., St Pourcain, B., Timpson, N., McMahon, G., McArdle, W., Ring, S., Smith, G., Richmond, S., & Evans, D. (2012). Genome-wide Association Study of Three-Dimensional Facial Morphology Identifies a Variant in PAX3 Associated with Nasion Position. The American Journal of Human Genetics, 90(3), 478-485. doi:10.1016/j.ajhg.2011.12.021.

    Abstract

    Craniofacial morphology is highly heritable, but little is known about which genetic variants influence normal facial variation in the general population. We aimed to identify genetic variants associated with normal facial variation in a population-based cohort of 15-year-olds from the Avon Longitudinal Study of Parents and Children. 3D high-resolution images were obtained with two laser scanners, these were merged and aligned, and 22 landmarks were identified and their x, y, and z coordinates used to generate 54 3D distances reflecting facial features. 14 principal components (PCs) were also generated from the landmark locations. We carried out genome-wide association analyses of these distances and PCs in 2,185 adolescents and attempted to replicate any significant associations in a further 1,622 participants. In the discovery analysis no associations were observed with the PCs, but we identified four associations with the distances, and one of these, the association between rs7559271 in PAX3 and the nasion to midendocanthion distance (n-men), was replicated (p = 4 × 10(-7)). In a combined analysis, each G allele of rs7559271 was associated with an increase in n-men distance of 0.39 mm (p = 4 × 10(-16)), explaining 1.3% of the variance. Independent associations were observed in both the z (nasion prominence) and y (nasion height) dimensions (p = 9 × 10(-9) and p = 9 × 10(-10), respectively), suggesting that the locus primarily influences growth in the yz plane. Rare variants in PAX3 are known to cause Waardenburg syndrome, which involves deafness, pigmentary abnormalities, and facial characteristics including a broad nasal bridge. Our findings show that common variants within this gene also influence normal craniofacial development.
  • Relton, C. L., Groom, A., St Pourcain, B., Sayers, A. E., Swan, D. C., Embleton, N. D., Pearce, M. S., Ring, S. M., Northstone, K., Tobias, J. H., Trakalo, J., Ness, A. R., Shaheen, S. O., & Davey Smith, G. (2012). DNA Methylation Patterns in Cord Blood DNA and Body Size in Childhood. PLoS ONE, 7(3): e31821. doi:10.1371/journal.pone.0031821.

    Abstract

    BACKGROUND: Epigenetic markings acquired in early life may have phenotypic consequences later in development through their role in transcriptional regulation with relevance to the developmental origins of diseases including obesity. The goal of this study was to investigate whether DNA methylation levels at birth are associated with body size later in childhood. PRINCIPAL FINDINGS: A study design involving two birth cohorts was used to conduct transcription profiling followed by DNA methylation analysis in peripheral blood. Gene expression analysis was undertaken in 24 individuals whose biological samples and clinical data were collected at a mean ± standard deviation (SD) age of 12.35 (0.95) years, the upper and lower tertiles of body mass index (BMI) were compared with a mean (SD) BMI difference of 9.86 (2.37) kg/m(2). This generated a panel of differentially expressed genes for DNA methylation analysis which was then undertaken in cord blood DNA in 178 individuals with body composition data prospectively collected at a mean (SD) age of 9.83 (0.23) years. Twenty-nine differentially expressed genes (>1.2-fold and p<10(-4)) were analysed to determine DNA methylation levels at 1-3 sites per gene. Five genes were unmethylated and DNA methylation in the remaining 24 genes was analysed using linear regression with bootstrapping. Methylation in 9 of the 24 (37.5%) genes studied was associated with at least one index of body composition (BMI, fat mass, lean mass, height) at age 9 years, although only one of these associations remained after correction for multiple testing (ALPL with height, p(Corrected) = 0.017). CONCLUSIONS: DNA methylation patterns in cord blood show some association with altered gene expression, body size and composition in childhood. The observed relationship is correlative and despite suggestion of a mechanistic epigenetic link between in utero life and later phenotype, further investigation is required to establish causality.
  • Scott, R. A., Lagou, V., Welch, R. P., Wheeler, E., Montasser, M. E., Luan, J., Mägi, R., Strawbridge, R. J., Rehnberg, E., Gustafsson, S., Kanoni, S., Rasmussen-Torvik, L. J., Yengo, L., Lecoeur, C., Shungin, D., Sanna, S., Sidore, C., Johnson, P. C. D., Jukema, J. W., Johnson, T., Mahajan, A., Verweij, N., Thorleifsson, G., Hottenga, J.-J., Shah, S., Smith, A. V., Sennblad, B., Gieger, C., Salo, P., Perola, M., Timpson, N. J., Evans, D. M., St Pourcain, B., Wu, Y., Andrews, J. S., Hui, J., Bielak, L. F., Zhao, W., Horikoshi, M., Navarro, P., Isaacs, A., O'Connell, J. R., Stirrups, K., Vitart, V., Hayward, C., Esko, T., Mihailov, E., Fraser, R. M., Fall, T., Voight, B. F., Raychaudhuri, S., Chen, H., Lindgren, C. M., Morris, A. P., Rayner, N. W., Robertson, N., Rybin, D., Liu, C.-T., Beckmann, J. S., Willems, S. M., Chines, P. S., Jackson, A. U., Kang, H. M., Stringham, H. M., Song, K., Tanaka, T., Peden, J. F., Goel, A., Hicks, A. A., An, P., Müller-Nurasyid, M., Franco-Cereceda, A., Folkersen, L., Marullo, L., Jansen, H., Oldehinkel, A. J., Bruinenberg, M., Pankow, J. S., North, K. E., Forouhi, N. G., Loos, R. J. F., Edkins, S., Varga, T. V., Hallmans, G., Oksa, H., Antonella, M., Nagaraja, R., Trompet, S., Ford, I., Bakker, S. J. L., Kong, A., Kumari, M., Gigante, B., Herder, C., Munroe, P. B., Caulfield, M., Antti, J., Mangino, M., Small, K., Miljkovic, I., Liu, Y., Atalay, M., Kiess, W., James, A. L., Rivadeneira, F., Uitterlinden, A. G., Palmer, C. N. A., Doney, A. S. F., Willemsen, G., Smit, J. H., Campbell, S., Polasek, O., Bonnycastle, L. L., Hercberg, S., Dimitriou, M., Bolton, J. L., Fowkes, G. R., Kovacs, P., Lindström, J., Zemunik, T., Bandinelli, S., Wild, S. H., Basart, H. V., Rathmann, W., Grallert, H., Maerz, W., Kleber, M. E., Boehm, B. O., Peters, A., Pramstaller, P. P., Province, M. A., Borecki, I. B., Hastie, N. D., Rudan, I., Campbell, H., Watkins, H., Farrall, M., Stumvoll, M., Ferrucci, L., Waterworth, D. M., Bergman, R. N., Collins, F. S., Tuomilehto, J., Watanabe, R. M., de Geus, E. J. C., Penninx, B. W., Hofman, A., Oostra, B. A., Psaty, B. M., Vollenweider, P., Wilson, J. F., Wright, A. F., Hovingh, G. K., Metspalu, A., Uusitupa, M., Magnusson, P. K. E., Kyvik, K. O., Kaprio, J., Price, J. F., Dedoussis, G. V., Deloukas, P., Meneton, P., Lind, L., Boehnke, M., Shuldiner, A. R., van Duijn, C. M., Morris, A. D., Toenjes, A., Peyser, P. A., Beilby, J. P., Körner, A., Kuusisto, J., Laakso, M., Bornstein, S. R., Schwarz, P. E. H., Lakka, T. A., Rauramaa, R., Adair, L. S., Smith, G. D., Spector, T. D., Illig, T., de Faire, U., Hamsten, A., Gudnason, V., Kivimaki, M., Hingorani, A., Keinanen-Kiukaanniemi, S. M., Saaristo, T. E., Boomsma, D. I., Stefansson, K., van der Harst, P., Dupuis, J., Pedersen, N. L., Sattar, N., Harris, T. B., Cucca, F., Ripatti, S., Salomaa, V., Mohlke, K. L., Balkau, B., Froguel, P., Pouta, A., Jarvelin, M.-R., Wareham, N. J., Bouatia-Naji, N., McCarthy, M. I., Franks, P. W., Meigs, J. B., Teslovich, T. M., Florez, J. C., Langenberg, C., Ingelsson, E., Prokopenko, I., Barroso, I., & Diabetes Genetics Replication and Meta-analysis (DIAGRAM) Consortium (2012). Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways. Nature Genetics, 44(9), 991-1005. doi:10.1038/ng.2385.

    Abstract

    Through genome-wide association meta-analyses of up to 133,010 individuals of European ancestry without diabetes, including individuals newly genotyped using the Metabochip, we have increased the number of confirmed loci influencing glycemic traits to 53, of which 33 also increase type 2 diabetes risk (q < 0.05). Loci influencing fasting insulin concentration showed association with lipid levels and fat distribution, suggesting impact on insulin resistance. Gene-based analyses identified further biologically plausible loci, suggesting that additional loci beyond those reaching genome-wide significance are likely to represent real associations. This conclusion is supported by an excess of directionally consistent and nominally significant signals between discovery and follow-up studies. Functional analysis of these newly discovered loci will further improve our understanding of glycemic control.
  • Taal, H. R., St Pourcain, B., Thiering, E., Das, S., Mook-Kanamori, D. O., Warrington, N. M., Kaakinen, M., Kreiner-Møller, E., Bradfield, J. P., Freathy, R. M., Geller, F., Guxens, M., Cousminer, D. L., Kerkhof, M., Timpson, N. J., Ikram, M. A., Beilin, L. J., Bønnelykke, K., Buxton, J. L., Charoen, P., Chawes, B. L. K., Eriksson, J., Evans, D. M., Hofman, A., Kemp, J. P., Kim, C. E., Klopp, N., Lahti, J., Lye, S. J., McMahon, G., Mentch, F. D., Müller-Nurasyid, M., O'Reilly, P. F., Prokopenko, I., Rivadeneira, F., Steegers, E. A. P., Sunyer, J., Tiesler, C., Yaghootkar, H., Breteler, M. M. B., Decarli, C., Breteler, M. M. B., Debette, S., Fornage, M., Gudnason, V., Launer, L. J., van der Lugt, A., Mosley, T. H., Seshadri, S., Smith, A. V., Vernooij, M. W., Blakemore, A. I. F., Chiavacci, R. M., Feenstra, B., Fernandez-Banet, J., Grant, S. F. A., Hartikainen, A.-L., van der Heijden, A. J., Iñiguez, C., Lathrop, M., McArdle, W. L., Mølgaard, A., Newnham, J. P., Palmer, L. J., Palotie, A., Pouta, A., Ring, S. M., Sovio, U., Standl, M., Uitterlinden, A. G., Wichmann, H.-E., Vissing, N. H., DeCarli, C., van Duijn, C. M., McCarthy, M. I., Koppelman, G. H., Estivill, X., Hattersley, A. T., Melbye, M., Bisgaard, H., Pennell, C. E., Widen, E., Hakonarson, H., Smith, G. D., Heinrich, J., Jarvelin, M.-R., Jaddoe, V. W. V., The Cohorts for Heart and Aging Research in Genetic Epidemiology (CHARGE) Consortium, EArly Genetics and Lifecourse Epidemiology (EAGLE) Consortium, & Early Growth Genetics (EGG) Consortium (2012). Common variants at 12q15 and 12q24 are associated with infant head circumference. Nature Genetics, 44(5), 532-538. doi:10.1038/ng.2238.

    Abstract

    To identify genetic variants associated with head circumference in infancy, we performed a meta-analysis of seven genome-wide association studies (GWAS) (N = 10,768 individuals of European ancestry enrolled in pregnancy and/or birth cohorts) and followed up three lead signals in six replication studies (combined N = 19,089). rs7980687 on chromosome 12q24 (P = 8.1 × 10(-9)) and rs1042725 on chromosome 12q15 (P = 2.8 × 10(-10)) were robustly associated with head circumference in infancy. Although these loci have previously been associated with adult height, their effects on infant head circumference were largely independent of height (P = 3.8 × 10(-7) for rs7980687 and P = 1.3 × 10(-7) for rs1042725 after adjustment for infant height). A third signal, rs11655470 on chromosome 17q21, showed suggestive evidence of association with head circumference (P = 3.9 × 10(-6)). SNPs correlated to the 17q21 signal have shown genome-wide association with adult intracranial volume, Parkinson's disease and other neurodegenerative diseases, indicating that a common genetic variant in this region might link early brain growth with neurological disease in later life.
  • Williams, N. M., Williams, H., Majounie, E., Norton, N., Glaser, B., Morris, H. R., Owen, M. J., & O'Donovan, M. C. (2008). Analysis of copy number variation using quantitative interspecies competitive PCR. Nucleic Acids Research, 36(17): e112. doi:10.1093/nar/gkn495.

    Abstract

    Over recent years small submicroscopic DNA copy-number variants (CNVs) have been highlighted as an important source of variation in the human genome, human phenotypic diversity and disease susceptibility. Consequently, there is a pressing need for the development of methods that allow the efficient, accurate and cheap measurement of genomic copy number polymorphisms in clinical cohorts. We have developed a simple competitive PCR based method to determine DNA copy number which uses the entire genome of a single chimpanzee as a competitor thus eliminating the requirement for competitive sequences to be synthesized for each assay. This results in the requirement for only a single reference sample for all assays and dramatically increases the potential for large numbers of loci to be analysed in multiplex. In this study we establish proof of concept by accurately detecting previously characterized mutations at the PARK2 locus and then demonstrating the potential of quantitative interspecies competitive PCR (qicPCR) to accurately genotype CNVs in association studies by analysing chromosome 22q11 deletions in a sample of previously characterized patients and normal controls.
  • Glaser, B., Nikolov, I., Chubb, D., Hamshere, M. L., Segurado, R., Moskvina, V., & Holmans, P. (2007). Analyses of single marker and pairwise effects of candidate loci for rheumatoid arthritis using logistic regression and random forests. BMC Proceedings, 1(Suppl 1): 54.

    Abstract

    Using parametric and nonparametric techniques, our study investigated the presence of single locus and pairwise effects between 20 markers of the Genetic Analysis Workshop 15 (GAW15) North American Rheumatoid Arthritis Consortium (NARAC) candidate gene data set (Problem 2), analyzing 463 independent patients and 855 controls. Specifically, our work examined the correspondence between logistic regression (LR) analysis of single-locus and pairwise interaction effects, and random forest (RF) single and joint importance measures. For this comparison, we selected small but stable RFs (500 trees), which showed strong correlations (r~0.98) between their importance measures and those by RFs grown on 5000 trees. Both RF importance measures captured most of the LR single-locus and pairwise interaction effects, while joint importance measures also corresponded to full LR models containing main and interaction effects. We furthermore showed that RF measures were particularly sensitive to data imputation. The most consistent pairwise effect on rheumatoid arthritis was found between two markers within MAP3K7IP2/SUMO4 on 6q25.1, although LR and RFs assigned different significance levels. Within a hypothetical two-stage design, pairwise LR analysis of all markers with significant RF single importance would have reduced the number of possible combinations in our small data set by 61%, whereas joint importance measures would have been less efficient for marker pair reduction. This suggests that RF single importance measures, which are able to detect a wide range of interaction effects and are computationally very efficient, might be exploited as a pre-screening tool for larger association studies. Follow-up analysis, such as by LR, is required since RFs do not indicate high-risk genotype combinations.
  • Hamshere, M. L., Segurado, R., Moskvina, V., Nikolov, I., Glaser, B., & Holmans, P. A. (2007). Large-scale linkage analysis of 1302 affected relative pairs with rheumatoid arthritis. BMC Proceedings, 1 (Suppl 1), S100.

    Abstract

    Rheumatoid arthritis is the most common systemic autoimmune disease and its etiology is believed to have both strong genetic and environmental components. We demonstrate the utility of including genetic and clinical phenotypes as covariates within a linkage analysis framework to search for rheumatoid arthritis susceptibility loci. The raw genotypes of 1302 affected relative pairs were combined from four large family-based samples (North American Rheumatoid Arthritis Consortium, United Kingdom, European Consortium on Rheumatoid Arthritis Families, and Canada). The familiality of the clinical phenotypes was assessed. The affected relative pairs were subjected to autosomal multipoint affected relative-pair linkage analysis. Covariates were included in the linkage analysis to take account of heterogeneity within the sample. Evidence of familiality was observed with age at onset (p < 0.001) and rheumatoid factor (RF) IgM (p < 0.001), but not definite erosions (p = 0.21). Genome-wide significant evidence for linkage was observed on chromosome 6. Genome-wide suggestive evidence for linkage was observed on chromosomes 13 and 20 when conditioning on age at onset, chromosome 15 conditional on gender, and chromosome 19 conditional on RF IgM after allowing for multiple testing of covariates.
  • Segurado, R., Hamshere, M. L., Glaser, B., Nikolov, I., Moskvina, V., & Holmans, P. A. (2007). Combining linkage data sets for meta-analysis and mega-analysis: the GAW15 rheumatoid arthritis data set. BMC Proceedings, 1(Suppl 1): S104.

    Abstract

    We have used the genome-wide marker genotypes from Genetic Analysis Workshop 15 Problem 2 to explore joint evidence for genetic linkage to rheumatoid arthritis across several samples. The data consisted of four high-density genome scans on samples selected for rheumatoid arthritis. We cleaned the data, removed intermarker linkage disequilibrium, and assembled the samples onto a common genetic map using genome sequence positions as a reference for map interpolation. The individual studies were combined first at the genotype level (mega-analysis) prior to a multipoint linkage analysis on the combined sample, and second using the genome scan meta-analysis method after linkage analysis of each sample. The two approaches were compared, and give strong support to the HLA locus on chromosome 6 as a susceptibility locus. Other regions of interest include loci on chromosomes 11, 2, and 12.
  • Ziegler, A., DeStefano, A. L., König, I. R., Bardel, C., Brinza, D., Bull, S., Cai, Z., Glaser, B., Jiang, W., Lee, K. E., Li, C. X., Li, J., Li, X., Majoram, P., Meng, Y., Nicodemus, K. K., Platt, A., Schwarz, D. F., Shi, W., Shugart, Y. Y., Stassen, H. H., Sun, Y. V., Won, S., Wang, W., Wahba, G., Zagaar, U. A., & Zhao, Z. (2007). Data mining, neural nets, trees–problems 2 and 3 of Genetic Analysis Workshop 15. Genetic Epidemiology, 31(Suppl 1), S51-S60. doi:10.1002/gepi.20280.

    Abstract

    Genome-wide association studies using thousands to hundreds of thousands of single nucleotide polymorphism (SNP) markers and region-wide association studies using a dense panel of SNPs are already in use to identify disease susceptibility genes and to predict disease risk in individuals. Because these tasks become increasingly important, three different data sets were provided for the Genetic Analysis Workshop 15, thus allowing examination of various novel and existing data mining methods for both classification and identification of disease susceptibility genes, gene by gene or gene by environment interaction. The approach most often applied in this presentation group was random forests because of its simplicity, elegance, and robustness. It was used for prediction and for screening for interesting SNPs in a first step. The logistic tree with unbiased selection approach appeared to be an interesting alternative to efficiently select interesting SNPs. Machine learning, specifically ensemble methods, might be useful as pre-screening tools for large-scale association studies because they can be less prone to overfitting, can be less computer processor time intensive, can easily include pair-wise and higher-order interactions compared with standard statistical approaches and can also have a high capability for classification. However, improved implementations that are able to deal with hundreds of thousands of SNPs at a time are required.
