Functional Genomics

Research Overview

Our group studies the pathogenesis of neurological disorders, in particular Alzheimer’s disease and multiple sclerosis, by integrating large-scale multi-omics and medical data in cohorts of diverse ancestry. Our work spans a range of subjects at the interface of genetics, statistics & machine learning, bioinformatics, and neurological sciences.

Association studies of molecular quantitative trait loci in brain and central nervous system: Understanding the genetic alterations and regulation of the Central Dogma of Biology in tissue- and cell-type-specific contexts is crucial to the study of complex human diseases. One key effort in our group is to define and quantify the impact of genetic variations on a variety of molecular phenotypes in human brain and central nervous system (CNS). These genetic variations are collectively known as Molecular Quantitative Trait Loci, or xQTLs. Through collaboration with research groups across the country, we discover xQTLs in brain and CNS tissues as well as in single cells, across a variety of molecular phenotypes. These phenotypes include methylation, chromatin accessibility, histone acetylation, transcription factor binding activity, gene expression, alternative polyadenylation, microRNA, alternative splicing, proteomics, lipidomics and metabolomics. In analyzing these xQTLs, we establish computational protocols to facilitate data standardization and downstream integration of xQTL data from different sources. We also develop new statistical methods for the quantification and analysis of certain molecular phenotypes such as methylation and alternative polyadenylation.

Genetic and functional genomics data integration for neurodegenerative disorders: We leverage whole genome sequences, brain and CNS xQTLs as well as a breadth of relevant epigenomic annotation experimental assay data to aid in the discovery of genetic variations for neurodegenerative disorders, in particular Alzheimer’s disease and multiple sclerosis. We develop and apply improved computational methods in statistical modeling and machine learning for fine-mapping, colocalization, functional variant prediction, transcriptome-wide association, and gene-set and pathway discovery. Development of new approaches is largely motivated by the need to facilitate biological interpretation and to increase the capacity to cope with large-scale data applications in multi-omics analysis.

Integrative approaches for Alzheimer’s disease gene discovery in diverse cohorts: We seek to characterize the genetic heterogeneity of late-onset Alzheimer’s disease (LOAD) and the genetic comorbidity between LOAD and other complex traits in diverse cohorts. Using large longitudinal population cohorts as well as LOAD-specific cohorts for admixed populations such as Hispanics, we develop study designs and methods to dissect LOAD genetic risk factors shared among diseases and across ancestry backgrounds. Functional genomic data are integrated to aid in the discovery of rare and/or non-coding genetic variants (including copy number variations). We develop approaches to profile LOAD families and patients using genetic, functional genomic and pathology information at all levels to characterize and uncover heterogeneity within LOAD.

Development of functional genomics data and computational resources: One important mission of the group is to curate data resources and build reliable, user-friendly computational tools and bioinformatics pipelines to share with the community for studying similar problems in other tissue, cell and disease contexts. We also develop reproducible, open-source benchmarks for the evaluation of statistical methods, designed so that other research groups can share and extend them.

On the group wiki page you can find more details on the ongoing projects and links to the code repositories associated with these projects.

If you are interested in joining our team to work on any of these research directions, please contact us at: wang.gao@columbia.edu.