Statistical genetics papers, volume 1

This curated list of papers was compiled during 2018-2023 to cover key developments in statistical genetics, including conceptual frameworks, technical (statistical and computational) advances, and practical applications. Some selections reflect historical significance and others align with our lab’s research focus, but the collection as a whole aims to illustrate the fundamental approaches and reasoning that have shaped statistical genetics over the past 15 years. These papers serve as general educational references rather than as specific sources of statistical methodology or scientific discoveries.

The papers in this volume are covered as background knowledge during lab meetings between 2023 and 2025, mainly in Spring 2025.

Introductory

  • Bayesian statistical methods for genetic association studies. [NRG, 2009]
  • The role of regulatory variation in complex traits and disease [NRG, 2015]
  • Dissecting the genetics of complex traits using summary association statistics [NRG, 2017]
  • From genome-wide associations to candidate causal variants by statistical fine-mapping [NRG, 2018]
  • The personal and clinical utility of polygenic risk scores [NRG, 2018]
  • A brief history of human disease genetics [Nature, 2020]
  • Genetic and molecular architecture of complex traits [Cell, 2024]

Family-based linkage and association

  • Parametric and nonparametric linkage analysis: a unified multipoint approach. [AJHG, 1996]
  • The TDT and other family-based tests for linkage disequilibrium and association. [AJHG, 1996]
  • The family based association test method: strategies for studying general genotype-phenotype associations. [EJHG, 2001]

GWAS introduction

  • Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. [WTCCC paper, Nature, 2007]
  • Genetic mapping in human disease. [Science, 2008]
  • Genome-wide association studies for complex traits: consensus, uncertainty and challenges. [NRG, 2008]
  • How to interpret a genome-wide association study. [JAMA, 2008]
  • Five years of GWAS discovery. [AJHG, 2012]
  • 10 Years of GWAS Discovery: Biology, Function, and Translation. [AJHG, 2017]
  • Genome-wide association analyses of post-traumatic stress disorder and its symptom subdomains in the Million Veteran Program [Nature Genetics, 2020]
  • Genome-wide association studies. [Nature Reviews Methods Primers, 2021]
  • A saturated map of common genetic variants associated with human height [Nature, 2022]
  • Multi-ancestry Genome-wide Association Study of Varicose Veins [medRxiv, 2022]

Population-specific GWAS and Biobanks

  • Comparative genetic architectures of schizophrenia in East Asian and European populations [Nature Genetics, 2019]
  • Whole genome sequencing in the Middle Eastern Qatari population identifies genetic associations with 45 clinically relevant traits [Nature Communications, 2021]
  • FinnGen provides genetic insights from a well-phenotyped isolated population [Nature, 2023]
  • Mexican Biobank advances population and medical genomics of diverse ancestries [Nature, 2023]

Rare variants in complex traits

  • Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. [AJHG, 2008]
  • Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test. [AJHG, 2011]
  • Optimal tests for rare variant effects in sequencing association studies. [Biostatistics, 2012]
  • General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies. [AJHG, 2013]
  • Meta-analysis of gene-level tests for rare variant association. [Nature Genetics, 2014]
  • A genomic mutational constraint map using variation in 76,156 human genomes [Nature, 2023]

Statistical fine-mapping

  • Identifying Causal Variants at Loci with Multiple Signals of Association. [Genetics, 2014]
  • FINEMAP: efficient variable selection using summary data from genome-wide association studies. [Bioinformatics, 2016]
  • Molecular QTL discovery incorporating genomic annotations using Bayesian false discovery rate control. [AoAS, 2016]
  • A simple new approach to variable selection in regression, with application to genetic fine-mapping. [JRSS-B, 2020]
  • Fine-mapping from summary data with the “Sum of Single Effects” model. [PLoS Genetics, 2022]
  • A practical view of fine-mapping and gene prioritization in the post-genome-wide association era. [Open Biology, 2020]

Molecular QTLs and Multi-omics

  • A Multi-Omics Perspective of Quantitative Trait Loci in Precision Medicine. [Trends in Genetics, 2020]
  • Molecular Quantitative Trait Locus Mapping in Human Complex Diseases. [Current Protocols, 2020]
  • Genetic Control of Expression and Splicing in Developing Human Brain Informs Disease Mechanisms [Cell, 2019]
  • Plasma proteomic associations with genetics and health in the UK Biobank [Nature, 2023]
  • Isoform-level transcriptome-wide association uncovers genetic risk mechanisms for neuropsychiatric disorders in the human brain [Nature Genetics, 2023]

Functional genomics of complex traits

  • From GWAS to Function: Using Functional Genomics to Identify the Mechanisms Underlying Complex Diseases. [Frontiers in Genetics, 2020]
  • Allele-specific open chromatin in human iPSC neurons elucidates functional disease variants [Science, 2020]
  • Regulatory genomic circuitry of human disease loci by integrative epigenomics [Nature, 2020]
  • The GTEx Consortium atlas of genetic regulatory effects across human tissues [Science, 2020]
  • Widespread signatures of natural selection across human complex traits and functional genomic categories [Nature Communications, 2021]
  • Trans Effects on Gene Expression Can Drive Omnigenic Inheritance [Cell, 2019]

Post-GWAS methods and tools

  • A robust and efficient method for Mendelian randomization with hundreds of genetic variants [Nature Communications, 2019]
  • Tractor uses local ancestry to enable the inclusion of admixed individuals in GWAS and to boost power [Nature Genetics, 2021]
  • On the problem of inflation in transcriptome-wide association studies [bioRxiv, 2023]
  • Deep learning-based phenotype imputation on population-scale biobank data increases genetic discoveries [Nature Genetics, 2023]
  • Effective gene expression prediction from sequence by integrating long-range interactions [Nature Methods, 2021]

LD score regression and functional enrichment

  • LD Score regression distinguishes confounding from polygenicity in genome-wide association studies [Nature Genetics, 2015]
  • Partitioning heritability by functional annotation using genome-wide association summary statistics. [Nature Genetics, 2015]
  • Functionally-informed fine-mapping and polygenic localization of complex trait heritability [Nature Genetics, 2020]

Polygenic risk scores and clinical applications

  • Current clinical use of polygenic scores will risk exacerbating health disparities [Nature Genetics, 2019]
  • PGS-server: accuracy, robustness and transferability of polygenic score methods for biobank scale studies [Briefings in Bioinformatics, 2022]
  • Improving Polygenic Prediction in Ancestrally Diverse Populations [Nature Genetics, 2022]
  • Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts [Cell Genomics, 2023]
  • Polygenic scoring accuracy varies across the genetic ancestry continuum [Nature, 2023]

Drug discovery and disease mechanisms