Background

On February 7, 2012, an initiative to fight Alzheimer’s Disease (AD) was announced in response to the National Alzheimer’s Project Act (NAPA). The project, called the Alzheimer’s Disease Sequencing Project (ADSP), is sequencing and analyzing the genomes of a large number of well-characterized individuals in order to identify a broad range of AD risk and protective gene variants. The ultimate goal is to facilitate the identification of new pathways for therapeutic approaches and prevention. The analysis will also provide insight into why some individuals with known risk factor genes do not develop AD.

The overarching goals of the ADSP are to: (1) identify new genomic variants contributing to increased risk of developing Late-Onset Alzheimer’s Disease (LOAD), (2) identify new genomic variants contributing to protection against developing AD, (3) provide insight into why some individuals with known risk factor variants do not develop AD, and (4) examine these factors in multi-ethnic populations as applicable in order to identify new pathways for disease prevention. Such a study of human genomic variation and its relationship to health and disease requires examination of a large number of study participants and needs to capture information about common and rare variants (both single nucleotide and copy number) in well-phenotyped individuals. The ADSP conducts and facilitates analysis of sequence data to extend previous discoveries that may ultimately result in new directions for AD therapeutics. Data are being made available to the scientific community through NIH-approved data repositories, including the database of Genotypes and Phenotypes (dbGaP) and the NIA Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS). Investigators outside the ADSP are encouraged to access and analyze these data.

From 2012 through 2017, the Large Scale Sequencing and Analysis Centers (LSACs) funded by the National Human Genome Research Institute (NHGRI) generated whole genome and whole exome sequence data for the first part of the study. The participating centers were the Baylor College of Medicine Human Genome Sequencing Center, the Broad Institute, the McDonnell Genome Institute at Washington University, and the New York Genome Center. In 2018, The American Genome Center (TAGC) at the Department of Defense-funded Uniformed Services University of the Health Sciences (USUHS) began participating in the project.

The ADSP research plan includes:

The ADSP Discovery Phase

The initial phase of the ADSP research plan is called the Discovery Phase. Samples were selected from well-characterized study cohorts of individuals with or without an AD diagnosis and with or without known risk factor genes. The ADSP generated three sets of genome sequence data for these samples as part of the Discovery Phase: (1) Whole Genome Sequence (WGS) for 584 samples from 113 multiplex families (two or more affected per family), (2) Whole Exome Sequence (WES) for 5,096 AD cases and 4,965 controls, and (3) WES of an Enriched sample set comprising 853 AD cases from multiply affected families and 171 Hispanic controls. The Case-Control and Enriched Case Study spans 24 cohorts provided by the Alzheimer’s Disease Genetics Consortium (ADGC) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.

As part of the Discovery Phase, the NIA ADSP genetics investigators funded under PAR-12-183 and the NHGRI-funded LSACs analyzed the sequence data, including quality assessments and variant calling. Analysis of the Discovery Phase sequence data is anticipated to identify many new variants in the genome that may be implicated as genetic risk or protective factors in older adults at risk for AD.

Because the initial analysis of WGS data in subjects from families multiply affected with AD revealed intergenic and intronic variants, which exome sequencing largely misses, the external consultants to the ADSP recommended in February 2016 that all further sequencing for the project be of whole genomes.

The fully quality control checked (QC’d) data for the Discovery Phase study, called on Genome Reference Consortium Human Build 37 (GRCh37), were released in March 2016 through dbGaP. Discovery Phase data called on Genome Reference Consortium Human Build 38 (GRCh38) are being shared through NIAGADS. Applicants for sequence data can obtain: (1) cleaned, quality control checked sequence data, (2) information on the composition of the study cohorts (e.g., case-control, family-based, and epidemiology cohorts), (3) descriptions of the study cohorts included in the analysis, (4) accompanying phenotypic information such as age at disease onset, gender, diagnostic status, and cognitive measures, and (5) epidemiological information such as educational level and certain demographic data available on the subjects genotyped.

The ADSP Discovery Extension Phase

The ADSP Discovery Family-Based Extension Study:

To further assess the genomes in multiply affected families, an additional 427 samples were whole genome sequenced under funding provided by NHGRI. These comprised 107 additional samples from families studied under the Discovery Phase, 175 samples from 47 new families, and 145 Hispanic controls. This portion of the study is called the Discovery Extension Phase. The Family-Based Study spans seven cohorts provided by the Alzheimer’s Disease Genetics Consortium (ADGC) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.

The ADSP Discovery Case-Control Based Extension Study:

Under funding provided by NHGRI, an additional 3,000 subjects were whole genome sequenced: 1,466 cases and 1,534 controls. Of these, 1,000 each were of Non-Hispanic White (NHW), Caribbean Hispanic (CH), and African American (AA) descent. A total of 739 of the samples were autopsy samples [568 cases (500 NHW and 68 AA) and 171 controls (164 NHW and 7 AA)]. The Case-Control and Enriched Case Study spans 24 cohorts provided by the Alzheimer’s Disease Genetics Consortium (ADGC) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.
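The sample counts quoted for the two Discovery Extension studies can be cross-checked with a short tally. The following is only an illustrative sketch: the dictionary layout and the `total` helper are assumptions made for this example and are not part of any ADSP or NIAGADS tool; the numbers themselves are taken from the text above.

```python
# Sample counts as quoted in the Discovery Extension Phase description.
# Names and structure are illustrative only (not an ADSP/NIAGADS API).
family_based = {
    "additional samples from Discovery Phase families": 107,
    "samples from 47 new families": 175,
    "Hispanic controls": 145,
}
case_control = {"cases": 1466, "controls": 1534}
autopsy = {
    "cases": {"NHW": 500, "AA": 68},
    "controls": {"NHW": 164, "AA": 7},
}


def total(counts):
    """Sum a dict of counts, recursing into nested dicts."""
    return sum(total(v) if isinstance(v, dict) else v for v in counts.values())


family_total = total(family_based)
case_control_total = total(case_control)
autopsy_total = total(autopsy)

print("Family-based extension samples:", family_total)
print("Case-control extension subjects:", case_control_total)
print("Autopsy samples:", autopsy_total)
```

The three totals agree with the figures stated in the text (427 family-based samples, 3,000 case-control subjects, and 739 autopsy samples).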

The ADSP Follow-up Study (FUS)

A majority of the samples from the ADSP Discovery and Discovery Extension phases are of non-Hispanic white (NHW) origin, so adding ethnically diverse samples to the study is critical to identifying both shared and novel genetic risk factors for AD across populations. Collection and sequencing of ethnically diverse cohorts is emphasized in the ADSP FUS: existing cohorts with unrelated AD cases that encompass the richest possible ethnic diversity are given the highest priority for inclusion.

To fulfill the goals of the ADSP FUS, eight existing cohorts of elderly individuals of African-American (AA) and pan-Hispanic (pan-HI) ancestry with a total of 13,745 samples (2,456 AA AD cases, 4,126 AA controls, 2,588 Hispanic AD cases, and 4,475 Hispanic controls) are being whole genome sequenced at USUHS TAGC in coordination with existing NIH-funded AD infrastructure, including the National Cell Repository for Alzheimer’s Disease (NCRAD), NIAGADS, and the Genome Center for Alzheimer’s Disease (GCAD). In addition, 1,500 NHW autopsy cases and 1,500 controls are being sequenced to enlarge the underpowered NHW WGS sample.

The ADSP Augmentation Phase

The ADSP Augmentation Phase encompasses sequencing done under private and NIH funding by investigators who are not members of the ADSP. The investigators for these studies have agreed to share their GWAS, WGS, and WES data with the ADSP. Private funding has been provided by industry and anonymous donors. Under the NIA AD Genetics Sharing Policy and the NIAGADS Data Distribution Agreement, individual NIA-funded investigators studying the genetics and genomics of AD provide their data to NIAGADS, and in turn these data will be shared with the ADSP. These data will be made publicly available as soon as they are fully QC’d and harmonized with ADSP data.

For more information about the ADSP, see the study description on the ADSP website.

Funding Sources for ADSP Discovery and Discovery Extension Data Analysis: PAR-12-183
  • UF1 AG047133. Consortium for Alzheimer’s Sequence Analysis (CASA). University of Pennsylvania, Philadelphia, PA; Columbia University, New York, NY; University of Miami, Miami, FL; Case Western Reserve University, Cleveland, OH; Boston University, Boston, MA.
  • U01 AG049505. CHARGE: Identifying Risk & Protective SNV for AD in ADSP Case-control Sample. Boston University.
  • U01 AG049506. Sequence-based Discovery of AD Risk & Protective Alleles. Baylor College of Medicine, Houston, TX.
  • U01 AG049507. Sequence-based Discovery of AD Risk & Protective Alleles. University of Washington, Seattle, WA.
  • U01 AG049508. Modifier Genes that Influence Age at Onset or Protect Against Development of Alzheimer’s Disease (AD). Icahn School of Medicine at Mount Sinai.
Funding Sources for Infrastructure
  • U24 AG041689. The National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA.
  • U01 AG032984. Alzheimer’s Disease Genetics Consortium, University of Pennsylvania, Philadelphia, PA, USA.
  • R01 AG033193. Cohorts for Heart and Aging Research in Genomic Epidemiology, Boston University, Boston, MA, USA.
  • U24 AG021886. National Cell Repository for Alzheimer’s Disease, Indiana University, Bloomington, IN, USA.
  • U24 AG056270. The National Institute on Aging (NIA) Late Onset of Alzheimer’s Disease (LOAD) Family-Based Study (FBS), Columbia University, New York, NY; Indiana University, Indianapolis, IN; Icahn School of Medicine at Mount Sinai.
  • The Alzheimer’s Disease Centers provided samples for the ADSP. They are funded under a number of grants and cooperative agreements.
Funding Sources for Sequencing: ADSP Discovery Phase
  • U54HG003079. National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  • U54HG003273. National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  • U54HG003076. National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Funding Sources for Sequencing: ADSP Follow-Up Study PAR-16-406
  • U01 AG057659. Whole Genome Sequencing in Ethnically Diverse Cohorts for the ADSP Follow-Up Study (FUS). University of Miami, Miami, FL; Columbia University, New York, NY. Sequencing was funded as a subcontract to the Uniformed Services University of the Health Sciences (USUHS), The American Genome Center (TAGC).

Acknowledgment statement for any data distributed by NIAGADS:

Data for this study were prepared, archived, and distributed by the National Institute on Aging Alzheimer’s Disease Data Storage Site (NIAGADS) at the University of Pennsylvania (U24-AG041689), funded by the National Institute on Aging.

For investigators using ADSP data:

Please reference the use of NIAGADS data by including the accession NG00067 as well as including the ADSP acknowledgment statement in any publications.

  1. Nafikov RA. Analysis of pedigree data in populations with multiple ancestries: Strategies for dealing with admixture in Caribbean Hispanic families from the ADSP. Genet Epidemiol. 2018 Jun. doi: 10.1002/gepi.22133.
  2. Naj AC. Quality control and integration of genotypes from two calling pipelines for whole genome sequence data in the Alzheimer’s disease sequencing project. Genomics. 2018 May. pii: S0888-7543(18)30281-7.
  3. Raghavan S. Whole-exome sequencing in 20,197 persons for rare variants in Alzheimer’s disease. Ann Clin Transl Neurol. 2018 Apr; 5(7): 832-842.
  4. Butkiewicz M. Functional annotation of genomic variants in studies of late-onset Alzheimer’s disease. Bioinformatics. 2018 Mar. doi: 10.1093/bioinformatics/bty177.
  5. Vardarajan BN. Whole genome sequencing of Caribbean Hispanic families with late-onset Alzheimer’s disease. Ann Clin Transl Neurol. 2018 Mar; 5(4): 406-417.
  6. Blue EE. Genetic variation in genes underlying diverse dementias may explain a small proportion of cases in the Alzheimer’s Disease Sequencing Project. Dement Geriatr Cogn Disord. 2018 Feb; 45(1-2): 1-17.
  7. Crane PK. Alzheimer’s Disease Sequencing Project discovery and replication criteria for cases and controls: Data from a community-based prospective cohort study with autopsy follow-up. Alzheimers Dement. 2017 Oct; 13(10): 1410-1413.
  8. Beecham GW. The Alzheimer’s Disease Sequencing Project: Study design and sample selection. Neurol Genet. 2017 Oct; 3(5): e194.
  9. Hollingworth P. Common variants at ABCA7, MS4A6A/MS4A4E, EPHA1, CD33 and CD2AP are associated with Alzheimer’s disease. Nat Genet. 2011 May; 43(5): 429-35.
  10. Naj AC. Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nat Genet. 2011 May; 43(5): 436-41.
  11. Seshadri S. Genome-wide analysis of genetic loci associated with Alzheimer disease. JAMA. 2010 May 12; 303(18): 1832-40.
  12. Harold D. Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer’s disease. Nat Genet. 2009 Oct; 41(10): 1088-93.
  13. Lambert JC. Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nat Genet. 2009 Oct; 41(10): 1094-9.