Background

An initiative in response to the National Alzheimer’s Project Act (NAPA) has been working towards new biological insights and cures for Alzheimer’s Disease (AD) since its introduction by NIH on February 7, 2012. Aptly named the Alzheimer’s Disease Sequencing Project (ADSP), the project is sequencing and analyzing the genomes of a large number of well-characterized individuals in order to identify a broad range of AD risk and protective gene variants. The ultimate goal is to facilitate the identification of new pathways for therapeutic approaches and prevention. The analysis will also provide insight as to why individuals with known risk factor genes escape from developing AD.

The overarching goals of the ADSP are to: (1) identify new genomic variants contributing to increased risk of developing Late-Onset Alzheimer’s Disease (LOAD), (2) identify new genomic variants contributing to protection against developing Alzheimer’s Disease (AD) and Related Dementias (RD), (3) provide insight as to why individuals with known risk factor variants escape from developing AD, (4) examine these factors to identify new genetically driven pathways leading to potential therapeutic approaches to disease prevention, and (5) examine these factors in multi-ethnic populations. Such a study of human genomic variation and its relationship to health and disease requires examination of a large number of study participants and needs to capture information about common and rare variants (both single nucleotide and copy number) together with high quality, rich phenotypes such as neuropathology, cognitive and neurological/neuropsychiatric assessments, imaging, and known and potential comorbidity and lifestyle risk factors. The ADSP conducts and facilitates analysis of sequence data to extend previous discoveries that may ultimately result in new directions for AD therapeutics. Data are being made available to the scientific community through the NIA Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS), an NIA-designated qualified access repository for AD and related dementia human genetics and genomics data. Investigators who are outside of the ADSP are encouraged to access and analyze these data.

From 2012 through 2017, the Large Scale Sequencing and Analysis Centers (LSACs) funded by the National Human Genome Research Institute (NHGRI), namely the Baylor College of Medicine Human Genome Sequencing Center, the Broad Institute, the McDonnell Genome Institute at Washington University, and the New York Genome Center, participated in generating whole genome and whole exome sequence data for the first part of the study. In 2018, The American Genome Center (TAGC) at the Department of Defense-funded Uniformed Services University of the Health Sciences (USUHS) began participating in the project.

The ADSP has moved through several phases during its early evolution. It uses several study approaches, including case-control, epidemiology, and family-based studies.

The ADSP Discovery Phase

The initial phase of the ADSP research plan is called the Discovery Phase. Samples were selected from well-characterized study cohorts of individuals with or without an AD diagnosis and with or without known risk factor genes. The ADSP generated three sets of genome sequence data for these samples as part of the Discovery Phase: (1) Whole Genome Sequence (WGS) for 584 samples from 113 multiplex families (two or more affected per family), (2) Whole Exome Sequence (WES) for 5,096 AD cases and 4,965 controls, and (3) WES of an Enriched sample set composed of 853 AD cases from multiply affected families and 171 Hispanic controls. The Case-Control and Enriched Case Study spans 24 cohorts provided by the Alzheimer’s Disease Genetics Consortium (ADGC) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.

As part of the Discovery Phase, the NIA ADSP genetics investigators funded under PAR-12-183 and the NHGRI-funded Large Scale Sequencing and Analysis Centers (LSACs) conducted analysis of sequence data, including quality assessments and variant calling. Analysis of the Discovery Phase sequence data is anticipated to identify many new variations in the genome that may be implicated as new genetic risk or protective factors in older adults at risk for AD.

Because the initial analysis of WGS data in subjects from families multiply affected with AD revealed the occurrence of variations in the genome that were intergenic and intronic, in February of 2016 the external consultants to the ADSP recommended that further sequencing for the project should be of whole genomes.

The fully quality control checked (QC’d) data for the Discovery Phase study using Genome Reference Consortium Human Build 37 (GRCh37) were released in March of 2016 through the database of Genotypes and Phenotypes (dbGaP). Discovery Phase data called on Genome Reference Consortium Human Build 38 (GRCh38) are being shared through NIAGADS. Applicants for sequence data can obtain: (1) cleaned, quality control checked sequence data, (2) information on the composition of the study cohorts (e.g. case-control, family-based, and epidemiology cohorts), (3) descriptions of the study cohorts included in the analysis, (4) accompanying phenotypic information such as age at disease onset, gender, diagnostic status, and cognitive measures, and (5) epidemiological information such as educational level and certain demographic data available on the subjects genotyped.

The ADSP Discovery Extension Phase

The ADSP Discovery Family-Based Extension Study:

To further assess the genomes in multiply affected families, under funding provided by NHGRI, an additional 427 samples were whole genome sequenced. This included 107 additional samples from families studied under the Discovery Phase, 175 samples from 47 new families, and 145 Hispanic controls. This portion of the study is called the Discovery Extension Phase. The Family-Based Study spans seven cohorts provided by the Alzheimer’s Disease Genetics Consortium (ADGC) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.

The ADSP Discovery Case-Control Based Extension Study:

Under funding provided by NHGRI, an additional 3,000 subjects were whole genome sequenced. This included 1,466 cases and 1,534 controls. Of these, 1,000 participants each of Non-Hispanic White (NHW), Caribbean Hispanic (CH), and African American (AA) descent were sequenced. A total of 739 of the sequenced samples were autopsy samples [568 cases (500 NHW cases and 68 AA cases) and 171 controls (164 NHW and 7 AA)]. The Case-Control and Enriched Case Study spans 24 cohorts provided by the Alzheimer’s Disease Genetics Consortium (ADGC) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.
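The sample counts quoted for the Discovery Extension Phase can be cross-checked with simple arithmetic. The short Python sketch below does only that: it confirms that each reported total equals the sum of its stated parts. The function name is illustrative and not part of any ADSP tooling; the numbers are copied from the text above.

```python
def breakdown_matches(reported_total, parts):
    """Return True when the stated parts add up to the reported total."""
    return sum(parts) == reported_total


# Family-Based Extension Study: 427 additional whole genome sequenced samples.
family_ok = breakdown_matches(427, [107, 175, 145])

# Case-Control Based Extension Study: 3,000 subjects.
case_control_ok = breakdown_matches(3000, [1466, 1534])   # cases + controls
ancestry_ok = breakdown_matches(3000, [1000, 1000, 1000])  # NHW + CH + AA

# Autopsy samples within the case-control extension.
autopsy_cases_ok = breakdown_matches(568, [500, 68])       # NHW + AA cases
autopsy_controls_ok = breakdown_matches(171, [164, 7])     # NHW + AA controls
autopsy_total_ok = breakdown_matches(739, [568, 171])      # cases + controls

all_consistent = all([
    family_ok, case_control_ok, ancestry_ok,
    autopsy_cases_ok, autopsy_controls_ok, autopsy_total_ok,
])
print("All quoted breakdowns are internally consistent:", all_consistent)
```

Every breakdown quoted in this section adds up to its reported total.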

The ADSP Follow-up Study (FUS)

The ADSP Discovery Phase identified a number of variations in the genomes of individuals affected with AD. These findings are being pursued in the ADSP Follow-Up Study (FUS), funded solely by NIA. The long-term goals of the ADSP FUS are to:

  • Move the field closer to enabling prediction of who will develop AD
  • Fully reveal the genetic architecture of AD in multiple ethnic groups
  • Better understand the underpinnings of AD pathogenesis
  • Aid the quest for therapeutic targets
  • Examine the AD genome in diverse populations

The ADSP Discovery Phase and the ADSP FUS are described under PAR-16-406. The ADSP FUS is leveraging existing infrastructure and collaborations to ensure continuity of ADSP participation. It provides funds for acquisition, archiving, sequencing, quality control, genome-wide association studies (GWAS), and data sharing of the large number of samples from individuals affected by AD for WGS, as appropriate. Racial/ethnic diversity is an ongoing high priority for the NIA and the ADSP. Well-phenotyped participants were selected with an emphasis on autopsy-confirmed and ethnically diverse cases/controls and availability of longitudinal data. Funds are being provided for both sequencing and data analysis. This effort is pursuing rare variants as comprehensively as possible, including consideration of statistical power, and exploration of a range of different populations, including those that are currently underrepresented in sequencing studies.

The majority of the samples from the ADSP Discovery and Discovery Extension phases were non-Hispanic white in origin, making the addition of ethnically diverse samples to the study critical to identification of both shared and novel genetic risk factors for AD among populations. Collection and sequencing of ethnically diverse cohorts is emphasized in the ADSP FUS, the goal being that additional existing cohorts with unrelated AD cases that encompass the richest possible ethnic diversity be given the highest priority for inclusion. For the United States this includes augmenting African American, Hispanic, and Asian cohorts.

Variants occur at different frequencies in different populations and certain risk variants may be much easier to detect in some populations. ADSP studies in ethnic groups including African American, Hispanic, and Asian remain statistically underpowered, so the genetics of these populations remain largely unstudied. Therefore, a major effort is being undertaken to augment the numbers of cases and controls in ethnically diverse populations in the United States. In order to understand the underlying substructure of the diverse populations, global studies are a key component of this effort.

To fulfill the goals of this ADSP FUS, cohorts of primarily African Ancestry with a total of 8,863 participants, Hispanic/Latino and Amerindian Ancestry with 9,754 participants, Asian Ancestry with 7,000 participants, and European Ancestry with 13,613 participants were whole genome sequenced at The American Genome Center (TAGC) at the Uniformed Services University of the Health Sciences (USUHS) and the Center for Genome Technology at the John P. Hussman Institute for Human Genomics (HIHG CGT), in coordination with existing NIH-funded AD infrastructure including the National Cell Repository for Alzheimer’s Disease (NCRAD), NIAGADS, and the Genome Center for Alzheimer’s Disease (GCAD). Also included with the European Ancestry cohorts are sequenced Brain Autopsy participants, with 1,058 cases and 165 controls. Cohort collection, phenotypic characterization, and whole genome sequencing were funded by the NIA. This and additional information about the ADSP can be found on the National Institute on Aging site.
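To put the cohort sizes above side by side, the following Python sketch tallies the four ancestry groups and computes each group's share of the combined total. The counts are taken from the paragraph above; the variable names are illustrative. The sketch does not add the Brain Autopsy participants separately, since the text describes them as included with the European Ancestry cohorts and does not state whether they are part of the 13,613 figure.

```python
# Whole genome sequenced participants per ancestry group, as quoted in the text.
FUS_WGS_PARTICIPANTS = {
    "African": 8_863,
    "Hispanic/Latino and Amerindian": 9_754,
    "Asian": 7_000,
    "European": 13_613,
}

# Combined total across the four groups.
fus_total = sum(FUS_WGS_PARTICIPANTS.values())

# Fraction of the combined total contributed by each group.
fus_shares = {
    group: count / fus_total
    for group, count in FUS_WGS_PARTICIPANTS.items()
}

print(f"Total participants: {fus_total:,}")
for group, share in fus_shares.items():
    print(f"{group}: {share:.1%}")
```

The four groups sum to 39,230 participants, with the European Ancestry cohorts forming the largest single group.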

The global effort brings important population sectors that were not previously well represented into the ADSP. Studies in the initial phase of the FUS have been supported by PAR-19-234 and PAR-17-214. The sequencing and analysis done under those funding opportunity announcements (FOAs) have increased the numbers of participants and the volume of data. Data generated under this part of the ADSP will require novel methods to perform in-depth and subgroup analyses of diverse ethnic backgrounds, as well as integrated analyses to completely unravel the architecture of the AD genome. The ADSP FUS set the stage for the next wave of ADSP studies, called the ADSP Follow-Up Study 2.0. The reach for this effort is global and includes Central and South America, Africa (9 countries), Asia (India and Korea), and Australia, with additional efforts being planned.

The ADSP Follow-up Study (FUS) 2.0

The ADSP Follow-Up Study 2.0: The Diverse Population Initiative (PAR-21-212) was launched in 2021 to expand the sample set in ADSP to represent more diverse populations. The long-term goals of the ADSP FUS 2.0 are to:

  1. move the field closer to enabling prediction of who will develop AD;
  2. fully characterize AD subtypes by studying endophenotypes in diverse populations;
  3. better understand the differences in the genetic underpinnings of AD pathogenesis among diverse populations; and
  4. identify specific therapeutic targets based upon diverse populations.

Numbers of Hispanic/Latino and Black/African American participants in the US remain insufficient to provide the statistical power needed to identify rare or very rare variants. Variants in the Alzheimer’s genome are largely rare or very rare in the population. For single variant testing of rare variants, it is estimated that 80% certainty requires ~16,100 cases and ~16,100 controls for a variant with a minor allele frequency of 0.5% in the population, and that 90% certainty requires ~18,500 cases and ~18,500 controls in each population for a variant with a minor allele frequency of 1%. To ensure that there are sufficient numbers of study participants to achieve statistical power for analysis of rare or very rare variants in the AD/ADRD genome of the three largest diversity cohorts given the available funding, the primary focus of the ADSP FUS 2.0 is on Hispanic/Latino, Black/African American, and Asian populations. Consortia are leveraging cohorts already recruited or in planning for recruitment to obtain sufficient numbers; sharing diversity data across consortia is essential to the success of this effort. Genetic samples and phenotypic data that are analyzed by the ADSP are provided by several consortia, initiatives, centers, and studies.
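To illustrate why rare variants demand such large cohorts, the Python sketch below applies a standard textbook approximation for the sample size of a two-sided, two-proportion test on allele frequencies. It is not the method behind the ADSP estimates quoted above: the text does not state the effect size or significance threshold used, so the odds ratio (1.5) and genome-wide threshold (5e-8) in the example are assumptions chosen only for illustration, and the output should not be expected to reproduce the ~16,100 or ~18,500 figures. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def case_allele_freq(control_maf, odds_ratio):
    """Allele frequency in cases implied by the control MAF and an allelic odds ratio."""
    odds = odds_ratio * control_maf / (1 - control_maf)
    return odds / (1 + odds)


def individuals_per_group(control_maf, odds_ratio, power, alpha):
    """Approximate cases (and, equally, controls) needed for a single-variant allelic test.

    Uses the normal-approximation sample size formula for comparing two
    proportions, counting two alleles per diploid individual.
    """
    p0 = control_maf
    p1 = case_allele_freq(control_maf, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # target power
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    return ceil(alleles_per_group / 2)


# Assumed, illustrative inputs (NOT taken from the ADSP documentation).
ASSUMED_ODDS_RATIO = 1.5
ASSUMED_ALPHA = 5e-8

for maf, power in [(0.005, 0.80), (0.01, 0.90)]:
    n = individuals_per_group(maf, ASSUMED_ODDS_RATIO, power, ASSUMED_ALPHA)
    print(f"MAF {maf:.1%}, power {power:.0%}: about {n:,} cases and {n:,} controls")
```

Under this approximation, the required cohort grows as the minor allele frequency falls, as the desired power rises, or as the assumed effect size shrinks, which is the general pattern behind the emphasis on pooling cohorts across consortia.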

Some Additional ADSP Collaborations

New ADSP Initiatives

Functional Genomics Consortium. In July 2021, NIA awarded six U01 projects responding to the ADSP Functional Genomics Initiative RFA-AG-21-006. These awards comprise the core projects of the ADSP Functional Genomics Consortium. The Consortium will use a multipronged, team-science strategy and apply high-throughput, genome-wide approaches to discover and validate the functional roles and mechanisms of action of genes and variants underlying AD/ADRD.

ADSP Phenotype Harmonization Consortium (ADSP-PHC). The ADSP-PHC was formed in response to the NIA announcement of Harmonization of Alzheimer’s Disease and Related Dementias (AD/ADRD) Genetic, Epidemiologic, and Clinical Data to Enhance Therapeutic Target Discovery (PAR-20-099). The goal of the ADSP-PHC is to facilitate and perform phenotypic data harmonization for participants with ADSP genetic and genomic data, which in turn requires bringing together experts in harmonization of relevant phenotypes. Endophenotypes to be harmonized include cognitive data, imaging, longitudinal clinical data, neuropathological data, cardiovascular risk data, and biomarkers. The harmonized phenotypic data will become a “legacy” dataset and will be perpetually curated and shared through a central data repository.

Machine Learning and Artificial Intelligence Consortium. In order to utilize the vast amount of data generated by the ADSP and other NIA funded initiatives, the NIA issued Cognitive Systems Analysis of Alzheimer’s Disease Genetic and Phenotypic Data (PAR-19-269) to apply cognitive systems approaches to the analysis of AD genetic and related data. Analysis of the data generated and harmonized by the ADSP will help to identify new genes and genetic pathways that will reveal risk and protective factors for AD and guide the field toward novel therapeutic approaches to the disease.

Detailed information about these three new NIA initiatives is located on the NIA website.

Last updated 04/13/22