NG00129 - NCRAD Families WES

To access this data, please log into DSS and submit an application.
Within the application, add this dataset (accession NG00129) in the “Choose a Dataset” section.
Once approved, you will be able to log in and access the data within the DARM portal.

Description

This ADSP release contains 170 whole exomes and includes (1) sequencing read alignments in CRAM (compressed BAM) format for the 170 newly sequenced samples and (2) genomic Variant Call Format (gVCF) files generated by GATK 4.1.1 for those samples.
The project-level VCF (pVCF) released here is a preview, provided ahead of the formal ADSP quality-controlled release expected in a few months. Checks of the dataset are ongoing, and the released files may change in the full quality-controlled release.
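For readers unfamiliar with the gVCF layout mentioned above, the following is a minimal sketch of how a single gVCF data line might be interpreted. It assumes the conventional GATK layout, in which reference blocks carry only the symbolic `<NON_REF>` allele and an `END` key in the INFO column. The `GvcfRecord` class, the `parse_gvcf_line` helper, and both example records are hypothetical illustrations and are not taken from this dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GvcfRecord:
    chrom: str
    start: int                 # 1-based position of the first base
    end: int                   # 1-based inclusive end of the record
    ref: str
    alts: List[str]
    is_reference_block: bool   # True when the only ALT is <NON_REF>


def parse_gvcf_line(line: str) -> Optional[GvcfRecord]:
    """Parse one tab-separated gVCF data line; return None for headers."""
    if line.startswith("#") or not line.strip():
        return None
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
    start = int(pos)

    # Collect key=value pairs from the INFO column (flags are ignored).
    info_map = {}
    for entry in info.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info_map[key] = value

    alts = [] if alt == "." else alt.split(",")
    is_block = alts == ["<NON_REF>"]
    # Reference blocks give their extent in END; otherwise use the REF length.
    end = int(info_map["END"]) if "END" in info_map else start + len(ref) - 1
    return GvcfRecord(chrom, start, end, ref, alts, is_block)


# Hypothetical example records (illustrative values only).
block_line = "chr1\t10001\t.\tT\t<NON_REF>\t.\t.\tEND=10050\tGT:DP:GQ:MIN_DP:PL\t0/0:30:81:28:0,81,1215"
variant_line = "chr1\t10051\t.\tA\tG,<NON_REF>\t512.6\t.\tDP=32\tGT:AD:DP:GQ:PL\t0/1:15,17,0:32:99:541,0,470,586,521,1107"

block = parse_gvcf_line(block_line)
variant = parse_gvcf_line(variant_line)
```

In practice, a dedicated library such as pysam or bcftools is likely a better choice for real files; the sketch only shows how reference blocks differ from variant sites.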

Sample Summary per Data Type

Sample Set	Accession	Data Type	Number of Samples
NCRAD Families WES	snd10039	WES	170

Available Filesets

Fileset	Accession	Latest Release	Description
NCRAD Families WES CRAMs and GATK gVCFs	fsa000037	NG00129.v1	WES CRAMs and GATK gVCFs. Sequencing Data Quality Control Metrics
NCRAD Families Phenotype and Manifest Files	fsa000038	NG00129.v1	Phenotypes for sequenced subjects and connecting family members
NCRAD Families WES Project Level VCF	fsa000039	NG00129.v1	Preview pVCF

View the File Manifest for a full list of files released in this dataset.

This dataset includes WES data on 170 sequenced subjects. Phenotype data also includes connecting family data for the sequenced subjects.

Sample Set	Accession Number	Number of Subjects	Number of Samples
NCRAD Families WES	snd10039	170	170

Consent Level	Number of Subjects
GRU-IRB-PUB	170

Total number of approved DARs: 8

Investigator:
Belloy, Michael
Institution:
Washington University in St Louis
Project Title:
Elucidating sex-specific risk for Alzheimer's disease through state-of-the-art genetics and multi-omics
Date of Approval:
January 6, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
• Objectives: In this project, we seek to holistically investigate the genetic and molecular drivers of sex dimorphism in Alzheimer’s disease across ancestries.
• Study design: This study integrates large-scale population genetics with multi-omics and endophenotype analyses. We are integrating all data available from ADGC and ADSP, together with other data from AMP-AD and biobanks such as UKB, FinnGen, and MVP to conduct large-scale multi-ancestry GWAS, rare-variant gene aggregation analyses, QTL studies, PWAS, TWAS, etc. We also particularly focus on X chromosome association studies. The study design also interrogates interactions with ancestry, hormone exposures, and with APOE*4, as well as comparisons to non-stratified GWAS/XWAS of Alzheimer’s disease. Further, we will also employ genetic correlation analyses, mendelian randomization, colocalization, and pleiotropy analyses, to interrogate overlap with other complex traits to better understand the mechanisms underlying sex dimorphism in Alzheimer’s disease.
• Analysis plan, including the phenotypic characteristics that will be evaluated in association with genetic variants: Our phenotypes will include Alzheimer’s disease risk, conversion risk, various endophenotypes (including amyloid/tau biomarkers, brain imaging metrics, etc.) as well as molecular traits. As noted above, we will conduct large-scale multi-ancestry GWAS, XWAS, rare-variant gene aggregation analyses, QTL studies, PWAS, TWAS, etc. Specific aims include interrogating these questions and analyses on (1) the autosomes, (2) the X chromosome, and (3) leveraging sex-stratified QTL studies to drive discovery of risk genes.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) manifests itself differently across men and women, but the genetic and molecular factors that drive this remain elusive. AD is the most common cause of dementia and to date remains largely untreatable. It is thus crucial to study the genetics of AD in a sex-specific manner, as this will help the field gain important insights into disease pathophysiology, identify novel sex-specific risk factors relevant to personalized genetic medicine, and uncover potential new AD drug targets that may benefit both sexes. This project uses large-scale genomics and multi-omics to elucidate novel sex-agnostic and sex-specific AD risk genes. We will interrogate sex dimorphism for AD risk on the autosomes and the sex chromosomes. We similarly interrogate sex dimorphism in the genetic regulation of gene expression and protein levels, which we will integrate with genetic risk for Alzheimer’s disease to further discover risk genes. Throughout, we will also interrogate how sex-specific risk for AD interacts with hormone exposures, ancestry, and the APOE*4 risk allele.
Investigator:
Cruchaga, Carlos
Institution:
Washington University School of Medicine
Project Title:
The Familial Alzheimer Sequencing (FASe) Project
Date of Approval:
March 18, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The goal of this study is to identify new genes and mutations that cause or increase risk for Alzheimer disease (AD), as well as protective factors. Individuals and families were selected from the Knight-ADRC (Washington University) and the NIA-LOAD study. Only families with at least three first-degree affected individuals were included. Families with pathogenic variants in the known AD or FTD genes, or in which APOE4 segregated with disease, were excluded. At least two cases and one control were selected per family. Cases had an age at onset (AAO) after 65 years, and controls had a later age at last assessment than the latest AAO within the family. Whole exome sequencing (WES) and whole genome sequencing (WGS) data were generated for 1,235 individuals (285 families) that, together with data from our collaborators and the ADSP family-based cohort (3,449 individuals and 757 families), will provide enough statistical power to identify new genes for AD. Dr. Tanzi (Harvard Medical School) will provide WGS from 400 families from the NIMH Alzheimer disease genetics initiative study. We will perform single variant and gene-based analyses to identify genes and variants that increase risk for disease in AD families. Single variant analysis will consist of a combination of association and segregation analyses. We will run family-based gene-based methods to identify genes that show an overall enrichment of variants in AD cases. We will also look for protective and modifier variants. To do this we will identify families loaded with AD cases that also include individuals with a high burden of known risk variants but who do not develop the disease (escapees). We will use the sequence data and the family structure to identify variants that segregate with the escapee phenotype. The most promising variants and genes will be replicated in independent datasets (ADSP case-control, ADNI, Knight-ADRC, NIA-LOAD). We will perform single variant and gene-based analyses to replicate the initial findings, and survival analysis to replicate the protective variants. We will select the most promising variants/genes for functional studies.
Non-Technical Research Use Statement:
Family-based approaches led to the identification of disease-causing Alzheimer’s Disease (AD) variants in the genes encoding APP, PSEN1 and PSEN2. The identification of these genes led to the Aβ-cascade hypothesis and to the development of drugs that target this pathway. Recently, we have identified rare coding variants in TREM2, ABCA7, PLD3 and SORL1 with large effect sizes for risk for AD, confirming that rare coding variants play a role in the etiology of AD. In this proposal, we will identify rare risk and protective alleles using sequence data from families densely affected by AD. We hypothesize that these families are enriched for genetic risk factors. We already have sequence data from 695 families (2,462 individuals), which, combined with the ADSP and the NIMH dataset, will lead to a dataset of more than 1,042 families (4,684 individuals). Our preliminary results support the flexibility of this approach and strongly suggest that protective and risk variants with large effect size will be found, which will lead to a better understanding of the biology of the disease.
Investigator:
Greicius, Michael
Institution:
Stanford University School of Medicine
Project Title:
Examining Genetic Associations in Neurodegenerative Diseases
Date of Approval:
December 19, 2024
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We are studying the effects of rare (minor allele frequency
Non-Technical Research Use Statement:
Current genetic understanding of Alzheimer’s Disease (AD) does not fully explain its heritability. The APOE4 allele is a well-established risk factor for the development of Alzheimer’s Disease (AD). However, some individuals who carry APOE4 remain cognitively healthy until advanced ages. Additionally, the cause of mixed dementia pathology development in individuals remains largely unexplained. We aim to identify genetic factors associated with these “protected” and mixed pathology phenotypes.
Investigator:
Hatchwell, Eli
Institution:
Population Bio
Project Title:
Mutational Spectrum of Causal Genes for Neurological/Neurodegenerative Diseases and Endometriosis Identified via High Resolution Genome Wide Copy Number Analysis
Date of Approval:
August 21, 2024
Request status:
Approved
Research use statements:
Technical Research Use Statement:
While single gene rare variants have been shown to play a significant role in Early-Onset Alzheimer’s Disease (EOAD), their role in Late-Onset (LOAD) has not been emphasised. The gene discovery methodology we have developed at Population Bio allows for unbiased exploration of highly informative genomic variants in any cohort of interest. Our approach is based on ultra-high resolution copy number variant (CNV) analysis. We have invested heavily in such analysis on normal populations. These are used as comparators for cohorts of interest, such as LOAD. In our LOAD work, this analysis generated a list of CNVs which were either absent in the normal populations we studied or else present at significantly higher frequency in the LOAD cohort. Such CNVs are routinely annotated to determine if they overlie known genes and/or regulatory regions. As an example, we have discovered a deletion in 3% of our LOAD cases, which is present in
Non-Technical Research Use Statement:
Most of the common conditions that affect large numbers of the general population have a genetic basis. While progress has been rapid in the field of cancer, the same cannot be said for common, non-cancer conditions, such as Late-Onset Alzheimer's Disease (LOAD). It is now clear that not all cases of LOAD represent the same disease in terms of their cause. Our approach has been to consider common diseases as collections of rare subgroups, each of which has a specific cause and which, in due course, will have a specific treatment. We have pioneered and implemented a method to rapidly uncover potentially causal genes in common disorders and will use the data generated from this study to strengthen our discoveries by validating a set of novel candidate genes we have identified in LOAD. Our project will allow us to: 1. Define subsets of disease. 2. Work with pharmaceutical companies to develop drugs that will specifically target each subset of disease. In some cases, disease progression may be halted by the therapies developed. In some cases, reversal and/or cure may be possible.
Investigator:
Roussos, Panagiotis
Institution:
Icahn School of Medicine at Mount Sinai
Project Title:
Higher Order Chromatin and Genetic Risk for Alzheimer's Disease
Date of Approval:
November 21, 2024
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Alzheimer's disease (AD) is the most common form of dementia and is characterized by cognitive impairment and progressive neurodegeneration. Genome-wide association studies of AD have identified more than 70 risk loci; however, a major challenge in the field is that the majority of these risk factors are harbored within non-coding regions where their impact on AD pathogenesis has been difficult to establish. Therefore, the molecular basis of AD development and progression remains elusive and, so far, reliable treatments have not been found. The overarching goal of this proposal is to examine and validate AD-related changes on chromatin accessibility and the 3D genome at the single cell level. Based on recent data from our group and others, we hypothesize that genotype-phenotype associations in AD are causally mediated by cell type-specific alterations in the regulatory mechanisms of gene expression. To test our hypothesis, we propose the following Specific Aims: (1) perform multimodal (i.e., within cell) profiling of the chromatin accessibility and transcriptome at the single cell level to identify cell type-specific AD-related changes on the 3D genome; (2) fine-map AD risk loci to identify causal variants, regulatory regions and genes; (3) functionally validate putative causal variants and regulatory sequences using novel approaches that combine massively parallel reporter assays, CRISPR and single cell assays in neurons and microglia derived from induced pluripotent stem cells; and (4) develop and maintain a community workspace that provides for the rapid dissemination and open evaluation of data, analyses, and outcomes. Overall, our multidisciplinary computational and experimental approach will provide a compendium of functionally and causally validated AD risk loci that has the potential to lead to new insights and avenues for therapeutic development.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) affects half the US population over the age of 85 and despite decades of research, reliable treatments for AD have not been found. The overarching goal of our proposal is to generate multiscale genomics (gene expression and epigenome regulation) data at the single cell level and perform fine mapping to detect and validate causal variants, transcripts and regulatory sequences in AD. The proposed work will bridge the gap in understanding the link among the effects of risk variants on enhancer activity and transcript expression, thus illuminating AD molecular mechanisms and providing new targets for future therapeutic development.
Investigator:
Wainberg, Michael
Institution:
Sinai Health System
Project Title:
Uncovering the causal genetic variants, genes and cell types underlying brain disorders
Date of Approval:
September 5, 2024
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We propose a multifaceted approach to elucidate and interpret genetic risk factors for Alzheimer's disease. First, we propose to perform a whole-genome sequencing meta-analysis of the Alzheimer's Disease Sequencing Project with the UK Biobank and All of Us to associate rare coding and non-coding variants with Alzheimer's disease and related dementias. We will explore a variety of case definitions in the UK Biobank and All of Us, including those based on ICD codes from electronic medical records (inpatient, primary care and/or death), self-report of Alzheimer's disease or Alzheimer's disease and related dementias, and/or family history of Alzheimer's disease or Alzheimer's disease and related dementias. We will perform single-variant, coding-variant burden, and non-coding variant burden tests using the REGENIE genome-wide association study toolkit. Second, we propose to develop statistical and machine learning models that can effectively infer (“fine-map”) the causal gene(s), variant(s), and cell type(s) underlying each association we find, as well as associations from existing genome-wide association studies and other Alzheimer's- and aging-related cohorts found in NIAGADS. In particular, we propose to improve causal gene identification by incorporating knowledge of gene function as a complement to functional genomics. For instance, we plan to develop improved methods for inferring biological networks, particularly from single-cell data, and integrate these networks with the results of the non-coding associations from our first aim to fine-map causal genes. To fine-map causal variants and cell types, we plan to integrate the associations from our first aim with single-nucleus chromatin accessibility data from postmortem brain cohorts to simultaneously infer which variant(s) are causal for each discovered locus and which cell type(s) they act through.
Non-Technical Research Use Statement:
We have a comprehensive plan to understand and explain the genetic factors that contribute to Alzheimer's disease. Our approach involves two main steps. First, we'll analyze genetic information from large research databases to identify rare genetic changes associated with Alzheimer's and related memory disorders. We'll look at both specific changes in genes and other parts of the genetic code. We'll use data from different studies and combine them to get a clearer picture. Second, we'll create advanced computer models that can help us figure out which specific genes, genetic changes, and cell types are responsible for these associations. This will help us pinpoint the most important factors contributing to Alzheimer's disease. We'll also analyze data from previous studies to build a more complete understanding of these genetic links.
Investigator:
Zhao, Zhongming
Institution:
University of Texas Health Science Center at Houston
Project Title:
AIM-AI: an Actionable, Integrated and Multiscale genetic map of Alzheimer's disease via deep learning
Date of Approval:
March 27, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Objectives: The objective of our study is to advance our understanding of the genetic basis of Alzheimer’s Disease (AD) through the analysis of comprehensive genomic datasets such as Whole Exome Sequencing (WES), Whole Genome Sequencing (WGS), single-nuclei RNA sequencing, and Genome-Wide Association Studies (GWAS), as well as the related phenotype. We aim to identify genetic variants that are integral to the development and progression of AD. Study Design: Our approach involves a detailed multi-omics analysis focusing on both coding and non-coding regions within these datasets. We will develop new analytical variables from existing data, ensuring that our research adheres to the established data use limitations and contributes meaningfully to the field of genetic research in AD. Analysis Plan: The plan centers on investigating the correlation between genetic variants and AD, exploring how these variants influence the disease at a genetic level. We will employ cutting-edge computational methods to analyze interactions between these genetic markers and their potential role in AD pathogenesis. The integration of data from multiple sources will be carefully executed to maintain compliance with data use agreements, emphasizing the scientific exploration of AD.
Non-Technical Research Use Statement:
Our research is dedicated to unraveling the genetic components of Alzheimer’s Disease. By analyzing genetic sequences and variations through various genomic datasets, we seek to deepen the scientific understanding of how these genetic elements contribute to AD. The outcomes of this study will be shared with the public, enhancing general knowledge of Alzheimer’s Disease and supporting the global research community in its ongoing efforts to decode this complex condition.
Investigator:
Zhou, Weichen
Institution:
University of Michigan
Project Title:
Explore the functional impact of transposable elements in Alzheimer’s disease and related dementias
Date of Approval:
May 9, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Explore somatic transposable elements and their Alzheimer's disease-related patterns using genomic and phenotypic data from large cohorts: In order to explore the impact of the transposable element in Alzheimer's disease, we propose to conduct a systematic survey in the available large cohorts. The ADSP dataset in NIAGADS (Accession No. NG00067) includes 16,906 whole-genome sequences and 20,504 whole-exome sequences for case-control and family-based studies of Alzheimer's disease from diverse populations, which is a perfect resource to leverage in this project. Under the support of the Michigan Alzheimer's Disease Center, we will request to access NIAGADS. To detect somatic transposable elements in the ADSP dataset, we will employ established computational pipelines to resolve the transposable elements in the sequencing data, MELT and xTEA for WGS and SCRAMble for WES, respectively. Parameters in these tools, for instance, the calling threshold of supporting reads, will be adjusted accordingly to cooperate with the detection of somatic transposable elements in cells at low frequency. To exclude potential germline transposable elements, we will leverage a master set of polymorphic transposable elements from diverse populations, which are based on our previous projects at the Human Genome Structural Variation Consortium, and the case-control information provided by ADSP. We aim to summarize a spectrum of somatic transposable elements that would be Alzheimer's disease-relevant along with various clinical and phenotypic information. To build Alzheimer's disease-related genetic patterns, we will implement Mutect2 (GATK) and Strelka2 to discover SNVs from WGS and WES data and link them with transposable elements in the same haplotype. After obtaining this set of patterns, we will collect phenotypic information from the ADSP dataset to conduct family-based association analysis and gene-burden analysis. RegulomeDB will be used to annotate the effects of non-coding functional impact and regulatory changes for these Alzheimer's disease-related patterns.
Non-Technical Research Use Statement:
This project seeks to explore the connection between the somatic transposable elements in the human genome and Alzheimer’s disease and related dementias. It will leverage large-scale datasets to extensively explore the genome-wide transposable elements and then stratify Alzheimer’s disease-relevant ones by using the rich clinical information from the cohorts. Further analysis pipelines will be built based on the results of the proposed project to investigate the functional impact of these transposable elements on Alzheimer’s disease, which would improve the understanding of genetic causes of Alzheimer’s disease and related dementias.

Total number of samples: 170

Sex	Number of Samples	Percentage
Female	105	61.8%
Male	65	38.2%

AD Status	Number of Samples	Percentage
Control	1	0.6%
Case	123	72.4%
Other	46	27.1%
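The percentages in the sample breakdown above follow directly from the counts and the total of 170 samples. As a small sketch of that arithmetic (the `percentages` helper is a hypothetical name introduced here, and rounding to one decimal place is an assumption based on the figures shown):

```python
def percentages(counts):
    """Return each count as a percentage of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Counts as listed in the sample breakdown.
sex_counts = {"Female": 105, "Male": 65}
ad_counts = {"Control": 1, "Case": 123, "Other": 46}

sex_pct = percentages(sex_counts)
ad_pct = percentages(ad_counts)
```

Because each value is rounded independently, the three AD status percentages add up to 100.1 rather than exactly 100.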

Acknowledgement

Acknowledgment statement for any data distributed by NIAGADS:

For investigators using any data from this dataset:

For investigators using NCRAD Family Study (sa000025) data:
