To access this data, please log into DSS and submit an application.
Within the application, add this dataset (accession NG00129) in the “Choose a Dataset” section.
Once approved, you will be able to log in and access the data within the DARM portal.

Description

This ADSP release, containing 170 whole exomes, includes (1) sequencing read alignments in CRAM (compressed BAM) format for the 170 newly sequenced samples and (2) genomic Variant Call Format (gVCF) files generated by GATK 4.1.1 on these samples.
The project-level VCF (pVCF) released here is provided as a preview ahead of the formal ADSP quality-controlled release, which is expected in a few months. Checks of the dataset are ongoing, and the released files may change in the full quality-controlled release.
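As an illustration of the gVCF format mentioned above, the following is a minimal Python sketch, not an official ADSP or NIAGADS tool. It assumes the standard GATK gVCF layout, in which reference blocks carry only the symbolic `<NON_REF>` allele and an `END=` entry in the INFO column. The names `GvcfRecord`, `parse_gvcf_lines`, and `open_gvcf` are hypothetical and introduced only for this example.

```python
import gzip
from typing import Iterable, Iterator, NamedTuple, Tuple


class GvcfRecord(NamedTuple):
    """One gVCF data line, reduced to the fields used in this sketch."""
    chrom: str
    pos: int
    end: int
    ref: str
    alts: Tuple[str, ...]   # real ALT alleles, with <NON_REF> dropped
    is_ref_block: bool      # True when the only ALT allele is <NON_REF>


def parse_gvcf_lines(lines: Iterable[str]) -> Iterator[GvcfRecord]:
    """Yield records from gVCF text lines, skipping headers and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
        # Drop the symbolic <NON_REF> allele that GATK adds to every record.
        alts = tuple(a for a in alt.split(",") if a != "<NON_REF>")
        # Default end is the last reference base; END= in INFO overrides it.
        end = int(pos) + len(ref) - 1
        for item in info.split(";"):
            if item.startswith("END="):
                end = int(item[4:])
        yield GvcfRecord(chrom, int(pos), end, ref, alts, is_ref_block=not alts)


def open_gvcf(path: str):
    """Open a plain-text or gzip/bgzip-compressed gVCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")
```

For example, `sum(r.is_ref_block for r in parse_gvcf_lines(open_gvcf(path)))` would count the reference blocks in a single-sample file, where `path` is whatever local file name the downloaded gVCF has.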

Sample Summary per Data Type

Sample Set | Accession | Data Type | Number of Samples
NCRAD Families WES | snd10039 | WES | 170

Available Filesets

Fileset | Accession | Latest Release | Description
NCRAD Families WES CRAMs and GATK gVCFs | fsa000037 | NG00129.v1 | WES CRAMs and GATK gVCFs. Sequencing Data Quality Control Metrics
NCRAD Families Phenotype and Manifest Files | fsa000038 | NG00129.v1 | Phenotypes for sequenced subjects and connecting family members
NCRAD Families WES Project Level VCF | fsa000039 | NG00129.v1 | Preview pVCF

View the File Manifest for a full list of files released in this dataset.

This dataset includes WES data on 170 sequenced subjects. Phenotype data also includes connecting family data for the sequenced subjects.

Sample Set | Accession | Number of Subjects
NCRAD Families WES | snd10039 | 170

This release includes whole exome CRAMs, gVCFs, preview pVCFs, and phenotypes for 170 NCRAD Families samples.

NCRAD Families Whole Exome Sequencing (WES) Preview

Consent Level | Number of Subjects
GRU-IRB-PUB | 170

Visit the Data Use Limitations page for definitions of the consent levels above.

Total number of approved DARs: 3
  • Investigator:
    Hatchwell, Eli
    Institution:
    Population Bio
    Project Title:
    Mutational Spectrum of Causal Genes for Neurological/Neurodegenerative Diseases and Endometriosis Identified via High Resolution Genome Wide Copy Number Analysis
    Date of Approval:
    September 7, 2023
    Request status:
    Approved
    Research use statements:
    Technical Research Use Statement:
    While single gene rare variants have been shown to play a significant role in Early-Onset Alzheimer’s Disease (EOAD), their role in Late-Onset Alzheimer’s Disease (LOAD) has not been emphasised. The gene discovery methodology we have developed at Population Bio allows for unbiased exploration of highly informative genomic variants in any cohort of interest. Our approach is based on ultra-high resolution copy number variant (CNV) analysis. We have invested heavily in such analysis on normal populations. These are used as comparators for cohorts of interest, such as LOAD. In our LOAD work, this analysis generated a list of CNVs which were either absent in the normal populations we studied or else present at significantly higher frequency in the LOAD cohort. Such CNVs are routinely annotated to determine if they overlie known genes and/or regulatory regions. As an example, we have discovered a deletion in 3% of our LOAD cases, which is present in <= 1% of normals. This deletion disrupts a transcription factor binding site in the intron of a gene, which, via GeneHancer, is known to control exon 1 of the gene. The gene in question is novel to LOAD, and is an important metabolic gene, with known biology. It is vital that we validate this finding by analysis of independent LOAD datasets. In addition, we wish to validate other genes discovered in the same manner. We have very deep experience in analyzing WGS/WES datasets. Our focus will be to pull out of the available WGS/WES datasets all the variants for the candidate genes of interest. Such variants, including SNVs, indels and CNVs (called using a variety of tools we have experience with), will be analyzed by reference to databases of normal individuals: (i) CNVs, by reference to our own internal database as well as gnomAD (https://gnomad.broadinstitute.org) CNV data and DGV (http://dgv.tcag.ca); (ii) SNVs/indels, by reference to gnomAD. These analyses will allow us to determine whether there exists a mutational burden for our candidate genes of interest in independent LOAD cohorts, and will serve as validation/refutation. The main phenotype of interest will be definitive diagnoses of LOAD, based on neuropathological and clinical cognitive analyses.
    Non-Technical Research Use Statement:
    Most of the common conditions that affect large numbers of the general population have a genetic basis. While progress has been rapid in the field of cancer, the same cannot be said for common, non-cancer, conditions, such as Late-Onset Alzheimer's Disease (LOAD). It is pretty clear now that not all cases of LOAD represent the same disease, in terms of what is the cause. Our approach has been to consider common diseases as collections of rare subgroups, each of which has a specific cause and which, in due course, will have a specific treatment. We have pioneered and implemented a method to rapidly uncover potentially causal genes in common disorders and will use the data generated from this study to strengthen our discoveries, by validating a set of novel candidate genes we have identified in LOAD. Our project will allow us to: (1) define subsets of disease; (2) work with pharmaceutical companies to develop drugs that will specifically target each subset of disease. In some cases, disease progression may be halted by the therapies developed. In some cases, reversal and/or cure may be possible.
  • Investigator:
    Roussos, Panagiotis
    Institution:
    Icahn School of Medicine at Mount Sinai
    Project Title:
    Higher Order Chromatin and Genetic Risk for Alzheimer's Disease
    Date of Approval:
    August 16, 2023
    Request status:
    Approved
    Research use statements:
    Technical Research Use Statement:
    Alzheimer's disease (AD) is the most common form of dementia and is characterized by cognitive impairment and progressive neurodegeneration. Genome-wide association studies of AD have identified more than 70 risk loci; however, a major challenge in the field is that the majority of these risk factors are harbored within non-coding regions where their impact on AD pathogenesis has been difficult to establish. Therefore, the molecular basis of AD development and progression remains elusive and, so far, reliable treatments have not been found. The overarching goal of this proposal is to examine and validate AD-related changes on chromatin accessibility and the 3D genome at the single cell level. Based on recent data from our group and others, we hypothesize that genotype-phenotype associations in AD are causally mediated by cell type-specific alterations in the regulatory mechanisms of gene expression. To test our hypothesis, we propose the following Specific Aims: (1) perform multimodal (i.e., within cell) profiling of the chromatin accessibility and transcriptome at the single cell level to identify cell type-specific AD-related changes on the 3D genome; (2) fine-map AD risk loci to identify causal variants, regulatory regions and genes; (3) functionally validate putative causal variants and regulatory sequences using novel approaches that combine massively parallel reporter assays, CRISPR and single cell assays in neurons and microglia derived from induced pluripotent stem cells; and (4) develop and maintain a community workspace that provides for the rapid dissemination and open evaluation of data, analyses, and outcomes. Overall, our multidisciplinary computational and experimental approach will provide a compendium of functionally and causally validated AD risk loci that has the potential to lead to new insights and avenues for therapeutic development.
    Non-Technical Research Use Statement:
    Alzheimer’s disease (AD) affects half the US population over the age of 85 and despite decades of research, reliable treatments for AD have not been found. The overarching goal of our proposal is to generate multiscale genomics (gene expression and epigenome regulation) data at the single cell level and perform fine mapping to detect and validate causal variants, transcripts and regulatory sequences in AD. The proposed work will bridge the gap in understanding the link among the effects of risk variants on enhancer activity and transcript expression, thus illuminating AD molecular mechanisms and providing new targets for future therapeutic development.
  • Investigator:
    Wainberg, Michael
    Institution:
    Sinai Health System
    Project Title:
    Uncovering the causal genetic variants, genes and cell types underlying brain disorders
    Date of Approval:
    April 3, 2024
    Request status:
    Approved
    Research use statements:
    Technical Research Use Statement:
    We propose a multifaceted approach to elucidate and interpret genetic risk factors for Alzheimer's disease. First, we propose to perform a whole-genome sequencing meta-analysis of the Alzheimer's Disease Sequencing Project with the UK Biobank and All of Us to associate rare coding and non-coding variants with Alzheimer's disease and related dementias. We will explore a variety of case definitions in the UK Biobank and All of Us, including those based on ICD codes from electronic medical records (inpatient, primary care and/or death), self-report of Alzheimer's disease or Alzheimer's disease and related dementias, and/or family history of Alzheimer's disease or Alzheimer's disease and related dementias. We will perform single-variant, coding-variant burden, and non-coding variant burden tests using the REGENIE genome-wide association study toolkit. Second, we propose to develop statistical and machine learning models that can effectively infer (“fine-map”) the causal gene(s), variant(s), and cell type(s) underlying each association we find, as well as associations from existing genome-wide association studies and other Alzheimer's- and aging-related cohorts found in NIAGADS. In particular, we propose to improve causal gene identification by incorporating knowledge of gene function as a complement to functional genomics. For instance, we plan to develop improved methods for inferring biological networks, particularly from single-cell data, and integrate these networks with the results of the non-coding associations from our first aim to fine-map causal genes. To fine-map causal variants and cell types, we plan to integrate the associations from our first aim with single-nucleus chromatin accessibility data from postmortem brain cohorts to simultaneously infer which variant(s) are causal for each discovered locus and which cell type(s) they act through.
    Non-Technical Research Use Statement:
    We have a comprehensive plan to understand and explain the genetic factors that contribute to Alzheimer's disease. Our approach involves two main steps. First, we'll analyze genetic information from large research databases to identify rare genetic changes associated with Alzheimer's and related memory disorders. We'll look at both specific changes in genes and other parts of the genetic code. We'll use data from different studies and combine them to get a clearer picture. Second, we'll create advanced computer models that can help us figure out which specific genes, genetic changes, and cell types are responsible for these associations. This will help us pinpoint the most important factors contributing to Alzheimer's disease. We'll also analyze data from previous studies to build a more complete understanding of these genetic links.

Acknowledgment statement for any data distributed by NIAGADS:

Data for this study were prepared, archived, and distributed by the National Institute on Aging Alzheimer’s Disease Data Storage Site (NIAGADS) at the University of Pennsylvania (U24-AG041689), funded by the National Institute on Aging.

Use the study-specific acknowledgement statements below (as applicable):

For investigators using any data from this dataset:

Please cite/reference the use of NIAGADS data by including the accession NG00129.

For investigators using NCRAD Family Study (sa000025) data:

Samples from the National Centralized Repository for Alzheimer’s Disease and Related Dementias (NCRAD), which receives government support under a cooperative agreement grant (U24AG021886) awarded by the National Institute on Aging (NIA), were used in this study. We thank the participants and their families, whose help and participation made this work possible.

For use of data in NG00117: Quality control procedures and data preparation on the GWAS were conducted by the Alzheimer’s Disease Genetics Consortium (ADGC) (U01AG032984) and the NIA Genetics of Alzheimer’s Disease Storage Site (NIAGADS) (U24-AG041689), both funded by NIA.

For use of data in NG00129: Data processing and quality control procedures on the whole-exome dataset were conducted by the Genome Center for Alzheimer’s Disease (GCAD) (U54AG052427) and the NIA Genetics of Alzheimer’s Disease Storage Site (NIAGADS) (U24-AG041689), both funded by NIA.