Description

This dataset includes sequencing data and harmonized phenotypes from cohorts sequenced by the Alzheimer’s Disease Sequencing Project (ADSP) and other Alzheimer’s disease (AD) and related dementias studies. Samples are processed using a common workflow called VCPA (Variant Calling Pipeline and data management tool), which is functionally equivalent to the CCDG/TOPMed pipeline.

Latest Release:

  • The sixteenth release, NG00067.v15 (December 19, 2024), includes the third release of ADSP-PHC data. This release provides harmonized phenotypes for select cohorts in the cognitive, fluid biomarker, neuropathology, cardiovascular risk factor, and neuroimaging (DTI, FLAIR, PET, D1) domains.

For additional information about past releases, see the “Data Releases” tab to the left.

Sample Summary per Data Type

| Sample Set | Accession | CRAMs/gVCFs | Structural Variant (SV) gVCFs | SNV/Indel pVCF | SV pVCF |
|---|---|---|---|---|---|
| ADSP Discovery - WGS | snd10000 | n = 579 | n = 579 | n = 579 | n = 579 |
| ADSP Discovery - WES | snd10000 | n = 10,655 | NA | n = 10,655 | NA |
| ADSP Extension - WGS | snd10001 | n = 3,399 | n = 3,399 | n = 3,399 | n = 3,399 |
| ADNI-WGS-1 | snd10002 | n = 808 | n = 808 | n = 808 | n = 808 |
| ADGC AA - WES | snd10003 | n = 3,157 | NA | n = 3,157 | NA |
| FASe Families - WES | snd10004 | n = 1,100 | NA | n = 1,100 | NA |
| Brkanac Families - WES | snd10005 | n = 75 | NA | n = 75 | NA |
| Miami Families - WES | snd10006 | n = 108 | NA | n = 108 | NA |
| Columbia WHICAP - WES | snd10007 | n = 3,858 | NA | n = 3,858 | NA |
| Knight ADRC - WES | snd10008 | n = 650 | NA | n = 650 | NA |
| CBD - WES | snd10009 | n = 346 | NA | n = 346 | NA |
| PSP - WES | snd10010 | n = 550 | NA | n = 550 | NA |
| AMP-AD - WGS | snd10011 | n = 1,326 | n = 1,326 | n = 1,326 | n = 1,326 |
| UPitt-Kamboh1 - WGS | snd10012 | n = 209 | n = 209 | n = 209 | n = 209 |
| NACC-Genentech - WGS | snd10013 | n = 137 | n = 137 | n = 137 | n = 137 |
| Cache County - WGS | snd10014 | n = 207 | n = 207 | n = 207 | n = 207 |
| PSP-NIH-CurePSP-Tau - WGS | snd10015 | n = 617 | n = 617 | n = 617 | n = 617 |
| PSP-CurePSP-Tau - WGS | snd10016 | n = 886 | n = 886 | n = 886 | n = 886 |
| PSP UCLA - WGS | snd10017 | n = 408 | n = 408 | n = 408 | n = 408 |
| FASe Families - WGS | snd10018 | n = 91 | n = 91 | n = 91 | n = 91 |
| Knight ADRC - WGS | snd10019 | n = 77 | n = 77 | n = 77 | n = 77 |
| ADSP-FUS1 - WGS | snd10020 | n = 8,159 | n = 8,159 | n = 8,159 | n = 8,159 |
| ADGC-TARCC - WGS | snd10030 | n = 1,017 | n = 1,017 | n = 1,017 | n = 1,107 |
| ADSP-FUS2 - WGS | snd10031 | n = 12,612 | n = 12,612 | n = 12,612 | n = 2,612 |
| EOAD1 - WGS | snd10032 | n = 3,131 | n = 3,131 | n = 3,131 | n = 3,131 |
| LASI-DAD - WGS | snd10033 | n = 2,686 | n = 2,686 | n = 2,686 | n = 2,686 |
| Pitt-Kamboh-2 - WGS | snd10091 | n = 207 | n = 207 | n = 207 | NA |
| GARD1 - WGS | snd10092 | n = 2,000 | n = 2,000 | n = 2,000 | NA |
| ASPREE1 - WGS | snd10093 | n = 2,735 | n = 2,735 | n = 2,735 | NA |
| ADSP-FUS3 - WGS | snd10094 | n = 8,285 | n = 8,285 | n = 8,285 | NA |
| EOAD2 - WGS | snd10095 | n = 1,183 | n = 1,183 | n = 1,183 | NA |
| Wellderly - WGS | snd10096 | n = 1,147 | n = 1,147 | n = 1,147 | NA |
| AZAPOE - WGS | snd10097 | n = 88 | n = 88 | n = 88 | NA |
| Amyloid-WU - WGS | snd10098 | n = 1,070 | n = 1,070 | n = 1,070 | NA |
| UAB-ADRC - WGS | snd10099 | n = 17 | n = 17 | n = 17 | NA |
| Amyloid-Pitt - WGS | snd10100 | n = 798 | n = 798 | n = 798 | NA |
| FASe-Families2 - WGS | snd10101 | n = 646 | n = 646 | n = 646 | NA |
| HABS-HD1 - WGS | snd10102 | n = 1,359 | n = 1,359 | n = 1,359 | NA |
| APOEExtremes1 - WGS | snd10103 | n = 21 | n = 21 | n = 21 | NA |
| EFIGA1 - WGS | snd10104 | n = 1,373 | n = 1,373 | n = 1,373 | NA |
| NIA-AD-FBS1 - WGS | snd10105 | n = 997 | n = 997 | n = 997 | NA |
| WHICAP1 - WGS | snd10106 | n = 232 | n = 232 | n = 232 | NA |

Available Filesets

| Name | Accession | Latest Release | Description/What’s New |
|---|---|---|---|
| R1 5K, R3 17K, R4 36K, and R5 58K WGS CRAMs/GATK gVCFs and VCF Structural Variant (SV) calls | fsa000001 | NG00067.v14 | Mapped to GRCh38. Updated with R5 sequencing files. |
| WGS QC Metrics | fsa000001 | NG00067.v14 | Sequencing Data Quality Control Metrics |
| Phenotypes/Pedigrees | fsa000002 | NG00067.v14 | Phenotypes and Pedigree structures for all sequenced subjects |
| R1 5K WGS Project Level VCF | fsa000003 | NG00067.v2 | ADSP quality control checked GATK joint called VCF containing 4,788 whole-genomes. |
| R2 20K WES CRAMs/GATK gVCFs | fsa000004 | NG00067.v3 | Mapped to GRCh38 |
| WES QC Metrics | fsa000004 | NG00067.v3 | Sequencing Data Quality Control Metrics |
| R2 20K WES Project Level VCF | fsa000005 | NG00067.v7 | ADSP quality control checked GATK joint called VCF containing 20,503 whole-exomes |
| R3 17K WGS Project Level VCF | fsa000006 | NG00067.v7 | Preview and quality-controlled joint called VCF containing 16,905 whole-genomes. |
| ADSP R3 17K WGS BioGraph SV Calls | fsa000022 | NG00067.v8 | SV joint genotyping pVCF containing SVs called by BioGraph on 16,841 samples in the ADSP 17k WGS (R3) dataset |
| GCAD R3 17K and R4 36K WGS GraphTyper SV calls | fsa000023 | NG00067.v8 | SV joint genotyping pVCF containing the merged Manta and Smoove callsets from the ADSP “17k” WGS (R3) and “36k” WGS (R4) datasets |
| R4 36K WGS Project Level VCF | fsa000026 | NG00067.v14 | Preview and QC joint called VCF containing 36,361 whole-genomes. |
| ADSP PHC Harmonized Phenotypes | fsa000027 | NG00067.v15 | Harmonized phenotypes of 8 domains for a subset of cohorts |
| R4 36K WGS Annotation & Reference Panel | fsa000068 | NG00067.v14 | R4 36K WGS Annotation & LD Reference Panel |
| R5 58K WGS Project Level VCF | fsa000116 | NG00067.v14 | Preview VCF containing 58,507 whole-genomes. |

View the File Manifest for a full list of files released in this dataset.

R2 WES Target Regions

Download a copy of the R2 WES target regions: gcad.wes.20650.VCPA1.1.2019.11.01.targetregions.zip

VariXam - Variant Browser

VariXam is an aggregated database and variant browser that shows genomic variants detected in whole-genome and whole-exome sequencing (WGS/WES) data from the ADSP. Browse variants from the R1, R2, R3, and R4 joint-genotyped VCFs through the open-access browser: https://varixam.niagads.org/