Description

This dataset includes sequencing data and harmonized phenotypes from cohorts sequenced by the Alzheimer’s Disease Sequencing Project (ADSP) and other Alzheimer’s disease (AD) and related dementias studies. Samples are processed using a common workflow called VCPA (Variant Calling Pipeline and data management tool), which is functionally equivalent to the CCDG/TOPMed pipeline.

Latest Release:

  • The twenty-first release, NG00067.v20 (March 12, 2026), includes (1) an expansion of the ADSP R5 WGS quality-controlled (QCed) files, including autosomal (by consent) and ChrX pVCFs, Genomic Data Structure (GDS) formatted files, and the replicate analysis; (2) additional ADSP R5 deliverables, including jointly called structural variants, a linkage disequilibrium reference panel, and imputation panels; (3) the ADSP-PHC release 4 with new cohorts and harmonized domains; and (4) an update to two previously released files.

For additional information about past releases, see the “Data Releases” tab to the left.

Sample Summary per Data Type

| Sample Set | Accession | CRAMs/gVCFs | Structural Variant (SV) gVCFs | SNV/Indel pVCF | SV pVCF |
|---|---|---|---|---|---|
| ADSP Discovery - WGS | snd10000 | n = 579 | n = 579 | n = 579 | n = 579 |
| ADSP Discovery - WES | snd10000 | n = 10,655 | NA | n = 10,655 | NA |
| ADSP Extension - WGS | snd10001 | n = 3,399 | n = 3,399 | n = 3,399 | n = 3,399 |
| ADNI-WGS-1 | snd10002 | n = 808 | n = 808 | n = 808 | n = 808 |
| ADGC AA - WES | snd10003 | n = 3,157 | NA | n = 3,157 | NA |
| FASe Families - WES | snd10004 | n = 1,100 | NA | n = 1,100 | NA |
| Brkanac Families - WES | snd10005 | n = 75 | NA | n = 75 | NA |
| Miami Families - WES | snd10006 | n = 108 | NA | n = 108 | NA |
| Columbia WHICAP - WES | snd10007 | n = 3,858 | NA | n = 3,858 | NA |
| Knight ADRC - WES | snd10008 | n = 650 | NA | n = 650 | NA |
| CBD - WES | snd10009 | n = 346 | NA | n = 346 | NA |
| PSP - WES | snd10010 | n = 550 | NA | n = 550 | NA |
| AMP-AD - WGS | snd10011 | n = 1,326 | n = 1,326 | n = 1,326 | n = 1,326 |
| UPitt-Kamboh1 - WGS | snd10012 | n = 209 | n = 209 | n = 209 | n = 209 |
| NACC-Genentech - WGS | snd10013 | n = 137 | n = 137 | n = 137 | n = 137 |
| Cache County - WGS | snd10014 | n = 207 | n = 207 | n = 207 | n = 207 |
| PSP-NIH-CurePSP-Tau - WGS | snd10015 | n = 617 | n = 617 | n = 617 | n = 617 |
| PSP-CurePSP-Tau - WGS | snd10016 | n = 886 | n = 886 | n = 886 | n = 886 |
| PSP UCLA - WGS | snd10017 | n = 408 | n = 408 | n = 408 | n = 408 |
| FASe Families - WGS | snd10018 | n = 91 | n = 91 | n = 91 | n = 91 |
| Knight ADRC - WGS | snd10019 | n = 77 | n = 77 | n = 77 | n = 77 |
| ADSP-FUS1 - WGS | snd10020 | n = 8,159 | n = 8,159 | n = 8,159 | n = 8,159 |
| ADGC-TARCC - WGS | snd10030 | n = 1,017 | n = 1,017 | n = 1,017 | n = 1,107 |
| ADSP-FUS2 - WGS | snd10031 | n = 12,612 | n = 12,612 | n = 12,612 | n = 2,612 |
| EOAD1 - WGS | snd10032 | n = 3,131 | n = 3,131 | n = 3,131 | n = 3,131 |
| LASI-DAD - WGS | snd10033 | n = 2,686 | n = 2,686 | n = 2,686 | n = 2,686 |
| Pitt-Kamboh-2 - WGS | snd10091 | n = 207 | n = 207 | n = 207 | n = 207 |
| GARD1 - WGS | snd10092 | n = 2,000 | n = 2,000 | n = 2,000 | n = 2,000 |
| ASPREE1 - WGS | snd10093 | n = 2,734 | n = 2,734 | n = 2,734 | n = 2,734 |
| ADSP-FUS3 - WGS | snd10094 | n = 8,285 | n = 8,285 | n = 8,285 | n = 8,285 |
| EOAD2 - WGS | snd10095 | n = 1,183 | n = 1,183 | n = 1,183 | n = 1,183 |
| Wellderly - WGS | snd10096 | n = 1,147 | n = 1,147 | n = 1,147 | n = 1,147 |
| AZAPOE - WGS | snd10097 | n = 88 | n = 88 | n = 88 | n = 88 |
| Amyloid-WU - WGS | snd10098 | n = 1,070 | n = 1,070 | n = 1,070 | n = 1,070 |
| UAB-ADRC - WGS | snd10099 | n = 17 | n = 17 | n = 17 | n = 17 |
| Amyloid-Pitt - WGS | snd10100 | n = 798 | n = 798 | n = 798 | n = 798 |
| FASe-Families2 - WGS | snd10101 | n = 646 | n = 646 | n = 646 | n = 646 |
| HABS-HD1 - WGS | snd10102 | n = 1,359 | n = 1,359 | n = 1,359 | n = 1,359 |
| APOEExtremes1 - WGS | snd10103 | n = 21 | n = 21 | n = 21 | n = 21 |
| EFIGA1 - WGS | snd10104 | n = 1,373 | n = 1,373 | n = 1,373 | n = 1,373 |
| NIA-AD-FBS1 - WGS | snd10105 | n = 997 | n = 997 | n = 997 | n = 997 |
| WHICAP1 - WGS | snd10106 | n = 232 | n = 232 | n = 232 | n = 232 |

Available Filesets

| Name | Accession | Latest Release | Description/What’s New |
|---|---|---|---|
| R1 5K, R3 17K, R4 36K, and R5 58K WGS CRAMs/GATK gVCFs and VCF Structural Variant (SV) calls | fsa000001 | NG00067.v14 | Mapped to GRCh38. Updated with R5 sequencing files |
| WGS QC Metrics | fsa000001 | NG00067.v14 | Sequencing Data Quality Control Metrics |
| Phenotypes/Pedigrees | fsa000002 | NG00067.v19 | Phenotypes and Pedigree structures for all sequenced subjects |
| R1 5K WGS Project Level VCF | fsa000003 | NG00067.v2 | ADSP quality control checked GATK joint called VCF containing 4,788 whole-genomes |
| R2 20K WES CRAMs/GATK gVCFs | fsa000004 | NG00067.v3 | Mapped to GRCh38 |
| WES QC Metrics | fsa000004 | NG00067.v3 | Sequencing Data Quality Control Metrics |
| R2 20K WES Project Level VCF | fsa000005 | NG00067.v17 | ADSP quality control checked GATK joint called VCF containing 20,503 whole-exomes |
| R3 17K WGS Project Level VCF | fsa000006 | NG00067.v7 | Preview and quality-controlled joint called VCF containing 16,905 whole-genomes |
| ADSP R3 17K WGS BioGraph SV Calls | fsa000022 | NG00067.v8 | SV joint genotyping pVCF containing SVs called by BioGraph on 16,841 samples in the ADSP 17k WGS (R3) dataset |
| GCAD R3 17K, R4 36K, and R5 58K WGS GraphTyper SV calls | fsa000023 | NG00067.v20 | SV joint genotyping pVCF containing the merged Manta and Smoove callsets from the ADSP “17k” WGS (R3), “36k” WGS (R4), and “58k” WGS (R5) datasets |
| R4 36K WGS Project Level VCF | fsa000026 | NG00067.v17 | Preview and QC joint called VCF containing 36,361 whole-genomes |
| ADSP PHC Harmonized Phenotypes | fsa000027 | NG00067.v19 | Harmonized phenotypes of 10 domains for a subset of cohorts |
| R4 36K WGS Annotation & Reference Panel | fsa000068 | NG00067.v14 | R4 36K WGS Annotation & LD Reference Panel (open access files) |
| R5 58K WGS Project Level VCF | fsa000116 | NG00067.v20 | Preview and QC joint called VCF containing 58,506 whole-genomes |
| R5 58K Reference Panel | fsa000165 | NG00067.v20 | R5 58K WGS LD Reference Panel (open access files) |
| R5 58K Imputation Panel | fsa000166 | NG00067.v20 | R5 58K WGS LD Imputation Panel |

View the File Manifest for a full list of files released in this dataset.

R2 WES Target Regions

Download a copy of the R2 WES target regions: gcad.wes.20650.VCPA1.1.2019.11.01.targetregions.zip
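
Target-region files are commonly used to restrict WES variants to the captured intervals. The sketch below illustrates that lookup under stated assumptions: it assumes the regions are available as BED-style lines (chromosome, 0-based start, exclusive end), which this page does not confirm for the contents of the zip archive, and the function names `load_regions` and `in_target` are hypothetical helpers, not part of any ADSP or VCPA tool.

```python
import bisect
from collections import defaultdict


def load_regions(lines):
    """Parse BED-style lines into merged, sorted intervals per chromosome.

    ASSUMPTION: each line is 'chrom start end' (0-based start, exclusive end).
    The actual layout of the R2 target-region file may differ.
    """
    raw = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw[chrom].append((int(start), int(end)))

    regions = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                # Overlapping or touching intervals are merged into one.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        # Store parallel lists so bisect can search the start coordinates.
        regions[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return regions


def in_target(regions, chrom, pos):
    """Return True if a 1-based (VCF-style) position lies in a target region."""
    if chrom not in regions:
        return False
    starts, ends = regions[chrom]
    zero_based = pos - 1
    # Index of the last interval whose start is <= the position.
    i = bisect.bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < ends[i]
```

With regions loaded from the unzipped file (for example, `load_regions(open(path))`, where `path` is wherever the archive was extracted), `in_target(regions, "chr1", 12345)` could be used to filter variant records before analysis. Merging intervals up front keeps each lookup to a single binary search.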

VariXam - Variant Browser

VariXam is an aggregated database and variant browser that shows genomic variants detected in whole-genome/whole-exome sequence (WGS/WES) data from the ADSP. Browse variants from the R1, R2, R3, R4, and R5 joint-genotype-called VCFs through the open access browser: https://varixam.niagads.org/