Description

This dataset includes sequencing data and harmonized phenotypes from cohorts sequenced by the Alzheimer’s Disease Sequencing Project (ADSP) and other Alzheimer’s disease (AD) and related dementias studies. Samples are processed with a common workflow, VCPA (Variant Calling Pipeline and data management tool), which is functionally equivalent to the CCDG/TOPMed pipeline.

Latest Release:

  • The fourteenth release (October 3, 2024) includes 1) R2 WES chrY/M preview pVCFs; 2) R3 WGS CHARGE-generated PCs; and 3) R4 QC’d bi-allelic pVCFs in GDS format, bi-allelic chrX QC’d pVCF, multi-allelic autosomal and chrX QC’d pVCFs, GraphTyper joint-genotyping SV pVCF, and FAVOR annotations.

For additional information about past releases, see the “Data Releases” tab to the left.

Sample Summary per Data Type

Sample Set | Accession | CRAMs/gVCFs | Structural Variant (SV) gVCFs | SNV/Indel pVCF | SV pVCF
ADSP Discovery - WGS | snd10000 | n = 580 | n = 580 | n = 580 | n = 580
ADSP Discovery - WES | snd10000 | n = 10655 | n = 10655 | n = 10655 | NA
ADSP Extension - WGS | snd10001 | n = 3399 | n = 3399 | n = 3399 | n = 3399
ADNI-WGS-1 | snd10002 | n = 809 | n = 809 | n = 809 | n = 809
ADGC AA - WES | snd10003 | n = 3157 | n = 3157 | n = 3157 | NA
FASe Families - WES | snd10004 | n = 1100 | n = 1100 | n = 1100 | NA
Brkanac Families - WES | snd10005 | n = 75 | n = 75 | n = 75 | NA
Miami Families - WES | snd10006 | n = 108 | n = 108 | n = 108 | NA
Columbia WHICAP - WES | snd10007 | n = 3858 | n = 3858 | n = 3858 | NA
Knight ADRC - WES | snd10008 | n = 650 | n = 650 | n = 650 | NA
CBD - WES | snd10009 | n = 346 | n = 346 | n = 346 | NA
PSP - WES | snd10010 | n = 550 | n = 550 | n = 550 | NA
AMP-AD - WGS | snd10011 | n = 1326 | n = 1326 | n = 1326 | n = 1326
UPitt-Kamboh1 - WGS | snd10012 | n = 209 | n = 209 | n = 209 | n = 209
NACC-Genentech - WGS | snd10013 | n = 137 | n = 137 | n = 137 | n = 137
Cache County - WGS | snd10014 | n = 207 | n = 207 | n = 207 | n = 207
PSP-NIH-CurePSP-Tau - WGS | snd10015 | n = 617 | n = 617 | n = 617 | n = 617
PSP-CurePSP-Tau - WGS | snd10016 | n = 886 | n = 886 | n = 886 | n = 886
PSP UCLA - WGS | snd10017 | n = 408 | n = 408 | n = 408 | n = 408
FASe Families - WGS | snd10018 | n = 91 | n = 91 | n = 91 | n = 91
Knight ADRC - WGS | snd10019 | n = 77 | n = 77 | n = 77 | n = 77
ADSP-FUS1 - WGS | snd10020 | n = 8159 | n = 8159 | n = 8159 | n = 8159
ADGC-TARCC - WGS | snd10030 | n = 1017 | n = 1017 | n = 1017 | NA
ADSP-FUS2 - WGS | snd10031 | n = 12612 | n = 12612 | n = 12612 | NA
EOAD1 - WGS | snd10032 | n = 3132 | n = 3132 | n = 3132 | NA
LASI-DAD - WGS | snd10033 | n = 2686 | n = 2686 | n = 2686 | NA
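
The per-cohort counts above can be totaled by sequencing type with a short script. The following is a minimal sketch, not part of the dataset's tooling: the `SAMPLE_SETS` list and the `total_by_type` helper are hypothetical names introduced here, and the counts are transcribed by hand from the CRAMs/gVCFs column of the table.

```python
from collections import defaultdict

# (sample set, accession, sequencing type, CRAM/gVCF sample count).
# Hypothetical structure; values are copied from the table above.
SAMPLE_SETS = [
    ("ADSP Discovery", "snd10000", "WGS", 580),
    ("ADSP Discovery", "snd10000", "WES", 10655),
    ("ADSP Extension", "snd10001", "WGS", 3399),
    ("ADNI-WGS-1", "snd10002", "WGS", 809),
    ("ADGC AA", "snd10003", "WES", 3157),
    ("FASe Families", "snd10004", "WES", 1100),
    ("Brkanac Families", "snd10005", "WES", 75),
    ("Miami Families", "snd10006", "WES", 108),
    ("Columbia WHICAP", "snd10007", "WES", 3858),
    ("Knight ADRC", "snd10008", "WES", 650),
    ("CBD", "snd10009", "WES", 346),
    ("PSP", "snd10010", "WES", 550),
    ("AMP-AD", "snd10011", "WGS", 1326),
    ("UPitt-Kamboh1", "snd10012", "WGS", 209),
    ("NACC-Genentech", "snd10013", "WGS", 137),
    ("Cache County", "snd10014", "WGS", 207),
    ("PSP-NIH-CurePSP-Tau", "snd10015", "WGS", 617),
    ("PSP-CurePSP-Tau", "snd10016", "WGS", 886),
    ("PSP UCLA", "snd10017", "WGS", 408),
    ("FASe Families", "snd10018", "WGS", 91),
    ("Knight ADRC", "snd10019", "WGS", 77),
    ("ADSP-FUS1", "snd10020", "WGS", 8159),
    ("ADGC-TARCC", "snd10030", "WGS", 1017),
    ("ADSP-FUS2", "snd10031", "WGS", 12612),
    ("EOAD1", "snd10032", "WGS", 3132),
    ("LASI-DAD", "snd10033", "WGS", 2686),
]


def total_by_type(sample_sets):
    """Sum the CRAM/gVCF sample counts for each sequencing type."""
    totals = defaultdict(int)
    for _name, _accession, seq_type, n in sample_sets:
        totals[seq_type] += n
    return dict(totals)


if __name__ == "__main__":
    for seq_type, total in sorted(total_by_type(SAMPLE_SETS).items()):
        print(f"{seq_type}: {total} samples")
```

Totals computed this way count the samples listed per sequencing type and may differ from the sample counts quoted for the project-level VCFs in the filesets below.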

Available Filesets

Name | Accession | Latest Release | Description/What’s New
R1 5K, R3 17K, and R4 36K WGS CRAMs/GATK gVCFs and VCF Structural Variant (SV) calls | fsa000001 | NG00067.v10 | Mapped to GRCh38. Updated with R4 sequencing files.
WGS QC Metrics | fsa000001 | NG00067.v10 | Sequencing Data Quality Control Metrics
Phenotypes/Pedigrees | fsa000002 | NG00067.v11 | Phenotypes and Pedigree structures for all sequenced subjects
R1 5K WGS Project Level VCF | fsa000003 | NG00067.v2 | ADSP quality control checked GATK joint called VCF containing 4,788 whole-genomes.
R2 20K WES CRAMs/GATK gVCFs | fsa000004 | NG00067.v3 | Mapped to GRCh38
WES QC Metrics | fsa000004 | NG00067.v3 | Sequencing Data Quality Control Metrics
R2 20K WES Project Level VCF | fsa000005 | NG00067.v7 | ADSP quality control checked GATK joint called VCF containing 20,503 whole-exomes
R3 17K WGS Project Level VCF | fsa000006 | NG00067.v7 | Preview and quality-controlled joint called VCF containing 16,905 whole-genomes.
ADSP R3 17K WGS BioGraph SV Calls | fsa000022 | NG00067.v8 | SV joint genotyping pVCF containing SVs called by BioGraph on 16,841 samples in the ADSP 17k WGS (R3) dataset
GCAD R3 17K and R4 36K WGS GraphTyper SV calls | fsa000023 | NG00067.v8 | SV joint genotyping pVCF containing the merged Manta and Smoove callsets from the ADSP “17k” WGS (R3) and “36k” WGS (R4) datasets
R4 36K WGS Project Level VCF | fsa000026 | NG00067.v11 | Preview and QC joint called VCF containing 36,361 whole-genomes.
ADSP PHC Harmonized Phenotypes | fsa000027 | NG00067.v11 | Harmonized phenotypes for Cognitive, Fluid 8 domains for a subset of cohorts
R4 36K WGS Annotation | fsa000068 | NG00067.v11 | R4 36K WGS Annotation Files

View the File Manifest for a full list of files released in this dataset.

R2 WES Target Regions

Download a copy of the R2 WES target regions: gcad.wes.20650.VCPA1.1.2019.11.01.targetregions.zip

VariXam - Variant Browser

VariXam is an aggregated database and a variant browser that shows genomic variants detected on whole-genome/whole-exome sequence (WGS/WES) data from the ADSP. Browse variants from the R1, R2, R3, and R4 joint genotype called VCFs through the open access browser: https://varixam.niagads.org/