Description

This dataset includes sequencing data from samples sequenced by the Alzheimer’s Disease Sequencing Project (ADSP) and other Alzheimer’s disease (AD) and related dementias studies. Samples are processed using a common workflow called VCPA (Variant Calling Pipeline and data management tool), which is functionally equivalent to the CCDG/TOPMed pipeline.

Data Releases:

  1. The first release (July 30, 2018) included CRAMs, gVCFs, and phenotypes for 4,789 whole genomes. These data were called by GCAD using the VCPA1.0 pipeline (version NG00067.v0).
  2. The second release (October 30, 2018) included an ADSP quality-controlled, project-level VCF for the 4,789 whole genomes previously released (version NG00067.v1).
  3. The third release (February 18, 2020) included CRAMs, gVCFs, and phenotypes for 19,922 whole exomes. These data were called by GCAD using the VCPA1.1 pipeline (version NG00067.v2).
  4. The fourth release (September 24, 2020) included CRAMs, gVCFs, and phenotypes for an additional 582 newly consented samples, as well as an ADSP quality-controlled, project-level VCF for the 20,504 whole exomes (version NG00067.v3).
  5. The fifth release (November 24, 2020) included an update to the consent of 104 subjects and the correction of two files pertaining to the 4,789 whole-genome dataset (version NG00067.v4).

Sample Summary per Data Type

| Sample Set | Accession | CRAMs | gVCFs | GATK Called Genotypes |
|---|---|---|---|---|
| ADSP Discovery - WGS | snd10000 | n = 580 | n = 580 | n = 580 |
| ADSP Discovery - WES | snd10000 | n = 10657 | n = 10657 | n = 10657 |
| ADSP Extension - WGS | snd10001 | n = 3400 | n = 3400 | n = 3400 |
| ADNI-WGS-1 | snd10002 | n = 809 | n = 809 | n = 809 |
| ADGC AA - WES | snd10003 | n = 3157 | n = 3157 | n = 3157 |
| FASe Families - WES | snd10004 | n = 1100 | n = 1100 | n = 1100 |
| Brkanac Families - WES | snd10005 | n = 75 | n = 75 | n = 75 |
| Miami Families - WES | snd10006 | n = 108 | n = 108 | n = 108 |
| Columbia WHICAP - WES | snd10007 | n = 3861 | n = 3861 | n = 3861 |
| Knight ADRC - WES | snd10008 | n = 650 | n = 650 | n = 650 |
| CBD - WES | snd10009 | n = 346 | n = 346 | n = 346 |
| PSP - WES | snd10010 | n = 550 | n = 550 | n = 550 |
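
As a quick consistency check, the per-set counts above can be summed and compared with the totals quoted in the release notes (4,789 whole genomes and 20,504 whole exomes). The following is a minimal Python sketch, not part of the dataset tooling: the variable names are illustrative, and the data type assigned to each set (for example, treating ADNI-WGS-1 as WGS) is an assumption read off the sample set names.

```python
# Counts copied from the "Sample Summary per Data Type" table.
# The data-type label per set is an assumption inferred from the set name.
sample_sets = [
    ("ADSP Discovery - WGS", "WGS", 580),
    ("ADSP Discovery - WES", "WES", 10657),
    ("ADSP Extension - WGS", "WGS", 3400),
    ("ADNI-WGS-1", "WGS", 809),
    ("ADGC AA - WES", "WES", 3157),
    ("FASe Families - WES", "WES", 1100),
    ("Brkanac Families - WES", "WES", 75),
    ("Miami Families - WES", "WES", 108),
    ("Columbia WHICAP - WES", "WES", 3861),
    ("Knight ADRC - WES", "WES", 650),
    ("CBD - WES", "WES", 346),
    ("PSP - WES", "WES", 550),
]


def total_for(data_type: str) -> int:
    """Sum the sample counts of every set labeled with the given data type."""
    return sum(n for _, kind, n in sample_sets if kind == data_type)


wgs_total = total_for("WGS")
wes_total = total_for("WES")

# Totals quoted in the release notes.
print("WGS total matches 4,789:", wgs_total == 4789)
print("WES total matches 20,504:", wes_total == 20504)
# Third-release exomes plus the 582 samples added in the fourth release.
print("19,922 + 582 matches WES total:", 19922 + 582 == wes_total)
```

All three checks hold for the counts listed here, so the table and the release notes agree on the number of genomes and exomes.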

Available Filesets

| Name | Accession | Latest Release | Description/What’s New |
|---|---|---|---|
| WGS CRAMs/GATK gVCFs | fsa000001 | NG00067.v2 | Mapped to GRCh38 |
| WGS QC Metrics | fsa000001 | NG00067.v2 | Sequencing Data Quality Control Metrics |
| Phenotypes/Pedigrees | fsa000002 | NG00067.v3 | Phenotypes and Pedigree structures for all sequenced subjects |
| WGS Project Level VCF | fsa000003 | NG00067.v2 | ADSP quality control checked GATK joint called VCF containing 4,789 whole genomes |
| WES CRAMs/GATK gVCFs | fsa000004 | NG00067.v3 | Mapped to GRCh38 |
| WES QC Metrics | fsa000004 | NG00067.v3 | Sequencing Data Quality Control Metrics |
| WES Project Level VCF | fsa000005 | NG00067.v3 | ADSP quality control checked GATK joint called VCF containing 20,504 whole exomes |

View the File Manifest for a full list of files released in this dataset.