Round 5 of the Alzheimer’s Disease Sequencing Project (NG00067v14) has been released! This release adds 22,158 new samples and 17 new cohorts, bringing the R5 total to 58,507 genomes from 57 cohorts.

Files include the following:

  • Sequencing read alignments in CRAM (compressed BAM) format and Genomic Variant Call Format (gVCF) files generated by GATK4.1.1 as part of the VCPAv1.1 pipeline.
  • Structural variant call VCF (SV VCF) files generated by Manta and Smoove.

GCAD has implemented a new joint-genotype caller, GLnexus, and is excited to release the first project-level joint-genotype calls for chromosomes 1-22, X, Y, and M for the ADSP in preview project-level VCF (pVCF) format. Users should check the dataset README to see how variant representation differs between GATK and GLnexus.

Existing users with approval will have access to all the new genomes, as no new consent levels were required for the R5 release. This release contains only the pVCF with all consent levels. The preview pVCFs split by consent level, as well as the compact and compact filtered pVCFs, are expected to be released in December 2024.

Additionally, in NG00067v14 users will find:

  1. R4 WGS LD Reference Panel
  2. QC’d pVCF in GDS format split by individual consent levels
  3. Phenotypes for the participants with new genomes

To apply for access to NG00067, visit the ADSP umbrella dataset page.