The initial phase of the ADSP research plan is called the Discovery Phase. Samples were selected from well-characterized study cohorts of individuals with or without an AD diagnosis and with or without known risk factor genes. As part of the Discovery Phase, the ADSP generated three sets of genome sequence data for these samples: (1) WGS for 584 samples from 113 multiplex families (two or more affected individuals per family), (2) Whole Exome Sequence (WES) for 5,096 AD cases and 4,965 controls, and (3) WES of an Enriched sample set composed of 853 AD cases from multiply affected families and 171 Hispanic controls. The Case-Control and Enriched Case Study spans 24 cohorts provided by the Alzheimer’s Disease Genetics Consortium (ADGC) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.

Sequencing for these samples was conducted at three National Human Genome Research Institute (NHGRI)-funded Large Scale Sequencing and Analysis Centers (LSACs): the Baylor College of Medicine Human Genome Sequencing Center, the Broad Institute, and the McDonnell Genome Institute at Washington University. The samples were sequenced on Illumina HiSeq 2000/2500 platforms with 100 bp paired-end reads. In the ADSP Discovery Case-Control study, 4,586 samples were sequenced using the Illumina Rapid Capture Exome (ICE) kit and 6,343 samples were sequenced using Roche NimbleGen’s VCRome v2.1 target capture kit. BAM files from the GRCh37 build were sent to GCAD for processing on the VCPA1.1 pipeline. A total of 10,634 samples passed sequencing metrics and quality control.