Overview
To access the full dataset, please log into DSS and submit an application.
Within the application, add this dataset (accession NG00169) in the “Choose a Dataset” section.
Once approved, you will be able to log in and access the data within the DARM portal.
The p-value only files are available in the “Open Access Dataset” tab.
Description
For the majority of cases included in the study, the inclusion criterion was a neuropathological diagnosis of Progressive Supranuclear Palsy (PSP; n=2,595); the exception was a small number of cases, both living and deceased, that had only a neurological diagnosis (n=184). PSP subjects with comorbid pathological features of other neurodegenerative disorders, including AD-like features, Lewy bodies, and TDP-43, were not excluded from the study, given the prevalence of these comorbid features. The controls (n=5,584) had no clinical evidence of cognitive impairment or a movement disorder and, neuropathologically, could have only age-related pathological changes. A full list of the institutions where the material was collected can be found in our full-text publication. Many of the samples included here were also part of previous studies (Höglinger et al., 2011; Chen et al., 2018; Sanchez-Contreras et al., 2018).
PSP cases and controls were genotyped at three institutions (University of Pennsylvania, Icahn School of Medicine at Mount Sinai, and the University of California, Los Angeles) on three genotyping platforms (Illumina Human660W, Illumina OmniExpress 2.5, and Illumina Global Screening Array) in 10 batches in total. At each institution, the cases and controls were genotyped, merged, and harmonized to contain the same variants; single nucleotide polymorphism (SNP)-level and sample-level quality control was then performed, followed by imputation. The process was repeated after combining the data from the three centers, and the overlapping variants were again harmonized.
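Reduced to its simplest form, harmonizing genotype sets "to contain the same variants" means keeping only the variants present in every set. The following is a minimal sketch under that assumption, not the study's actual pipeline: the `overlapping_variants` helper and the variant identifiers are hypothetical, and real harmonization must also reconcile strand, allele coding, and genome build, which is not shown here.

```python
def overlapping_variants(batches):
    """Return the variant IDs shared by every batch (hypothetical helper)."""
    batches = list(batches)
    if not batches:
        return set()
    shared = set(batches[0])
    for batch in batches[1:]:
        # Keep only the IDs that also appear in this batch.
        shared &= set(batch)
    return shared


# Toy variant IDs, invented for illustration only.
human660w = {"chr1:1000:A:G", "chr1:2000:C:T", "chr2:3000:G:A"}
omniexpress = {"chr1:1000:A:G", "chr2:3000:G:A", "chr3:4000:T:C"}
gsa = {"chr1:1000:A:G", "chr2:3000:G:A"}

shared = overlapping_variants([human660w, omniexpress, gsa])
```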
PLINK v1.9 was used to perform quality control. SNP exclusion criteria were a minor allele frequency < 1%, a genotyping call rate < 95%, and a Hardy–Weinberg threshold of 1 × 10⁻⁶. Individuals with discordant sex, non-European ancestry, genotyping failure of > 5%, or relatedness of > 0.1 were excluded. A principal component analysis (PCA) was performed to identify population substructure using EIGENSTRAT v6.1.4 and the 1000 Genomes reference panel. Samples were excluded if they were six standard deviations away from the European population cluster. Each dataset was imputed on the Trans-Omics for Precision Medicine (TOPMed) Imputation Server (TIS) using the multi-ancestry release 5 (R5) reference panel, which includes data from 97,256 participants with 308,107,085 SNPs observed on 194,512 haplotypes.
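The thresholds above can be read as simple keep/exclude rules. The sketch below illustrates them in plain Python; it is not the PLINK or EIGENSTRAT workflow used in the study. The names `VariantQC`, `passes_snp_qc`, and `is_ancestry_outlier` are hypothetical. Two assumptions are made: variants are excluded when their Hardy–Weinberg P-value falls below the threshold, and "six standard deviations away" is evaluated separately on each principal component against the mean and standard deviation of the reference cluster.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class VariantQC:
    maf: float        # minor allele frequency
    call_rate: float  # fraction of samples with a genotype call
    hwe_p: float      # Hardy-Weinberg equilibrium test P-value


def passes_snp_qc(v, min_maf=0.01, min_call_rate=0.95, hwe_threshold=1e-6):
    """Keep a variant only if it meets all three SNP-level criteria."""
    return (
        v.maf >= min_maf
        and v.call_rate >= min_call_rate
        and v.hwe_p >= hwe_threshold  # assumption: exclude P below threshold
    )


def is_ancestry_outlier(sample_pcs, reference_pcs, max_sd=6.0):
    """Flag a sample lying more than max_sd SDs from the reference
    cluster mean on any principal component (assumed per-PC rule)."""
    for k, value in enumerate(sample_pcs):
        ref = [pcs[k] for pcs in reference_pcs]
        center, spread = mean(ref), stdev(ref)
        if abs(value - center) > max_sd * spread:
            return True
    return False


# Toy reference cluster (PC1, PC2), invented for illustration only.
european_cluster = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0), (0.5, -0.5), (-0.5, 0.5)]
```

With the toy cluster above, a sample at `(10.0, 0.0)` would be flagged as an outlier, while a sample at `(0.2, 0.3)` would be retained.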
Phasing was performed on the TIS using EAGLE with subsequent imputation using Minimac. Imputed variants were filtered using a conservative quality threshold, R² ≥ 0.8, to ensure high-quality variants, and additional filtering on variants overlapping all genotype sets with MAF > 0.01 was performed prior to analysis. Single-variant genome-wide association analyses were performed jointly on all imputed datasets using a score-based logistic regression under an additive model with covariate adjustment for sex, the first three PC eigenvectors for population substructure, and indicator variables for genotyping platform to mitigate potential batch effects. All association analyses were performed using the program SNPTEST. After analysis, variants with a regression coefficient of |β| > 5 and any erroneous estimates (negative standard errors or P-values equal to 0 or 1) were excluded from further analysis.
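The pre-analysis and post-analysis filters described above can be expressed as two predicates. This is a minimal sketch assuming each variant carries an imputation R², a MAF, an effect estimate, a standard error, and a P-value. The function names and the example rows are hypothetical and do not reflect the column layout of the released files.

```python
def passes_imputation_filter(r2, maf, min_r2=0.8, min_maf=0.01):
    """Pre-analysis filter: imputation quality R^2 >= 0.8 and MAF > 0.01."""
    return r2 >= min_r2 and maf > min_maf


def keep_association_result(beta, se, p_value, max_abs_beta=5.0):
    """Post-analysis filter: drop |beta| > 5, negative standard errors,
    and P-values exactly equal to 0 or 1."""
    if abs(beta) > max_abs_beta:
        return False
    if se < 0:
        return False
    if p_value == 0.0 or p_value == 1.0:
        return False
    return True


# Toy results, invented for illustration only.
results = [
    {"id": "var1", "beta": 0.12, "se": 0.03, "p": 6.3e-5},
    {"id": "var2", "beta": 7.50, "se": 2.10, "p": 3.5e-4},   # |beta| > 5
    {"id": "var3", "beta": 0.40, "se": -0.10, "p": 1.0e-2},  # negative SE
    {"id": "var4", "beta": 0.00, "se": 0.05, "p": 1.0},      # P equal to 1
]

kept = [r["id"] for r in results
        if keep_association_result(r["beta"], r["se"], r["p"])]
```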
Available Filesets
Name | Accession | Latest Release | Description |
---|---|---|---|
PSP Summary Statistics - 2024: Full Summary Statistics (application needed) | fsa000111 | NG00169.v1 | Full Summary Statistics |
PSP Summary Statistics - 2024: P-values only (open access) | fsa000112 | NG00169.v1 | p-values Only |
View the File Manifest for a full list of files released in this dataset.