The pQTL summary statistics are available in the “Open Access Dataset” tab.
To access the proteomic data, please log into DSS and submit an application.
Within the application, add this dataset (accession NG00102) in the “Choose a Dataset” section.
Once approved, you will be able to log in and access the data within the DARM portal.


Understanding the tissue-specific genetic controls of protein levels is essential to uncover mechanisms of post-transcriptional gene regulation. We previously generated a genomic atlas of protein levels in three tissues relevant to neurological disorders (brain, cerebrospinal fluid and plasma) by profiling thousands of proteins from participants with and without Alzheimer’s disease. We have now extended this work by analyzing more proteins (1,300 versus 1,079) and almost twice as many high-quality imputed genetic variants (8.4 million versus 4.4 million), obtained by using the TOPMed reference panel. We identified 38 genomic regions associated with 43 proteins in brain, 150 regions associated with 247 proteins in cerebrospinal fluid, and 95 regions associated with 145 proteins in plasma. Compared to our previous study, this study newly identified 12 loci in brain, 30 loci in cerebrospinal fluid, and 22 loci in plasma. cis-pQTLs were more likely to be tissue shared, but trans-pQTLs tended to be tissue specific. Between 48.0% and 76.6% of pQTLs did not co-localize with expression, splicing, DNA methylation or histone acetylation QTLs. Using Mendelian randomization, we nominated proteins implicated in neurological diseases, including Alzheimer’s disease, Parkinson’s disease and stroke. This first multi-tissue study will be instrumental to map signals from genome-wide association studies onto functional genes, to discover pathways and to identify drug targets for neurological diseases.
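
The headline comparisons in the summary above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the numbers are copied from the text, the variable names are made up for this example, and the "share of newly identified loci" assumes that the "loci" and "genomic regions" mentioned in the summary are counted in the same units, which the text does not state explicitly.

```python
# Figures quoted in the summary above (previous study vs. this release).
proteins_prev, proteins_now = 1079, 1300
variants_prev, variants_now = 4.4e6, 8.4e6

# Fold change in proteins analyzed and in imputed variants.
protein_ratio = proteins_now / proteins_prev
variant_ratio = variants_now / variants_prev

# Associated regions and newly identified loci per tissue, as quoted.
regions = {"brain": 38, "CSF": 150, "plasma": 95}
new_loci = {"brain": 12, "CSF": 30, "plasma": 22}

# ASSUMPTION: loci and regions are the same unit, so the ratio is a share.
new_fraction = {tissue: new_loci[tissue] / regions[tissue] for tissue in regions}

print(f"proteins: {protein_ratio:.2f}x, variants: {variant_ratio:.2f}x")
for tissue, share in new_fraction.items():
    print(f"{tissue}: {share:.1%} of regions newly identified")
```

The variant ratio comes out just above 1.9, which matches the "almost twice as many" wording.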

This dataset is part of the Knight ADRC Collection. Other datasets in this collection can be found at:

Sample Summary per Data Type

Available Filesets

Name | Accession | Latest Release | Description
KGAD Proteomics: pQTL summary statistics and protein annotations (open access) | fsa000065 | NG00102.v1 | pQTL summary statistics and protein annotations
KGAD Proteomics: Proteomics, protein annotations, and QC documents (application needed) | fsa000066 | NG00102.v1 | Proteomics, protein annotations, and QC documents

View the File Manifest for a full list of files released in this dataset.

This dataset provides multi-tissue proteomic data that were quality controlled by the Cruchaga Lab at Washington University in St. Louis, together with pQTL summary statistics. Across 1,157 subjects, 1,300 protein analytes were measured in 328 brain samples, 869 protein analytes in 770 CSF samples, and 953 protein analytes in 500 plasma samples. All measurements were made on the SomaLogic SomaScan 1.3K platform at the Washington University Neurogenomics and Informatics Center.
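
For planning storage or analysis, the per-tissue counts in the description above can be tabulated as follows. This is a minimal sketch, not part of the released files: the dictionary layout and variable names are hypothetical, and the measurement counts are upper bounds that assume every listed analyte was measured in every sample of that tissue.

```python
# Per-tissue sample and analyte counts, copied from the dataset description.
tissues = {
    "brain": {"samples": 328, "analytes": 1300},
    "CSF": {"samples": 770, "analytes": 869},
    "plasma": {"samples": 500, "analytes": 953},
}
n_subjects = 1157

# Total samples across the three tissues.
total_samples = sum(t["samples"] for t in tissues.values())

# Samples beyond one per subject; a positive value means that some
# subjects must have contributed more than one tissue.
extra_samples = total_samples - n_subjects

# ASSUMPTION: complete sample-by-analyte matrices, so these are upper bounds.
max_measurements = {
    name: t["samples"] * t["analytes"] for name, t in tissues.items()
}

print(f"{total_samples} samples from {n_subjects} subjects "
      f"({extra_samples} more samples than subjects)")
for name, count in max_measurements.items():
    print(f"{name}: up to {count:,} sample-analyte measurements")
```

Because the three tissues together hold more samples than there are subjects, analyses that combine tissues need to account for repeated individuals.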

Consent Level | Number of Subjects

Visit the Data Use Limitations page for definitions of the consent levels above.

Total number of approved DARs: 1
  • Investigator:
    Cruchaga, Carlos
    Washington University School of Medicine
    Project Title:
    The Familial Alzheimer Sequencing (FASe) Project
    Date of Approval:
    May 9, 2024
    Request status:
    Research use statements:
    Technical Research Use Statement:
    The goal of this study is to identify new genes and mutations that cause or increase risk for Alzheimer disease (AD), as well as protective factors. Individuals and families were selected from the Knight-ADRC (Washington University) and the NIA-LOAD study. Only families with at least three first-degree affected individuals were included. Families with pathogenic variants in the known AD or FTD genes, or in which APOE4 segregated with disease, were excluded. At least two cases and one control were selected per family. Cases had an age at onset (AAO) after 65 years, and controls had an age at last assessment later than the latest AAO within the family. Whole exome sequencing (WES) and whole genome sequencing (WGS) data were generated for 1,235 individuals (285 families), which, together with data from our collaborators and the ADSP family-based cohort (3,449 individuals and 757 families), will provide enough statistical power to identify new genes for AD. Dr. Tanzi (Harvard Medical School) will provide WGS from 400 families from the NIMH Alzheimer disease genetics initiative study. We will perform single variant and gene-based analyses to identify genes and variants that increase risk for disease in AD families. Single variant analysis will consist of a combination of association and segregation analyses. We will run family-based gene-based methods to identify genes that show an overall enrichment of variants in AD cases. We will also look for protective and modifier variants. To do this, we will identify families loaded with AD cases that also include individuals with a high burden of known risk variants who do not develop the disease (escapees). We will use the sequence data and the family structure to identify variants that segregate with the escapee phenotype. The most promising variants and genes will be replicated in independent datasets (ADSP case-control, ADNI, Knight-ADRC, NIA-LOAD). We will perform single variant and gene-based analyses to replicate the initial findings, and survival analysis to replicate the protective variants. We will select the most promising variants/genes for functional studies.
    Non-Technical Research Use Statement:
    Family-based approaches led to the identification of disease-causing Alzheimer’s Disease (AD) variants in the genes encoding APP, PSEN1 and PSEN2. The identification of these genes led to the Aβ-cascade hypothesis and to the development of drugs that target this pathway. Recently, we have identified rare coding variants in TREM2, ABCA7, PLD3 and SORL1 with large effect sizes for risk for AD, confirming that rare coding variants play a role in the etiology of AD. In this proposal, we will identify rare risk and protective alleles using sequence data from families densely affected by AD. We hypothesize that these families are enriched for genetic risk factors. We already have sequence data from 695 families (2,462 individuals), that combined with the ADSP and the NIMH dataset will lead to a dataset of more than 1,042 families (4,684 individuals). Our preliminary results support the flexibility of this approach and strongly suggest that protective and risk variants with large effect size will be found, which will lead to a better understanding of the biology of the disease.

Acknowledgment statement for any data distributed by NIAGADS:

Data for this study were prepared, archived, and distributed by the National Institute on Aging Alzheimer’s Disease Data Storage Site (NIAGADS) at the University of Pennsylvania (U24-AG041689), funded by the National Institute on Aging.

Use the study-specific acknowledgement statements below (as applicable):

For investigators using any data from this dataset:

Please cite/reference the use of NIAGADS data by including the accession NG00102.

For investigators using Charles F. and Joanne Knight Alzheimer’s Disease Research Center (sa000008) data:

This work was supported by grants from the National Institutes of Health (R01AG044546, P01AG003991, RF1AG053303, R01AG058501, U01AG058922, RF1AG058501 and R01AG057777). The recruitment and clinical characterization of research participants at Washington University were supported by NIH P50 AG05681, P01 AG03991, and P01 AG026276. This work was supported by access to equipment made possible by the Hope Center for Neurological Disorders, and the Departments of Neurology and Psychiatry at Washington University School of Medicine.

We thank the contributors who collected samples used in this study, as well as patients and their families, whose help and participation made this work possible.

For use of the ADSP-PHC harmonized phenotypes deposited within dataset, ng00067, use the following statement:

The Memory and Aging Project at the Knight-ADRC (Knight-ADRC), supported by NIH grants R01AG064614, R01AG044546, RF1AG053303, RF1AG058501, U01AG058922 and R01AG064877 to Carlos Cruchaga. The recruitment and clinical characterization of research participants at Washington University was supported by NIH grants P30AG066444, P01AG03991, and P01AG026276. Data collection and sharing for this project was supported by NIH grants RF1AG054080, P30AG066462, R01AG064614 and U01AG052410. This work was supported by access to equipment made possible by the Hope Center for Neurological Disorders, the Neurogenomics and Informatics Center (NGI), and the Departments of Neurology and Psychiatry at Washington University School of Medicine.

Yang C, et al. Genomic atlas of the proteome from brain, CSF and plasma prioritizes proteins implicated in neurological disorders. Nat Neurosci. 2021 Sep;24(9):1302-1312. doi: 10.1038/s41593-021-00886-6. PMID: 34239129; PMCID: PMC8521603.