NG00105 - MiGA – Microglia Genomic Atlas

The summary statistics are available in the “Open Access Dataset” tab.
To access the full data, please log into DSS and submit an application.
Within the application, add this dataset (accession NG00105) in the “Choose a Dataset” section.
Once approved, you will be able to log in and access the data within the DARM portal.

Description

Microglia Genomic Atlas – MiGA (NG00105.v1):
Post-mortem brain samples were obtained from the Netherlands Brain Bank (NBB). The permission to collect human brain material was obtained from the Ethical Committee of the VU University Medical Center, Amsterdam, The Netherlands. Informed consent for autopsy, the use of brain tissue and accompanied clinical information for research purposes was obtained per donor ante-mortem. Neuropathological assessments have been performed by the NBB. Detailed information per donor, including tissue type, age, sex, postmortem interval, pH of cerebrospinal fluid, cause of death, diagnosis, use of medication and neuropathological information is provided.

DNA Genotyping
Samples were genotyped using the Illumina Infinium Global Screening Array (GSA). Genotype imputation was performed for those 90 donors through the Michigan Imputation Server v1.4.1 (Minimac 4) using the 1000 Genomes (Phase 3) v5 (GRCh37) European panel and Eagle v2.4 phasing in quality control and imputation mode with rsq filter set to 0.3. Following imputation, variants were lifted over to the GRCh38 reference to match the RNA-seq data using Picard liftoverVCF and the “b37ToHg38.over.chain.gz” liftover chain file.

RNA extraction and sequencing
RNA was isolated using RNeasy Mini kit (Qiagen) adding the DNase I optional step or as described in detail before (Melief J, et al., 2016). Library preparation was performed at Genewiz using the Ultra-low input system which uses Poly-A selection. SMART-Seq v4 Ultra Low Input RNA Kit was used for library construction using 100 ng of RNA. The libraries were sequenced as 150 bp on fragments with an average read depth of 29 million (ranging from 14-82M) read pairs on the Illumina HiSeq 2500.

RNA-seq data processing
RNA-seq data was processed using the RAPiD pipeline (Wang YC, et al., 2015). RAPiD aligns samples to the hg38 genome build with STAR (Dobin A, et al., 2013), using the GENCODE v30 transcriptome reference, and calculates quality control metrics with Picard. RNA-seq quality control applied four filters to remove samples: 1) samples with fewer than 10M reads aligned by STAR; 2) samples with more than 20% of the reads aligned to ribosomal regions; 3) samples with less than 10% of the reads mapping to coding regions; 4) samples from brain regions with fewer than 20 donors. Estimated transcript abundance was obtained using RSEM (Li B and Dewey CN, 2011) and transcripts were summed to the gene level with tximport (Love MI, et al., 2017). Genes with more than 1 read count per million (CPM) in 30% of the samples were kept for downstream analysis. Gene level read counts were normalized as transcripts per million mapped reads (TPM) to adjust for sequencing library size differences.
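The sample-level and gene-level filters described above can be sketched as follows. This is a minimal illustration, not the RAPiD pipeline's code: the SampleQC record, its field names, and both function names are hypothetical. It also assumes that the donor count per region is taken after the read-based filters, and that "in 30% of the samples" means "in at least 30%".

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    # Hypothetical per-sample QC record; field names are illustrative only.
    sample_id: str
    region: str
    donor_id: str
    aligned_reads: int    # reads aligned by STAR
    pct_ribosomal: float  # % of reads aligned to ribosomal regions
    pct_coding: float     # % of reads mapping to coding regions


def filter_samples(samples, min_reads=10_000_000, max_ribo=20.0,
                   min_coding=10.0, min_donors_per_region=20):
    """Apply the three read-based filters, then drop sparsely sampled regions."""
    kept = [s for s in samples
            if s.aligned_reads >= min_reads
            and s.pct_ribosomal <= max_ribo
            and s.pct_coding >= min_coding]
    # Assumption: donors are counted per region after the read-based filters.
    donors = {}
    for s in kept:
        donors.setdefault(s.region, set()).add(s.donor_id)
    return [s for s in kept if len(donors[s.region]) >= min_donors_per_region]


def expressed_genes(counts_by_gene, min_cpm=1.0, min_fraction=0.30):
    """Keep genes with CPM > min_cpm in at least min_fraction of samples."""
    genes = list(counts_by_gene)
    n_samples = len(counts_by_gene[genes[0]])
    # Library size of a sample = total counts over all genes in that sample.
    lib_sizes = [sum(counts_by_gene[g][j] for g in genes)
                 for j in range(n_samples)]
    kept = []
    for g in genes:
        n_expressed = sum(
            1 for j in range(n_samples)
            if lib_sizes[j] > 0
            and counts_by_gene[g][j] * 1e6 / lib_sizes[j] > min_cpm)
        if n_expressed >= min_fraction * n_samples:
            kept.append(g)
    return kept
```

The thresholds are keyword arguments, so the same sketch can be reused with other cut-offs.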

Quantitative Trait Loci mapping
To perform expression QTL (eQTL) mapping, we followed the latest pipeline created by the GTEx consortium (Aguet et al. 2019). For this analysis we used a normalization and filtering procedure separate from the one used in the preceding analyses. Gene expression matrices were created from the RSEM output using tximport (Love, Soneson, and Robinson 2017). Matrices were then converted to GCT format, TMM normalized, and filtered for lowly expressed genes, keeping only genes with at least 0.1 TPM in at least 20% of samples and at least 6 counts in at least 20% of samples. Each gene was then inverse-normal transformed across samples. After filtering, we tested a total of 18,430 genes. Then, PEER (Stegle et al. 2012) factors were calculated to estimate hidden confounders within our expression data. We created a combined covariate matrix that included the PEER factors and the first 4 genotyping ancestry MDS values as input to the analysis. We tested numbers of PEER factors from 0 to 20 and found that between 5 and 10 factors produced the largest number of eGenes in each region.
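The expression filter and the per-gene inverse-normal transform can be illustrated with a short standard-library sketch. It is not the GTEx pipeline code. It assumes the filter keeps genes that reach at least 0.1 TPM and at least 6 counts in at least 20% of samples, that tied values receive their average rank, and that ranks are mapped to quantiles as rank / (n + 1). All function names are hypothetical.

```python
from statistics import NormalDist


def passes_expression_thresholds(tpm, counts, min_tpm=0.1, min_counts=6,
                                 min_fraction=0.2):
    """True if one gene meets both thresholds in enough samples."""
    n = len(tpm)
    enough_tpm = sum(v >= min_tpm for v in tpm) >= min_fraction * n
    enough_counts = sum(c >= min_counts for c in counts) >= min_fraction * n
    return enough_tpm and enough_counts


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def inverse_normal_transform(values):
    """Map one gene's expression across samples to standard-normal quantiles."""
    n = len(values)
    std_normal = NormalDist()
    # Assumption: quantile = rank / (n + 1), which keeps values inside (0, 1).
    return [std_normal.inv_cdf(r / (n + 1)) for r in average_ranks(values)]
```

The transform depends only on the rank order of the samples, so outlying expression values cannot dominate the regression.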
To test for cis-eQTLs, linear regression was performed using the tensorQTL (Taylor-Weiner et al. 2019) cis_nominal mode for each SNP-gene pair, testing SNPs within 1 megabase of the transcription start site (TSS) of a gene. To test for association between gene expression and the top variant in cis, we used the tensorQTL cis permutation pass per gene with 1000 permutations. To identify eGenes, we performed q-value correction of the permutation P-values for the top association per gene (Storey 2003) at a threshold of 0.05.
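As a rough illustration of the eGene call, the sketch below turns per-gene permutation P-values into q-values and applies the 0.05 threshold. It is a simplification: it fixes the null proportion pi0 at 1, which makes the result equal to Benjamini-Hochberg adjusted P-values, whereas the Storey (2003) method estimates pi0 from the data. The function names are hypothetical.

```python
def bh_qvalues(pvalues):
    """Step-up adjusted P-values (Storey q-values with pi0 fixed at 1)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


def call_egenes(perm_pvalues, threshold=0.05):
    """perm_pvalues: dict mapping gene ID to its permutation P-value."""
    genes = list(perm_pvalues)
    qvals = bh_qvalues([perm_pvalues[g] for g in genes])
    return [g for g, qv in zip(genes, qvals) if qv <= threshold]
```

Because pi0 is fixed at 1, this version is conservative compared with a data-driven pi0 estimate and can only call the same number of eGenes or fewer.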
We performed splicing quantitative trait loci (sQTL) analysis using the splice junction read counts generated by regtools (Feng et al. 2018). Junctions were clustered using Leafcutter (Li et al. 2018), specifying for each junction in a cluster a maximum length of 100kb. Following the GTEx pipeline, introns without read counts in at least 50% of samples or with fewer than 10 read counts in at least 10% of samples were removed. Introns with insufficient variability across samples were removed. Filtered counts were then quantile normalized using prepare_phenotype_table.py from Leafcutter, merged, and converted to BED format, using the coordinates from the middle of the intron cluster. We created a combined covariate matrix that included the PEER factors and the first 4 genotyping ancestry MDS values as input to the analysis. We mapped sQTLs with between 0 and 20 PEER factors as covariates in our QTL model and determined 5 to be optimal in MFG, STG and THA. 0 PEER factors were used for SVZ.

To test for cis sQTLs, linear regression was performed using the tensorQTL nominal pass for each SNP-junction pair using a 100kb window from the center of each intron cluster. Although junctions were initially grouped together into clusters, we tested each SNP-junction pair separately, which is the standard approach (Li et al. 2018; Aguet et al. 2019). To test for association between intronic ratio and the top variant in cis we used tensorQTL permutation pass, grouping junctions by their cluster using –grp option. To identify significant clusters, we performed q-value correction using a threshold of 0.05.
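The windowing and cluster grouping described above can be pictured with a small sketch. It only illustrates the bookkeeping (which variants fall in a cluster's cis window, and which junction carries the strongest nominal signal within each cluster). It is not tensorQTL's implementation, and the function names and input layout are hypothetical.

```python
def in_cis_window(variant_pos, cluster_center, window=100_000):
    """True if a variant lies within `window` bp of the intron cluster center."""
    return abs(variant_pos - cluster_center) <= window


def top_junction_per_cluster(nominal_pvalues, cluster_of):
    """Pick the junction with the smallest nominal P-value in each cluster.

    nominal_pvalues: dict junction ID -> best nominal P-value for that junction
    cluster_of:      dict junction ID -> cluster ID
    """
    best = {}
    for junction, p in nominal_pvalues.items():
        cluster = cluster_of[junction]
        if cluster not in best or p < best[cluster][1]:
            best[cluster] = (junction, p)
    return best
```

Grouping this way yields one top association per cluster, so the multiple-testing correction is applied across clusters, not across individual junctions.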

Microglia Genomic Atlas after Stimulation– MiGASTi (NG00105.v2):

Microglia Isolation
Brain tissue was stored in Hibernate media (Gibco) at 4 °C until processing within 24 hours after autopsy. Microglia were isolated from six regions, including medial frontal gyrus (MFG; 43 samples), superior frontal gyrus (SFG; 1 sample), superior temporal gyrus (STG; 34 samples), thalamus (THA; 24 samples), subventricular zone (SVZ; 32 samples) and corpus callosum (CC; 16 samples). In brief, brain tissue was first mechanically dissociated through a metal sieve in a glucose-potassium-sodium buffer (GKN-BSA; 8.0 g/L NaCl, 0.4 g/L KCl, 1.77 g/L Na2HPO4.2H2O, 0.69 g/L NaH2PO4.H2O, 2.0 g/L D-(+)-glucose, 0.3% bovine serum albumin (BSA, Merck, Darmstadt, Germany); pH 7.4) and supplemented with collagenase Type I (3700 units/mL; Worthington, USA) and DNase I (200 µg/mL; Roche, Switzerland) or 2% of Trypsin (Invitrogen) at 37 °C for 30 min or 60 min while shaking. The suspension was put over a 100 µm cell strainer and washed with GKN-BSA buffer in the centrifuge (1800 rpm, slow brake, 4 °C, 10 min) before the pellet was resuspended in 20 mL GKN-BSA buffer. 10 mL of Percoll (Merck, Darmstadt, Germany) was added dropwise and the tissue homogenate was centrifuged at 4000 rpm (fast acceleration, slow brake at 4 °C, 30 min). The middle layer was collected and washed with GKN-BSA buffer, followed by resuspension and centrifugation in a magnetic-activated cell sorting (MACS) buffer (PBS, 1% heat-inactivated fetal cow serum (FCS), 2 mM EDTA; 1500 rpm, 10 °C, 10 min). Microglia were positively selected with CD11b-conjugated magnetic microbeads (Miltenyi Biotec, Germany) according to the manufacturer’s protocol. This protocol has been validated and the resulting cell viability was between 70 and 98%.

Culturing and stimulation of microglial cells
Microglia were cultured in a poly-L-lysine (PLL; Merck, Germany) coated 96-well flat-bottom plate (Greiner Bio-One, Austria) at a density of 1.0 × 10^5 cells in a total volume of 200 μL Roswell Park Memorial Institute medium (RPMI; Gibco Life Technologies, USA) supplemented with 10% FCS, 2 mM L-glutamine (Gibco Life Technologies, USA), 1% penicillin–streptomycin (Gibco Life Technologies, USA) and 100 ng/ml IL-34 (Miltenyi Biotec, Germany). After overnight incubation, primary microglia (pMG) were stimulated with 100 ng/mL lipopolysaccharide (LPS) from Escherichia coli O111:B4 (Merck, Germany), or with 50 ng/ml interferon gamma (IFN-γ; Peprotech, London, UK), Resiquimod (R848), 50 ng/ml tumor necrosis factor alpha (Sigma-Aldrich, The Netherlands), 40 ng/ml IL-4 (Peprotech, London, UK), 1000 nM dexamethasone (Sigma-Aldrich, The Netherlands), or 1 mM ATP (Sigma-Aldrich, The Netherlands) for 6 h. The concentrations and incubation times for LPS and IFN-γ are based on dose-response curves, in which 6 h showed the most robust inflammatory effects. We added the other stimulations as comparison. The cells were harvested with TRIzol reagent and stored at −80 °C for further analysis.

RNA extraction and sequencing protocol
Microglial RNA was isolated using the TRIzol method and cDNA libraries were generated by Genewiz (Azenta) using the Ultra-low input system which uses Poly-A selection. SMART-Seq v4 Ultra Low Input RNA Kit was used for library construction using 100 ng of RNA. The libraries were sequenced as 150 bp on fragments with an average read depth of 28 million (ranging from 0.06-128M) read pairs on the Illumina HiSeq 2500.

RNA-seq data processing
RNA-seq data was processed using the RAPiD pipeline. RAPiD aligns samples to the hg38 genome build with STAR, using the GENCODE v30 transcriptome reference, and calculates quality control metrics with Picard. RNA-seq quality control applied four filters to remove samples: 1) samples with fewer than 1M reads aligned by STAR; 2) samples with more than 20% of the reads aligned to ribosomal regions; 3) samples with less than 5% of the reads mapping to mRNA; 4) samples with a high chance of RNA degradation based on the gene coverage plot. Estimated transcript abundance was obtained using RSEM and transcripts were summed to the gene level with tximport. Genes with more than 1 read count per million (CPM) in 50% of the samples were kept for downstream analysis. Gene level read counts were normalized as transcripts per million mapped reads (TPM) to adjust for sequencing library size differences. We used several approaches to test the quality of the included samples: 1) the percentage of mitochondrial genes was only 0.87% on average; 2) the expression of several apoptotic markers, such as CASP3 and BTG1, was low across all stimulations; 3) the expression levels of several well-known specific stimulation markers showed up- and downregulation patterns as expected.
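One way to compute a mitochondrial percentage like the one quoted above is sketched below. The text does not say exactly how the figure was derived, so this assumes it is the share of a sample's gene-level read counts assigned to mitochondrially encoded genes, identified here by the "MT-" gene-symbol prefix. The function name is hypothetical.

```python
def percent_mitochondrial(counts_by_gene, prefix="MT-"):
    """Percentage of one sample's counts that fall in mitochondrial genes.

    counts_by_gene: dict mapping gene symbol -> read count for a single sample.
    Assumption: mitochondrially encoded genes carry the "MT-" symbol prefix.
    """
    total = sum(counts_by_gene.values())
    if total == 0:
        return 0.0
    mito = sum(count for gene, count in counts_by_gene.items()
               if gene.startswith(prefix))
    return 100.0 * mito / total
```

Averaging this value over all samples gives a single summary figure comparable to the one reported.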

Sample Summary per Data Type

Sample Set	Accession	Data Type	Number of Samples
MiGA – Microglia Genomic Atlas	snd10022	1000Genomes Imputed GWAS	n = 90
MiGA – Microglia Genomic Atlas	snd10022	Bulk RNA Sequencing	n = 255
MiGASTi – Microglia Genomic Atlas after Stimulation	snd10082	Short-Read RNA Sequencing	n = 533

Available Filesets

Name	Accession	Latest Release	Description
MiGA – Microglia Genomic Atlas – GWAS Data	fsa000008	NG00105.v1	1000Genomes Imputed GWAS
MiGA – Microglia Genomic Atlas – QTL Summary Statistics (open access)	fsa000009	NG00105.v1	QTL Summary Statistics
MiGA – Microglia Genomic Atlas – RNASeq Data	fsa000010	NG00105.v1	RNASeq BAM files
MiGASTi – Microglia Genomic Atlas after Stimulation – Short-Read RNASeq Data	fsa000105	NG00105.v2	Short-Read RNA FASTQ files

View the File Manifest for a full list of files released in this dataset.

The Microglia Genomic Atlas (MiGA) is a genetic and transcriptomic resource comprised of 255 primary human microglia samples isolated ex vivo from four different brain regions of 100 human subjects with neurodegenerative, neurological, or neuropsychiatric disorders, as well as unaffected controls. We performed systematic analyses to investigate sources of microglial heterogeneity, including brain region, age, and sex. We further performed expression and splicing QTL analyses in each region and performed a meta-analysis across the four regions to increase our discovery power. We then performed colocalization and used fine-mapping and microglia-specific epigenomic data to prioritize genes and variants that influence neurological disease susceptibility through gene expression and splicing in microglia. With this approach, we have built the most comprehensive resource to date of cis genetic effects on the microglial transcriptome and propose underlying molecular mechanisms of potentially causal functional variants in several brain disorders.

Human post-mortem brain samples were obtained from the Netherlands Brain Bank (NBB) and the Neuropathology Brain Bank and Research CoRE at Mount Sinai Hospital. The permission to collect human brain material was obtained from the Ethical Committee of the VU University Medical Center, Amsterdam, The Netherlands, and the Mount Sinai Institutional Review Board. For the Netherlands Brain Bank, informed consent for autopsy, the use of brain tissue and accompanied clinical information for research purposes was obtained per donor ante-mortem.

Sample Set	Accession Number	Number of Subjects	Number of Samples
MiGA – Microglia Genomic Atlas	snd10022	108	345
MiGASTi – Microglia Genomic Atlas after Stimulation	snd10082	89	533

Consent Level	Number of Subjects
GRU-IRB-PUB	n = 131

Total number of approved DARs: 28

Investigator:
Belloy, Michael
Institution:
Washington University in St Louis
Project Title:
Elucidating sex-specific risk for Alzheimer's disease through state-of-the-art genetics and multi-omics
Date of Approval:
January 6, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
• Objectives: In this project, we seek to holistically investigate the genetic and molecular drivers of sex dimorphism in Alzheimer’s disease across ancestries. • Study design: This study integrates large-scale population genetics with multi-omics and endophenotype analyses. We are integrating all data available from ADGC and ADSP, together with other data from AMP-AD and biobanks such as UKB, FinnGen, and MVP to conduct large-scale multi-ancestry GWAS, rare-variant gene aggregation analyses, QTL studies, PWAS, TWAS, etc. We also particularly focus on X chromosome association studies. The study design also interrogates interactions with ancestry, hormone exposures, and with APOE*4, as well as comparisons to non-stratified GWAS/XWAS of Alzheimer’s disease. Further, we will also employ genetic correlation analyses, Mendelian randomization, colocalization, and pleiotropy analyses, to interrogate overlap with other complex traits to better understand the mechanisms underlying sex dimorphism in Alzheimer’s disease. • Analysis plan, including the phenotypic characteristics that will be evaluated in association with genetic variants: Our phenotypes will include Alzheimer’s disease risk, conversion risk, various endophenotypes (including amyloid/tau biomarkers, brain imaging metrics, etc.) as well as molecular traits. As noted above, we will conduct large-scale multi-ancestry GWAS, XWAS, rare-variant gene aggregation analyses, QTL studies, PWAS, TWAS, etc. Specific aims include interrogating these questions and analyses on (1) the autosomes, (2) the X chromosome, and (3) leveraging sex-stratified QTL studies to drive discovery of risk genes.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) manifests itself differently across men and women, but the genetic and molecular factors that drive this remain elusive. AD is the most common cause of dementia and to this day remains largely untreatable. It is thus crucial to study the genetics of AD in a sex-specific manner, as this will help the field gain important insights into disease pathophysiology, identify novel sex-specific risk factors relevant to personalized genetic medicine, and uncover potential new AD drug targets that may benefit both sexes. This project uses large-scale genomics and multi-omics to elucidate novel sex-agnostic and sex-specific AD risk genes. We will interrogate sex dimorphism for AD risk on the autosomes and the sex chromosomes. We similarly interrogate sex dimorphism in the genetic regulation of gene expression and protein levels, which we will integrate with genetic risk for Alzheimer’s disease to further discover risk genes. Throughout, we will also interrogate how sex-specific risk for AD interacts with hormone exposures, ancestry, and the APOE*4 risk allele.
Investigator:
Black, Mary Helen
Institution:
JOHNSON/JOHNSON/PHARM/RES/ DEVELOPMENT
Project Title:
Target identification and validation in Alzheimer’s Disease with Whole-Genome and Whole-Exome Sequence Data
Date of Approval:
April 18, 2022
Request status:
Closed
Research use statements:
Technical Research Use Statement:
Alzheimer’s disease (AD) is a common, progressive, neurodegenerative disorder with a strong genetic component with heritability estimates ranging from 58–79% for late-onset AD and over 90% for early onset AD. Genetic association studies are important to highlight key biological mechanisms contributing to the etiology of AD and provide key insights into potential pathways that can ultimately be targeted for future therapeutic development. The objective of this study is to perform a retrospective analysis of genetic data collected from large-scale population-based and case-control cohorts including the UK Biobank, the Alzheimer’s Disease Sequencing Project (ADSP), and FinnGen and integrate them with publicly available multi-omics datasets including, but not limited to, Genotype-Tissue Expression (GTEx), Microglia Genomic Atlas (MiGA), and neuroimaging data to identify novel and existing evidence for genetic determinants of AD. No attempt will be made to try and identify subjects. Aim 1: Identify novel and replicate existing gene associations for AD. We will perform case-control and family-based genetic analyses with AD diagnosis as the outcome of interest. Covariates include age, sex, and principal components. ADSP, UKB, and FinnGen will be analyzed separately and combined with meta-analysis. Biobank cases will be defined using ICD-9/ICD-10 codes, and proxy cases and controls will be carefully defined using questionnaire data on parental history of AD. Both true and proxy cases will be considered to maximize the number of AD cases. Aim 2: Prioritize novel gene associations identified in Aim 1. We will perform genetic fine-mapping and leverage tissue and cell-type specific datasets (e.g. GTEx and MiGA) to prioritize targets for further functional and analytical interrogation. Statistical methods used for target prioritization include colocalization, statistical fine-mapping, and Mendelian randomization. 
Furthermore, multi-omics-based network approaches will be used to identify disease-related molecular modules and tissue-specific regulatory circuits.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) is a common, progressive, neurodegenerative disorder with a strong genetic component with heritability estimates ranging from 58–79% for late-onset AD and over 90% for early onset AD. To date, there is only one treatment option intended to mediate the disease progression of AD, while all others treat symptoms associated with AD. Genetic association studies are important to highlight key biological mechanisms contributing to the etiology of AD and provide key insights into potential pathways that can ultimately be targeted for future therapeutic development. The objective of this study is to perform a retrospective analysis of genetic data collected from large-scale population-based and case-control cohorts including the UK Biobank, the Alzheimer’s Disease Sequencing Project (ADSP), and FinnGen and integrate them with publicly available multi-omics datasets including, but not limited to, Genotype-Tissue Expression (GTEx), Microglia Genomic Atlas (MiGA), and neuroimaging data to identify novel and existing evidence for genetic determinants of AD.
Investigator:
Chen, Jingchun
Institution:
University of Nevada, Las Vegas
Project Title:
Classification of Alzheimer’s disease with Genetic Data and Artificial Intelligence
Date of Approval:
November 14, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Alzheimer's disease (AD) is the most common cause of dementia, accounting for 60% to 80% of cases that affect over six million people in the United States. The disease gradually progresses from mild cognitive impairment (MCI) to dementia, which takes more than a decade. Identifying individuals who have a high risk of AD earlier is essential for AD prevention and intervention. As the heritability of AD is high (up to 79%), genetic data should be powerful to identify individuals at high risk. Indeed, polygenic risk score (PRS), designed to estimate individual genetic liability by integrating large GWAS summary statistics and individual genotype data, has been shown to be promising for AD risk prediction (AUCs up to 84%). However, the prediction accuracy using a single PRS is still not sufficient for MCI and AD classification in clinical practice. We hypothesize that convolutional neural network (CNN) models can improve the classification of AD and MCI by integrating multiple PRSs from multiple traits, multi-omics data (genotyping data, scRNA-seq), clinical data, and imaging data. The objective is to develop advanced AI algorithms and build data-driven models for disease risk assessment, earlier identifying individuals with high risk for MCI and AD. Our long-term goal is to develop and validate a prediction model that can be translated into clinical practice. Our CNN model has recently shown an improved performance for AD with PRSs from multiple traits (AUC 92.4%). We want to extend our approach to predicting AD and MCI in different ethnic groups and validate the results with independent datasets. To this end, we would like to apply for multi-omics data in NG00067.v9 from https://dss.niagads.org/datasets/ng00067/. With extensive experience in genetic studies on complex disorders and disease modeling, we are confident that we will achieve the specified goals and promote the integration of genetic data with AI algorithms, facilitating data-driven, personalized care of AD.
We expect to finish this study within 2 years with publication and grant application. We have IRB approval and will follow the rules for data sharing and acknowledgment.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD), the most common form of dementia, usually develops from mild cognitive impairment to dementia. There is currently no treatment to slow the progression of this disorder, but earlier identification of individuals at higher risk may be critical to preventing the disease. We propose a new approach to create models for classification of AD and MCI with artificial intelligence and genetic data. This study will have significant value in personalized medicine for AD risk assessment, classification, and earlier intervention. We don’t have the planned collaboration with researchers outside Cleveland Clinic in the current analytic plans.
Investigator:
Cheng, Feixiong
Institution:
Cleveland Clinic
Project Title:
A Multimodal Infrastructure for Alzheimer’s MultiOme Data Repurposing: Artificial Intelligence, Network Medicine, and Therapeutics Discovery
Date of Approval:
September 4, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We propose to develop capable and intelligent computer-based toolboxes that enable searching, sharing, visualizing, querying, and analyzing genetics, genomics, multi-omics, and clinical data for AD. The central unifying hypothesis of this project (1U01AG073323-01 [pending for Council meeting in May 2021]) is that a genome-wide, multimodal artificial intelligence (AI) framework to identify novel risk genes and networks from human WGS/WES and multi-omics findings will offer drug targets for targeted therapeutic development in AD. Aim 1 will identify rare coding variant-based risk genes using a sequence and structure-based deep learning model. Aim 2 will identify rare non-coding variant-based risk genes using a multiple kernel learning approach. Aim 3 will test whether GWAS common variants linked to AD pathobiology and endophenotypes are enriched in gene regulatory networks in a cell-type specific manner using a Bayesian framework. These analyses will leverage variants from ethnically diverse WGS/WES and clinical data (i.e., imaging, biomarkers, and cognitive measures) from Alzheimer's Disease Sequencing Project (ADSP), and publicly available chromatin interactomic data from NIH RoadMap, FANTOM5, and NIH 4D Nucleome. We will validate our findings using WGS/WES data and protein expression data from our existing cohorts: The Cleveland Clinic Lou Ruvo Center for Brain Health Aging and Neurodegenerative Disease Biobank (CBH-Biobank) and the Cleveland Alzheimer's Disease Research Center (CADRC). We will compile information for clinical data harmonization, including functional imaging, AD biomarkers, and cognitive measures for all integrative analyses. No PHI will be collected or used in the data analysis. We have no planned collaborations with researchers outside Cleveland Clinic in the current analytic plans.
Non-Technical Research Use Statement:
It is estimated that more than 16 million people with AD will live in the United States by 2050, and the predisposition to AD involves a complex, polygenic, and pleiotropic genetic architecture. This project will develop intelligent computer-based network medicine and systems biology tools, capable of identifying and validating human genome sequencing findings for novel risk gene discoveries and targeted therapeutic development in AD. The innovative network-based, artificial intelligence toolboxes and novel risk genes and biologically relevant targeted therapeutic approaches developed in this proposal will prove to be novel and effective ways to improve outcomes in long-term brain care for the rapidly growing AD population, an essential goal of AD precision medicine.
Investigator:
Cruchaga, Carlos
Institution:
Washington University School of Medicine
Project Title:
The Familial Alzheimer Sequencing (FASe) Project
Date of Approval:
January 21, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The goal of this study is to identify new genes and mutations that cause or increase risk for Alzheimer disease (AD), as well as protective factors. Individuals and families were selected from the Knight-ADRC (Washington University) and the NIA-LOAD study. Only families with at least three first-degree affected individuals were included. Families with pathogenic variants in the known AD or FTD genes, or in which APOE4 segregated with disease were excluded. At least two cases and one control were selected per family. Cases had an age at onset (AAO) after 65 yo and controls had a larger age at last assessment than the latest AAO within the family. Whole exome (WES) and whole genome sequencing (WGS) was generated for 1,235 individuals (285 families) that together with data from our collaborators and the ADSP family-based cohort (3,449 individuals and 757 families) will provide enough statistical power to identify new genes for AD. Dr. Tanzi (Harvard Medical School) will provide WGS from 400 families from the NIMH Alzheimer disease genetics initiative study. We will perform single variant and gene-based analyses to identify genes and variants that increase risk for disease in AD families. Single variant analysis will consist of a combination of association and segregation analyses. We will run family-based gene-based methods to identify genes that show an overall enrichment of variants in AD cases. We will also look for protective and modifier variants. To do this we will identify families loaded with AD cases, that also include individuals with a high burden of known risk variants but that do not develop the disease (escapees). We will use the sequence data and the family structure to identify variants that segregate with the escapee phenotype. The most promising variants and genes will be replicated in independent datasets (ADSP case-control, ADNI, Knight-ADRC, NIA-LOAD).
We will perform single variant and gene-based analyses to replicate the initial findings, and survival analysis to replicate the protective variants. We will select the most promising variants/genes for functional studies.
Non-Technical Research Use Statement:
Family-based approaches led to the identification of disease-causing Alzheimer’s Disease (AD) variants in the genes encoding APP, PSEN1 and PSEN2. The identification of these genes led to the Aβ-cascade hypothesis and to the development of drugs that target this pathway. Recently, we have identified rare coding variants in TREM2, ABCA7, PLD3 and SORL1 with large effect sizes for risk for AD, confirming that rare coding variants play a role in the etiology of AD. In this proposal, we will identify rare risk and protective alleles using sequence data from families densely affected by AD. We hypothesize that these families are enriched for genetic risk factors. We already have sequence data from 695 families (2,462 individuals), that combined with the ADSP and the NIMH dataset will lead to a dataset of more than 1,042 families (4,684 individuals). Our preliminary results support the flexibility of this approach and strongly suggest that protective and risk variants with large effect size will be found, which will lead to a better understanding of the biology of the disease.
Investigator:
Ebbert, Mark
Institution:
University of Kentucky
Project Title:
Resolving mutations in challenging genomic regions to test association with Alzheimer's disease phenotypes
Date of Approval:
January 21, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
A majority of the human genome has been well characterized through the initial Human Genome Project and numerous large-scale sequencing studies such as the 1000 Genomes Project, Alzheimer's Disease Neuroimaging Initiative (ADNI), Alzheimer’s Disease Sequencing Project, and others. There are, however, many genome regions that are challenging to characterize using standard approaches that are important to human health and disease. We intend to (1) develop and test new methods to characterize mutations in these regions, and (2) test associations between these mutations and disease phenotypes. Data from the ADSP may be combined with other datasets, such as the Alzheimer's Disease Neuroimaging Initiative. All appropriate precautions will be taken to verify proper population stratification and eliminate any sample redundancy. Combining these data will not increase risk to participants, as all individual-level data will remain confidential. We may also use portions of the ADSP data as controls for other diseases such as amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), though only in situations that do not violate genetic or data-use principles. Specifically, data for which participants consented to use only within Alzheimer's disease studies will not be used for any purpose outside Alzheimer's disease research.
Non-Technical Research Use Statement:
Many regions of the human genome present challenges that prohibit scientists from discovering potential disease-causing mutations. We are developing methods to characterize mutations in these regions to identify new genes involved in disease.
Investigator:
Frost, Bess
Institution:
UT Health San Antonio Barshop Institute
Project Title:
Investigating retrotransposon activation and retrotransposon-associated genetic variants associated with human tauopathy
Date of Approval:
October 25, 2022
Request status:
Closed
Research use statements:
Technical Research Use Statement:
Objective: To gain insights into retrotransposon activation in specific cell types, our first objective is to analyze differential transposable element expression in bulk sequenced microglia from Alzheimer’s disease patient brain tissue versus controls (NG00105). Study design: Reads will be aligned to the GRCh38 human reference genome with STAR using parameters optimized for aligning transposon-derived multi-mapping reads. Read counts for transposon and gene loci will be obtained using TEtranscripts. Differential expression of genes and transposons will then be calculated using DESeq2. Analysis plan: Unsupervised machine learning techniques will be applied to cluster transcription counts by variance to make associations between specific retrotransposons and microglial/immune response associated genes. Objective: We have identified multiple candidate non-reference mobile element insertion variants using nanopore long read sequencing of DNA extracted from frontal cortex of patients at Braak 0, III, and V/VI. Our second objective is to utilize the ADSP umbrella whole genome sequencing dataset (NG00067) to determine if our findings are conserved in a larger cohort of patients with Alzheimer’s disease. Study design: CRAM alignment files aligned to the GRCh38 reference genome from the ADSP discovery (snd10000) and PSP-UCLA (snd10017) WGS data sets will be analyzed with xTea (Chu et al. 2021) to identify the presence of mobile element insertions previously identified via nanopore. Only genomic regions containing insertions of interest will be analyzed. Analysis plan: Non-reference mobile insertions identified via nanopore will be compared in control, Alzheimer’s disease, and PSP NIAGADS datasets. Insertions meeting the designated criteria will be considered for a replication analysis using cohorts from the ADSP umbrella dataset.
We will determine whether these variants can predict the longitudinal clinical rate of disease progression and correlate with other features such as tau PET positivity, CSF tau, and cognitive testing. We will also consider sex, age, and high-risk genotypes.
Non-Technical Research Use Statement:
Objective 1: Almost half of the human genome is composed of transposable elements, or “jumping genes.” Retrotransposons are activated in human Alzheimer’s disease and related “tauopathies,” as well as in Drosophila and mouse models of tauopathy. In the current study, we will analyze retrotransposon activation specifically in microglia, the immune cells of the brain, in the context of tauopathy. In addition, we will determine if retrotransposon activation correlates with expression of neighboring immune response genes. Objective 2: We have previously identified tau-induced retrotransposon activation as a driver of neurodegeneration. In a preliminary analysis of Alzheimer’s disease patient samples and controls, we have used long-read whole genome DNA sequencing technology to discover non-reference retrotransposon insertions that are unique to Alzheimer’s disease patients. In the current study, we expand these analyses to determine if our findings are conserved in a larger patient cohort, and how these novel insertions relate to disease progression.
Investigator:
Goate, Alison
Institution:
Icahn School of Medicine at Mount Sinai
Project Title:
Study of Alzheimer's disease and other dementias (e.g. frontotemporal dementia) and related phenotypes
Date of Approval:
March 4, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Alzheimer's disease (AD) is the most common form of dementia but has no effective prevention or treatment. Developing a comprehensive picture of the genetic architecture of AD, including a network-level functional assessment of risk/resilience genes, is essential to develop novel therapeutic targets. The overarching goals of this study are to use genetic and genomic approaches to: 1) identify genes and variants that are involved in the development of AD and related disorders; 2) identify functional networks enriched for AD or related disorder risk and protective loci; 3) determine how cellular function and physiology are impacted by these genetic factors in disease-relevant cell types and animal models. This study will use publicly available whole genome/exome sequence data generated by the Alzheimer’s Disease Sequencing Project (ADSP) and genome-wide association study (GWAS) data from the International Genomics of Alzheimer’s Project (IGAP) and others. We will apply a suite of case-control and family approaches to investigate genetic association with dichotomous and continuous disease traits. This study will not only further our understanding of the genetic architecture of AD but also provide key information regarding the molecular mechanisms, setting the stage for novel therapeutic development.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) is the only disease among the top ten killers in the U.S. without a disease modifying therapy. Genetic studies provide a powerful means to identify genes and pathways that are causally linked to disease etiology. We propose to use genomic and functional approaches to identify genes that alter the risk of AD and investigate how these genes disrupt cellular pathways leading to disease.
Investigator:
Greicius, Michael
Institution:
Stanford University School of Medicine
Project Title:
Examining Genetic Associations in Neurodegenerative Diseases
Date of Approval:
December 19, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
We are studying the effects of rare (minor allele frequency < 5%) genetic variants on the risk of developing late-onset Alzheimer’s Disease (AD). We are interested in variants that have a protective effect in subjects who are at an increased genetic risk, or variants that lead to multiple dementias. Our aim is to identify any genetic variants that are present in the “case” group but not the “AD control” groups for both types of variants. The raw data we receive will be annotated to identify SNP locations and frequencies using existing databases such as 1000 Genomes. We will filter the data based on genetic models such as compound heterozygosity, recessive and dominant models to identify different types of variants.
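The filtering step described in this statement can be sketched in a few lines. This is only an illustration under stated assumptions, not the investigators' pipeline: the record layout, the field names (`maf`, `case_carriers`, `control_carriers`) and the example values are hypothetical, and only the 5% minor allele frequency threshold comes from the statement.

```python
from dataclasses import dataclass

MAF_THRESHOLD = 0.05  # "rare" as defined in the statement: MAF < 5%


@dataclass
class Variant:
    variant_id: str        # hypothetical identifier
    maf: float             # minor allele frequency from a reference database
    case_carriers: int     # carriers observed in the "case" group
    control_carriers: int  # carriers observed in the "AD control" group


def rare_case_only(variants):
    """Keep rare variants seen in at least one case and in no control."""
    return [
        v for v in variants
        if v.maf < MAF_THRESHOLD
        and v.case_carriers > 0
        and v.control_carriers == 0
    ]


# Made-up example records, for illustration only.
example = [
    Variant("var1", 0.01, 3, 0),  # rare and case-only: kept
    Variant("var2", 0.20, 5, 0),  # common: dropped
    Variant("var3", 0.02, 2, 1),  # also seen in a control: dropped
]
kept = [v.variant_id for v in rare_case_only(example)]
print(kept)
```

A real analysis would read these fields from annotated variant calls and would add the inheritance-model filters (compound heterozygous, recessive, dominant) that the statement mentions.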
Non-Technical Research Use Statement:
Current genetic understanding of Alzheimer’s Disease (AD) does not fully explain its heritability. The APOE4 allele is a well-established risk factor for the development of Alzheimer’s Disease (AD). However, some individuals who carry APOE4 remain cognitively healthy until advanced ages. Additionally, the cause of mixed dementia pathology development in individuals remains largely unexplained. We aim to identify genetic factors associated with these “protected” and mixed pathology phenotypes.
Investigator:
Hohman, Timothy
Institution:
Vanderbilt University Medical Center
Project Title:
Genetic Drivers of Resilience to Alzheimer's Disease
Date of Approval:
January 16, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
“Asymptomatic” Alzheimer’s disease (AD) is a phenomenon in which 30% of individuals over age 65 meet criteria for autopsy-confirmed pathological AD (beta-amyloid plaques and tau aggregation) but do not clinically manifest cognitive impairment. The resilience that underlies asymptomatic AD is marked by both protection from neurodegeneration (brain resilience) and preserved cognition (cognitive resilience). Our central hypothesis is that genetic effects allow a subset of individuals to endure extensive AD neuropathology without marked brain atrophy or cognitive impairment. We are uniquely positioned to identify resilience genes by leveraging the Resilience from Alzheimer’s Disease (RAD) database, a local resource in which we have harmonized a validated quantitative phenotype of resilience across 8 large AD cohort studies. Our strong interdisciplinary team represents international leaders in genetics, neuroscience, neuropsychology, neuropathology, and psychometrics who will leverage the infrastructure and rich resources of the AD Genetics Consortium, IGAP, ADSP, and our recently established and harmonized continuous metric of resilience to fulfill the following aims: Aim 1. Identify and replicate common genetic variants that predict cognitive resilience (preserved cognition) and brain resilience (protection from brain atrophy) in the presence of AD pathology. We hypothesize that common genetic variation will explain variance in resilience above and beyond known predictors like education. Replication analyses will leverage age of onset data from IGAP to demonstrate that resilience loci predict a later age of AD onset. Aim 2. Identify and replicate rare and low-frequency genetic variants that predict cognitive and brain resilience.
Rare and low-frequency variants with large effects have been identified in AD case/control studies, providing new insight into the genetic architecture of AD. Aim 3. Identify sex-specific genetic drivers of cognitive and brain resilience to AD pathology. Our preliminary results highlight sex differences in the downstream consequences of AD neuropathology, including sex-specific genetic markers of resilience.
Non-Technical Research Use Statement:
As the population ages, late-onset Alzheimer’s disease (AD) is becoming an increasingly important public health issue. Clinical trials targeted at reducing AD progression have demonstrated that patients continue to decline despite therapeutic intervention. Thus, there is a pressing need for new treatments aimed at novel therapeutic targets. A shift in focus from risk to resilience has tremendous potential to have a major public health impact by highlighting mechanisms that naturally counteract the damaging effects of AD neuropathology. The goal of the present project is to characterize genetic factors that protect the brain from the downstream consequences of AD neuropathology. We will identify both rare and common genetic variants using a robust metric of resilience developed and validated by our research team. The identification of such genetic effects will provide novel targets for therapeutic intervention in AD.
Investigator:
Jaffe, Andrew
Institution:
Neumora Therapeutics
Project Title:
Comparisons of pre- and post-mortem microglial populations
Date of Approval:
July 21, 2022
Request status:
Closed
Research use statements:
Technical Research Use Statement:
In this study, we propose to directly compare and analyze pre-mortem microglial cells obtained during surgical resection from Young et al [PMID: 34083789] with post-mortem microglia from Lopes et al [PMID: 34992268, Dataset NG00105] to better define the transcriptional landscape of human microglia and the effects of tissue processing. We have previously re-processed and re-analyzed bulk and single cell data from Young et al. to identify expression quantitative trait loci (eQTLs) and develop RNA deconvolution models to partition bulk microglia profiles (like those measured by Dataset NG00105) into cell fractions of 7 important microglial subpopulations/cell states including “homeostatic”, “stress”, and “chemokine/cytokine” using the single cell RNA-seq (scRNA-seq) data from Young et al. We propose to perform this RNA deconvolution in Lopes et al, and test whether any of these cell populations, particularly those related to neuroinflammation, are more prevalent in neurodegenerative disorders like Alzheimer’s disease (AD) or Parkinson’s disease (PD). We will also test whether these cell subtype fractions identified in pre-mortem tissue are consistent in postmortem tissue. As validation, we will perform supervised clustering of the NG00108 snRNA-seq data (in mouse) and test whether any AD-associated microglial cell subtypes were enriched in the 5xFAD genotype. Lastly, we propose to combine genotype and RNA data from Lopes et al (NG00105) and Young et al and perform eQTL mega-analysis to double the discovery sample size of microglial eQTLs. We hypothesize that this mega-analysis will produce a much larger number of significant eQTLs, as the GTEx project [PMID: 32913098] found approximately 3000 eGenes in 100-subject discovery datasets (the approximate sample size of both Young et al and Lopes et al) and approximately 7000 eGenes in 200 subjects (the combined sample size in this proposal).
We will also assess clinical relevance by performing colocalization analysis of this larger eQTL map with genome-wide association studies (GWAS) of neurodegenerative disorders. Overall, this proposal will compare and contrast two recent large-scale genomic efforts profiling human microglia.
Non-Technical Research Use Statement:
This proposal will compare and contrast two recent large-scale genomic efforts profiling human microglia, one from premortem human brain tissue (Young et al, PMID: 34083789) and one from postmortem brain tissue (Lopes et al, PMID: 34992268, Dataset: NG00105). We will specifically assess the distribution of various microglial cell states, derived from single cell RNA-seq data, and determine if all of these states are represented in microglia from postmortem tissue. We will perform validation analyses of these cellular states in a mouse model of AD (Dataset: NG00108). Assuming the pre- and post-mortem datasets are comparable, we will combine these datasets and perform joint analysis of genotype and phenotype to better understand variation in microglia gene expression.
Investigator:
Kamboh, M. Ilyas
Institution:
University of Pittsburgh
Project Title:
Genetics of Alzheimer's Disease and Endophenotypes
Date of Approval:
January 7, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Objectives: We are requesting access to the NIAGADS datasets to augment our ongoing studies on the genetics of Alzheimer’s disease (AD) and AD-related endophenotypes being carried out by Kamboh and his group since 1995. We are doing GWAS using array genotypes, whole-exome sequencing and whole-genome sequencing on datasets derived from University of Pittsburgh ADRC and ancillary population-based longitudinal studies on dementia and biomarkers. Available phenotypes include AD and non-AD dementia, age-at-onset, disease progression and survival, neuroimaging, cognitive decline, and plasma biomarkers for the core ATN and non-ATN pathologies. We also plan to expand on gene-gene interaction and sex-stratified analyses, which require the actual genotype data. The NIAGADS datasets will be used for replication and meta-analysis, and for gene-gene interaction and sex-stratified analyses. Study Design: A case-control design will incorporate a diverse cohort of individuals with AD and age-matched controls. For quantitative traits (neuroimaging and plasma biomarkers, cognitive performance measures, indicators of disease progression), linear regression analyses will be performed to identify genetic loci. To ensure the findings are robust and inclusive, participants from diverse demographic backgrounds will be included, enabling the exploration of potential genetic variations across populations. Analysis Plan: We will conduct GWAS and targeted analyses on candidate genes on different AD and AD-related phenotypes. Primary phenotypic variables include AD disease status, age-at-onset, last age for controls, APOE genotype, cognitive decline trajectories, sex, and race. Analyses will evaluate the influence of specific genetic variants on disease risk, cognitive performance, and biomarker levels, considering both individual and interactive effects of the APOE genotype. Results will be adjusted for potential confounders, such as demographic factors, to ensure valid associations.
Detailed analytical methods are described in our published papers for case-control (PMID: 32651314; 35694926), quantitative traits (PMID: 30361487; 37666928), and cognitive decline (PMID: 37089073; 30954325).
Non-Technical Research Use Statement:
Our research group at the University of Pittsburgh (Pitt) has been working on the genetics of Alzheimer’s disease (AD) and AD-related endophenotypes for almost three decades, on data derived largely from the University of Pittsburgh Alzheimer’s Disease Research Center and ancillary dementia studies. We are requesting access to the NIAGADS genotype and phenotype datasets to augment our sample size and increase power to detect novel genetic associations with AD and related endophenotypes.
Investigator:
Konermann, Silvana
Institution:
Arc Institute
Project Title:
Modeling Alzheimer’s disease risk and associated molecular phenotypes
Date of Approval:
August 8, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The objective of the proposed research is to determine the relationship between Alzheimer’s disease (AD) genetic risk and associated molecular phenotypes. Genotype data will be used to compute a polygenic risk score (PRS) for disease-affected and control (non-disease-affected) participants. Statistical regression and mediation analyses will be used to model variation of molecular phenotypes with respect to PRS and, where available, pathology stage or cognitive impairment. Molecular phenotypes to be analyzed include bulk/single-cell/single-nucleus transcriptome, epigenome, proteome, metabolome, lipidome, amyloid, and tau. Molecular phenotypes of participants, including controls, will be matched with molecular phenotypes of in vitro cellular models, informing the design of in vitro perturbation experiments that recapitulate the genetic drivers of AD risk.
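As a rough illustration of the polygenic risk score (PRS) step named above, the sketch below uses the simplest additive form: the sum of effect-allele dosages multiplied by per-variant weights. The statement does not say which PRS method will be used, so this form is an assumption, and the variant identifiers, weights and dosages are invented for the example.

```python
def polygenic_risk_score(dosages, weights):
    """Additive PRS: sum over shared variants of dosage (0-2) times weight.

    dosages: {variant_id: effect-allele dosage} for one participant
    weights: {variant_id: effect size, e.g. a GWAS log odds ratio}
    Variants missing from either mapping are skipped.
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[v] * weights[v] for v in shared)


# Hypothetical weights and one hypothetical participant.
weights = {"varA": 0.30, "varB": -0.10, "varC": 0.05}
participant = {"varA": 2, "varB": 1, "varC": 0}

score = polygenic_risk_score(participant, weights)
print(round(score, 2))
```

Scores computed this way for cases and controls could then serve as the predictor in the regression and mediation models the statement describes.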
Non-Technical Research Use Statement:
Our goal is to determine the relationship between human genetic profiles associated with Alzheimer’s disease (AD) risk and specific measurable characteristics of human cells. Using multiple statistical analysis methods, we will build quantitative models that describe how those characteristics vary as a function of AD genetic risk. The models we build will help us design in vitro cellular systems that reflect different levels of AD risk, enabling experiments that inform new strategies for treating or preventing AD.
Investigator:
Lee, Jonghun
Institution:
TAKEDA PHARMACEUTICAL COMPANY LTD
Project Title:
Identification of genetic risks and potential target for stratified Alzheimer's disease patient groups
Date of Approval:
October 24, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The goal of analyzing ADSP umbrella cohort data is to identify variants, genes and pathways associated with Alzheimer’s disease (AD), and to stratify patients by genetic risk. The procedure is as follows. 1) Identification and validation of genetic risks: The whole genome and whole exome sequencing data will be analyzed to identify genetic variants or genes associated with phenotypes in the case-control cohort, such as AD status and Braak stages. Several methods will be applied, such as VEP [William McLaren et al, 2016], LOFTEE [Karczewski, 2015] and PEXT scoring [Beryl B.C et al., 2020] for variant annotation, and SAIGE-GENE [Wei Zhou et al., 2020] and KGWAS [Kexin Huang 2024] for the association test. The association will also be tested for other endophenotypes, such as cognitive scores and brain volumes, that are available in a subset of the cohort. Replication and meta-analysis will be conducted on UK Biobank and Tohoku Medical Megabank Organization (ToMMo) cohort data. Because the ToMMo data come from a Japanese cohort, we can analyze the effect of the variants across multiple ethnic groups. 2) Patient stratification in ADSP cohort: Leveraging the increased sample size, we will stratify the cohort by genetic risks such as ApoE types, or phenotypes such as Braak stages, and compare the effect size of variants or genes among the patient groups. In addition, the genetic risk score (GRS) will be calculated using LDpred2 [Florian Prive, 2020], RapidoPGS [Guillermo Reales, 2020], and PRSice2 [Choi, S.W., 2020], validated in independent cohorts, and compared to available clinical endophenotypes. Then we will examine the effect of the GRS on extensive phenotypes in UK Biobank and ToMMo. Last, the NG00130 proteome will be used for unsupervised classification of patients. By overlapping the protein signature-based groups with genetic signals, we will look for causal pathologic pathways and targets for each subgroup.
Non-Technical Research Use Statement:
The aim of our study is to identify variants or genes that are potentially causal for Alzheimer’s disease in all patients or in a subset of them. Specifically, WES and WGS data will be analyzed to investigate common and rare variants associated with disease status and intermediate phenotypes. In addition, the patients will be stratified and sub-grouped by their genetic and proteomic signatures. Last, we will incorporate other large biobanks such as UK Biobank or ToMMo to investigate genetic effects on extensive phenotypes potentially linked to symptoms appearing in patient subgroups.
Investigator:
Li, Qingqin
Institution:
Janssen Research & Development, LLC
Project Title:
Target identification and validation in Alzheimer’s Disease with Whole-Genome and Whole-Exome Sequence Data
Date of Approval:
March 31, 2023
Request status:
Closed
Research use statements:
Technical Research Use Statement:
Aim 1: Identify novel genes and replicate existing gene associations for Alzheimer’s disease (AD). Aim 1a: Common variant genome-wide association analysis. With this approach, we will leverage existing consortium GWAS summary statistics where it makes sense (or request leave-one/N-out summary association statistics if we see a need to use a different version of phenotype definition from the same cohort) and augment them with additional datasets available internally. Aim 1b: Rare variant gene-level genetic burden analysis. We will aim to use the ADSP analysis pipeline (but reserve the option to use an alternative pipeline) to contribute the whole genome sequencing (WGS) data generated from the internal galantamine samples to ADSP-led consortium analysis. We will perform case-control and/or family-based genetic analyses and/or quantitative trait genetic analyses using AD traits such as diagnosis, age of onset, amyloid positivity, tau positivity, CSF biomarker endophenotypes, disease progression, etc. (where the phenotype is available) as the outcome of interest. Covariates include age, sex, and principal components. ADSP, UKB, and FinnGen will be analyzed separately and combined with a meta-analysis. Biobank cases will be defined using ICD-9/ICD-10 codes, and proxy cases and controls will be carefully defined using questionnaire data on the parental history of AD. Both true and proxy cases will be considered to maximize the number of AD cases. Aim 2: Prioritize novel gene associations identified in Aim 1. We will perform genetic fine-mapping and leverage tissue and cell-type specific datasets (e.g. GTEx, AD Knowledge Portal including AMP-AD, internal datasets, MiGA, Harari et al snRNA-Seq) to prioritize targets for further functional and analytical interrogation. Furthermore, multi-omics-based network approaches will be used to identify disease-related molecular modules and tissue-specific regulatory circuits.
Aim 3: Utilize single-nuclei sequencing data to more fully catalog cell type heterogeneity in the brains of individuals with AD and how this differs from the brains of uninjured, cognitively unimpaired individuals.
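The statement says ADSP, UKB and FinnGen will be analyzed separately and then combined with a meta-analysis, without naming the method. One common choice, assumed here purely for illustration, is a fixed-effect inverse-variance weighted meta-analysis. The effect sizes and standard errors below are invented and do not come from any of these cohorts.

```python
import math


def inverse_variance_meta(estimates):
    """Fixed-effect inverse-variance weighted meta-analysis.

    estimates: list of (beta, standard_error) pairs, one per cohort.
    Returns (pooled_beta, pooled_standard_error).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


# Hypothetical per-cohort results for a single variant.
cohorts = [(0.20, 0.10), (0.10, 0.10), (0.15, 0.05)]
beta, se = inverse_variance_meta(cohorts)
print(f"pooled beta = {beta:.3f}, pooled SE = {se:.4f}")
```

Cohorts with smaller standard errors receive larger weights, so the third cohort in this example contributes four times as much as either of the others.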
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) is a common, progressive, neurodegenerative disorder with a strong genetic component with heritability estimates ranging from 58–79% for late-onset AD and over 90% for early-onset AD. To date, there is only one approved treatment option intended to mediate the disease progression of AD, while all others treat symptoms associated with AD. Genetic association studies are important to highlight key biological mechanisms contributing to the etiology of AD and provide insights into potential pathways that can ultimately be targeted for future therapeutic development. The aim of this study is to perform a retrospective analysis of genetic data collected from large-scale population-based and case-control cohorts including the UK Biobank, the Alzheimer’s Disease Sequencing Project (ADSP), FinnGen, and Janssen internal cohorts. We will also integrate them with available multi-modal datasets including but not limited to, Microglia Genomic Atlas, Harari et al snRNA-Seq, and neuroimaging data to identify novel and existing evidence for genetic determinants of AD.
Investigator:
Malkova, Anna
Institution:
University of Iowa
Project Title:
Micro-homology Templated Insertions in Alzheimer's Disease
Date of Approval:
May 8, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
The objective of our research is to characterize genomic rearrangements associated with various human diseases, including Alzheimer’s disease. The overarching hypothesis guiding our research is that repair of DNA double-strand breaks (DSBs) by using ‘risky’ inaccurate pathways can lead to genomic destabilization. Our focus is on two DSB repair pathways: break-induced replication (BIR) and microhomology-mediated BIR (MMBIR). BIR is initiated by a broken DNA end invading a homologous template, followed by extensive DNA synthesis that is highly mutagenic. Interruption of BIR leads to initiation of MMBIR, a template-switching event that often leads to complex genomic rearrangements and has been linked to neurological conditions and to cancer. The overall goal of our proposed research is to define the molecular mechanisms of MMBIR, and to identify factors that inhibit or promote cells entering into MMBIR. We aim to achieve this using our MMBSearch tool to detect MMBIR events that are often missed by other methods in human WGS analyses. Using MMBSearch we will analyze data from NIAGADS, specifically neurological disease-associated whole genome sequencing (WGS) and whole exome sequencing (WES) data, to detect MMBIR events associated with neurodegenerative disorders. The results of these analyses will be used to determine the frequency of MMBIR in various types of human cells and their association with neurodegenerative disorders. In addition, we will identify chromosomal locations where MMBIR events are especially abundant and specific features in humans that predispose them to MMBIR. We will identify genetic variations predisposing cells to MMBIR, which may reveal that specific SNPs, structural variations, certain gene mutations, etc. are associated with MMBIR events. We specifically hypothesize that mutations in DNA repair, DNA replication, chromatin maintenance, and DNA damage checkpoint genes could promote MMBIR.
These studies will shed light on the etiology and mechanism of MMBIR to potentially develop biomarkers for early detection and design targeted therapies to treat human disorders.
Non-Technical Research Use Statement:
The goal of our research is to understand the underlying mechanisms of genomic instability that lead to human disease. In particular, we are interested in investigating the molecular mechanism of an essentially uncharacterized DNA repair pathway, microhomology-mediated break-induced replication (MMBIR), that has been implicated in DNA mutations and found in a variety of human cancers and in association with neurological diseases. We have recently described a diagnostic pattern of mutations associated with MMBIR using a yeast model, which has allowed us to develop a novel algorithm to search for MMBIR events in sequenced human genomes. We are planning to apply this new algorithm to identify MMBIR events in human genome databases. The proposed research will allow us to further understand mechanisms leading to various human diseases, including cancer and neurological diseases, and to refine our software for detecting MMBIR in human genomes. The proposed research will focus on analyzing data from the NIAGADS database.
Investigator:
Masters, Colin
Institution:
The Florey Institute, The University of Melbourne
Project Title:
The Australian Imaging Biomarkers and Lifestyle (AIBL) Flagship Study of Ageing: Detecting and Preventing Alzheimer’s disease: Towards Lifestyle Interventions-Somatic mutation in Alzheimer's Disease
Date of Approval:
May 15, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Project Title: The Australian Imaging Biomarkers and Lifestyle (AIBL) Flagship Study of Ageing: Detecting and Preventing Alzheimer’s disease: Towards Lifestyle Interventions - Somatic mutation in Alzheimer's Disease (sub-project). Objectives -- Somatic Mutation in AD is a project to identify non-congenitally acquired genetic risks associated with disease onset of sporadic Alzheimer’s disease (AD). A somatic mutation can be any form of alteration in DNA that occurs after conception. As opposed to a congenital mutation, it is generally not hereditary unless the germ cells are involved. These alterations can (but do not always) cause disease. We aim to identify somatic variants that contribute to sporadic AD. We believe that the detection of somatic mutations can overcome the flaws of large genome-wide multiple testing and increase the signal-to-noise ratio to pinpoint the rare genetic determinants that were largely neglected by current genetic association studies. Study design -- We have collected 20 paired human brain microglial DNAs (treated as “tumour”) and whole blood DNAs (treated as “normal”) to call somatic mutations in tumour-normal mode using the software MuTect2 (Broad Institute). The sequence has been obtained from the whole genome. Hundreds of rare genetic variants connected with AD have been identified. Analysis plan -- We’d like to validate our results using datasets like NG00067, NG00105 and NG00106. However, it would be ideal if we could access the alignment data (i.e., BAM files) as well. This is because, technically, somatic calling is not simply a difference between normal (germline) and reference; it also calls tumour against normal (germline) alongside alignment. MuTect2 is developed to identify somatic mutations. It works with or without a matching normal. Once we get access to the alignment data, we will reprocess all samples using the MuTect2 pipeline without a matched normal.
We'll call somatic mutations using those datasets and validate the rare genetic determinants that contribute to sporadic AD.
Non-Technical Research Use Statement:
Somatic Mutation in Alzheimer's disease is a project to identify non-congenitally acquired genetic risks associated with disease onset of sporadic Alzheimer’s disease (AD). We believe that detection of somatic mutations can pinpoint the rare genetic determinants that were largely neglected by current genetic association studies. In our pilot study, we have identified hundreds of rare genetic mutations that are strongly associated with AD. We'd like to validate our results using an independent cohort. We plan to reprocess NIH datasets using our own pipeline, for which we would need access to the raw data rather than the processed data. This research will greatly accelerate the research on the molecular genetics of AD.
Investigator:
Pendergrass, Rion
Institution:
Genentech
Project Title:
Genetic Analyses Using Data from MiGA and related studies
Date of Approval:
October 24, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
The purpose of our study is to identify novel genetic factors associated with age-related neurodegeneration. This includes identifying genetic factors associated with the risk of these conditions, as well as genetic risk factors associated with age-at-onset (AAO) for these conditions. The findings from our analyses have the potential to identify new therapeutic targets for Alzheimer's Disease and other age-related neurodegenerative diseases. They also have the potential to identify genetic and phenotypic biomarkers that will be beneficial for subsetting patients in new ways. Using the data we have requested, we will identify genes driving neurodegenerative diseases by identifying dysregulated genes in cases using total and allele-specific gene expression profiles. Genotypes and RNA-seq reads will be used to generate allele-specific expression (ASE). RNA-seq counts and ASE from controls will be used to model the variance of both total and ASE gene expression. Total gene expression vs ASE specifically from cases will be used to identify dysregulated genes in single individuals. These will then be compared to pathway and known disease-associated genes. Case/control status, genotype, and RNA-seq data will all be evaluated together through quantitative trait loci (QTL) analyses and additional statistical association analyses. All data will remain anonymized and securely stored, and only those listed on our application and their staff will have access to these data. We will not share any of the individual-level data outside of Genentech nor beyond the researchers on our application. We will adhere to all data use agreement stipulations through NIAGADS. We have a secure computational environment called Rosalind within Genentech where we will use these data. We have IT security staff that constantly monitor all our research computing, assuring the safety and privacy of all of our stored data. We will not collaborate with researchers at other institutions.
Non-Technical Research Use Statement:
Genetic variation and gene expression data allow us to understand more of the genetic contribution to risk of, and protection from, diseases such as Alzheimer’s and dementia. This information also allows us to identify important biological contributors to disease for developing effective treatment strategies, and to identify groups of individuals that would benefit most from new treatments. Our exploration of the relationship between genotype, disease traits, gene expression, and outcomes through these datasets will allow us to pursue important new findings for disease treatment.
Investigator:
Ratnapriya, Rinki
Institution:
Baylor College of Medicine
Project Title:
Microglia-associated expression quantitative trait loci (eQTLs) and causal variants relevant to Age-related Macular Degeneration (AMD)
Date of Approval:
July 15, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The objective of our genomics study is to analyze eQTLs from microglia (resident immune cells) and to identify causal variants in neurodegenerative disease. We will perform a robust eQTL analysis by integrating all available human microglia genotyping and gene expression datasets, including microglia genotype and gene expression datasets from donors with a normal phenotype. We will investigate the identified microglia eQTLs for association with neurodegenerative disease, specifically age-related macular degeneration (AMD). We will run the eQTL detection software on genotype and gene expression data from the normal, healthy phenotype. After selecting the conditionally significant eQTL associations, we will test for pleiotropic association between the expression level of a gene and a complex trait of interest using summary-level data from AMD GWAS and the analyzed eQTLs. We will also perform genetic colocalisation analysis of potentially related phenotypes to determine whether they share common causal variant(s) in a given region.
Non-Technical Research Use Statement:
Dysregulated immune function and neuroinflammation have become recognized as common underlying mechanisms in aging and various neurodegenerative diseases, including AMD. Microglia serve as the resident immune cells in both the brain and retina, akin to macrophages, and have recently been identified as significant contributors to AMD pathogenesis. Our objective is to access brain microglia data to investigate microglia-associated expression quantitative trait loci (eQTLs) and causal variants relevant to AMD. We plan to utilize existing genotype and gene expression data from microglia to perform eQTL analysis.
Investigator:
Roussos, Panagiotis
Institution:
Icahn School of Medicine at Mount Sinai
Project Title:
Higher Order Chromatin and Genetic Risk for Alzheimer's Disease
Date of Approval:
November 21, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Alzheimer's disease (AD) is the most common form of dementia and is characterized by cognitive impairment and progressive neurodegeneration. Genome-wide association studies of AD have identified more than 70 risk loci; however, a major challenge in the field is that the majority of these risk factors are harbored within non-coding regions where their impact on AD pathogenesis has been difficult to establish. Therefore, the molecular basis of AD development and progression remains elusive and, so far, reliable treatments have not been found. The overarching goal of this proposal is to examine and validate AD-related changes on chromatin accessibility and the 3D genome at the single cell level. Based on recent data from our group and others, we hypothesize that genotype-phenotype associations in AD are causally mediated by cell type-specific alterations in the regulatory mechanisms of gene expression. To test our hypothesis, we propose the following Specific Aims: (1) perform multimodal (i.e., within cell) profiling of the chromatin accessibility and transcriptome at the single cell level to identify cell type-specific AD-related changes on the 3D genome; (2) fine-map AD risk loci to identify causal variants, regulatory regions and genes; (3) functionally validate putative causal variants and regulatory sequences using novel approaches that combine massively parallel reporter assays, CRISPR and single cell assays in neurons and microglia derived from induced pluripotent stem cells; and (4) develop and maintain a community workspace that provides for the rapid dissemination and open evaluation of data, analyses, and outcomes. Overall, our multidisciplinary computational and experimental approach will provide a compendium of functionally and causally validated AD risk loci that has the potential to lead to new insights and avenues for therapeutic development.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) affects half the US population over the age of 85 and despite decades of research, reliable treatments for AD have not been found. The overarching goal of our proposal is to generate multiscale genomics (gene expression and epigenome regulation) data at the single cell level and perform fine mapping to detect and validate causal variants, transcripts and regulatory sequences in AD. The proposed work will bridge the gap in understanding the link among the effects of risk variants on enhancer activity and transcript expression, thus illuminating AD molecular mechanisms and providing new targets for future therapeutic development.
Investigator:
Rychkova, Anna
Institution:
Alector
Project Title:
Genetic analysis of Alzheimer’s disease risk factors
Date of Approval:
October 12, 2023
Request status:
Closed
Research use statements:
Technical Research Use Statement:
At Alector we are focused on developing antibody-based therapies for cancer and neurodegenerative disorders such as Alzheimer's disease. Our main therapeutic hypotheses are that the immune system plays a critical role in neurodegenerative diseases, and that redirecting aberrant immune cell activity in the brain could improve healthy function. We are thus very interested in untangling the role of microglia in Alzheimer’s disease and understanding the underlying biological pathways. Large GWAS studies of Alzheimer's disease (AD) uncovered a number of loci that are associated with the disease; however, the mechanism of their involvement in the pathology is largely unknown. To better understand the role of various AD-associated SNPs, we are looking for large datasets with both genotype and transcriptomics data in various cell types, and the Microglia Genomic Atlas study (MiGA) is an excellent resource of such data for microglia. With this data in hand we plan to perform the following analysis: We will query for a linear relationship between AD risk factors (risk allele loads) and mRNA levels to identify transcriptional signatures associated with each SNP. This analysis will be conducted using plink and R, correcting for covariates such as gender, age, and population structure. We will follow with functional annotation using gene set enrichment analysis to further characterize the impact of risk factors. In addition, we are performing similar analyses of various other cell types (monocytes, macrophages, neurons). Through comparative analysis we are looking to identify cell type-specific mechanisms that might be involved in the disease pathology. Overall, mining data from MiGA and other datasets will help us better understand the mechanism of action of AD risk factors, and aid Alector with biomarker selection strategy as well as antibody screening.
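The linear relationship between risk allele load and mRNA level described in this statement can be illustrated with a minimal sketch. The statement names plink and R as the actual tools; the Python below is only a stand-in under stated assumptions. The data are simulated, and the helper names (`ols`, `simulate`), the covariate set (sex and age only, no population-structure components), and the effect sizes are all hypothetical.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
        + [sum(X[i][a] * y[i] for i in range(n))]
        for a in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [u - f * v for u, v in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def simulate(n=200, effect=0.5, seed=0):
    """Simulated donors: design rows are [intercept, dosage, sex, age].
    All values are made up for illustration, not taken from MiGA."""
    rng = random.Random(seed)
    rows, expr = [], []
    for _ in range(n):
        dosage = rng.choice([0, 1, 2])   # copies of the risk allele
        sex = rng.choice([0, 1])
        age = rng.uniform(50, 95)
        rows.append([1.0, float(dosage), float(sex), age])
        expr.append(2.0 + effect * dosage + 0.3 * sex + 0.01 * age
                    + rng.gauss(0, 0.1))
    return rows, expr


X, y = simulate()
beta = ols(X, y)
# beta[1] is the per-allele change in expression, adjusted for sex and age
print(f"estimated per-allele effect: {beta[1]:.3f}")
```

In practice a dedicated eQTL or regression package would also report standard errors and p-values; this sketch only recovers the adjusted slope.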
Non-Technical Research Use Statement:
Understanding the role of myeloid immune cells in neurodegenerative disorders and cancer is central to Alector. Large datasets of samples from patients with Alzheimer's disease and healthy controls are an invaluable resource for scientists striving to understand the biological mechanisms leading to disease and find ways to cure it. The Microglia Genomic Atlas study is one of the rare resources with a large number of microglia samples that have both gene expression and genetic variation data. By performing statistical analysis of this dataset in combination with data from other cell types, we will gain a better understanding of the mechanisms of action of Alzheimer's disease risk factors, and help Alector develop treatments for patients.
Investigator:
Salas Diaz, Lucas
Institution:
Dartmouth College
Project Title:
Human fetal derived microglia display profound age-related changes in epigenetic and transcriptomic features
Date of Approval:
November 25, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Microglia (MG) are the principal immune cells of the central nervous system, constituting 10% of all cells in the brain. MG cells perform many critical functions involved in normal CNS homeostasis and in neuroinflammation and degenerative diseases. MG cells are derived from embryonic yolk sac progenitors and persist through self-renewal into adulthood. Their behavior and dynamics evolve significantly with aging. As individuals age, microglia undergo profound phenotypic changes. The lack of markers to differentiate between monocyte-derived and resident microglia makes it difficult to assess age-related ontogenetic shifts. Research shows that microglia can be differentiated by unique DNA methylation patterns, forming a "memory trace" of their fetal state, quantified through a fetal cell origin (FCO) score. The FCO score helps determine the epigenetic age of microglia. While generally high in younger individuals, indicating fetal origin, FCO scores decrease significantly in those over 60, suggesting epigenetic remodeling with aging. We hypothesize that aging leads to a large-scale ontogenetic shift in microglial populations, alongside changes in DNA methylation and transcriptomic features. To investigate, microglia samples have been sequenced using RNA-seq, focusing on: 1. DeWitte microglia show strong FCO variation with age; 2. DeWitte also generated RNA-Seq data for N=50 of the same samples; 3. Obtain RNA-Seq information from paired samples, then compare expression in samples stratified by high/low FCO or by age (<60 vs >60); 4. Obtain DEGs for FCO-high cells and rank all DEGs by logFC from high to low expression; 5. Determine the genes in the FCO that discriminate fetal and adult stem cells; 6. Use the fetal gene list and the ranked DEGs from the high/low FCO analysis to perform a gene enrichment analysis, asking whether fetal genes are enriched in the leading edge of overexpressed genes in the FCO high/low ranked list; 7. Determine if fetal genes are enriched in differentially expressed microglial genes; 8. Explore correlations between transcription and methylation for specific genes.
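Steps 3 and 4 of this plan (stratify samples, then rank genes by log fold change) can be sketched as follows. This is an illustration under stated assumptions, not the applicants' pipeline: the gene names and expression values are invented, stratification uses the age cutoff (<60 vs >60) rather than FCO score, the function name `rank_by_log_fc` and the pseudocount are hypothetical choices, and no statistical test for differential expression is performed.

```python
import math
from statistics import mean


def rank_by_log_fc(expression, ages, cutoff=60, pseudocount=1.0):
    """Rank genes by log2 fold change of mean expression, younger vs older.

    expression: dict mapping gene name -> list of per-sample values
    ages: list of donor ages, in the same sample order
    Samples aged exactly `cutoff` fall in neither group (<60 vs >60).
    """
    young = [i for i, a in enumerate(ages) if a < cutoff]
    old = [i for i, a in enumerate(ages) if a > cutoff]
    ranked = []
    for gene, values in expression.items():
        m_young = mean(values[i] for i in young)
        m_old = mean(values[i] for i in old)
        # Pseudocount avoids division by zero for unexpressed genes
        log_fc = math.log2((m_young + pseudocount) / (m_old + pseudocount))
        ranked.append((gene, log_fc))
    # Highest (younger-enriched) first, as in "rank by logFC high to low"
    return sorted(ranked, key=lambda t: t[1], reverse=True)


# Invented toy data: six samples, three hypothetical genes
ages = [34, 45, 58, 67, 72, 88]
expression = {
    "GENE_A": [120, 110, 130, 40, 35, 30],   # higher in younger donors
    "GENE_B": [50, 52, 48, 49, 51, 50],      # unchanged
    "GENE_C": [10, 12, 8, 90, 100, 110],     # higher in older donors
}
ranked = rank_by_log_fc(expression, ages)
for gene, log_fc in ranked:
    print(f"{gene}\t{log_fc:+.2f}")
```

A ranked list of this shape is the usual input to a leading-edge enrichment analysis such as the one described in step 6.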
Non-Technical Research Use Statement:
Microglia (MG) are the principal immune cells of the central nervous system, constituting 10% of all cells in the brain. Aging leads to profound changes in microglial phenotypes and the lack of distinguishing markers for monocyte-derived versus resident MG has made it impossible to discern if normal aging leads to age-related ontogenetic shifts in MG populations. The ontogeny of fetal versus adult stem cell-derived populations can be traced to DNA methylation marks using the fetal cell origin (FCO) score. Using published data, we computed the FCO scores of isolated human MG from different brain regions and subjects of varying ages. The results showed the FCO scores in MGs were highly age-dependent. Microglia from older donors demonstrated significantly lower FCO scores, indicating an epigenetic shift or remodeling in the aging CNS population. Here, we hypothesize that aging is associated with a large-scale ontogenetic shift in the MG populations and that this shift will be accompanied by characteristic changes in DNA methylation signatures and associated transcriptomic features.
Investigator:
Seshadri, Sudha
Institution:
Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, University of Texas Health Sciences Center, San Antonio, TX
Project Title:
Therapeutic target discovery in ADSP data via comprehensive whole-genome analysis incorporating ethnic diversity and systems approaches
Date of Approval:
August 12, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Objective: Utilize ADSP data sets to identify genes & specific genetic variants that confer risk for or protection from Alzheimer disease. Aim 1: Using combined WGS/WES across the ADSP Discovery, Disc-Ext, and FUS Phases, including single nucleotide variants, small insertion/deletions, and structural variants, we will: Aim 1a. Perform whole genome single variant and rare variant case/control association analyses of AD using ADSP and other available data; Aim 1b. Target protective variant identification via association analysis using selected controls within the ADSP data and performing meta analysis across association results based on selected controls from non-ADSP data sets; Aim 1c. Perform endophenotype analyses, including cognitive function measures, hippocampal volume and circulating beta-amyloid, using ADSP data in subjects for which these measures are available. Meta analysis will be conducted across ADSP and non-ADSP analysis results. Aim 2: To leverage ethnically-diverse and admixed populations to identify AD variants we will: Aim 2a. Estimate and account for global and local ancestry in all analyses; Aim 2b. Perform admixture mapping in samples of admixed ancestry; and Aim 2c. Perform ethnicity-specific and trans-ethnic meta-analyses. Aim 3: To identify putative therapeutic targets through functional characterization of genes and networks via bioinformatics and integrative ‘omics analyses, we will: Aim 3a. Annotate variants with their functional consequences using bioinformatic tools and publicly available “omics” data; Aim 3b. Prioritize results, group variants with shared function, and identify key genes functionally related to AD via weighted association analyses and network approaches. Analyses will be performed in coordination with the following PIs: Philip De Jager, Columbia University; Eric Boerwinkle & Myriam Fornage, U of Texas Health Science Center, Houston; Sudha Seshadri, U of Texas, San Antonio; Ellen Wijsman, U of Washington; William Salerno, Baylor College of Medicine. Coordination will involve sharing expertise, analysis plans or analysis results. No individual level data will be shared across institutions.
Non-Technical Research Use Statement:
This proposal seeks to analyze existing genetic sequencing data generated as part of the Alzheimer’s Disease Sequencing Project (ADSP) including the ADSP Follow-up Study (FUS) with the goal of identifying genes and specific changes within those genes that either confer risk for Alzheimer’s Disease or provide protection from Alzheimer’s Disease. Analytic challenges include analysis of whole genome sequencing data, appropriately accounting for population structure across European ancestry, Hispanic, and African American participants, and interpreting results in the context of other genomic data available.
Investigator:
Wainberg, Michael
Institution:
Sinai Health System
Project Title:
Uncovering the causal genetic variants, genes and cell types underlying brain disorders
Date of Approval:
February 3, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We propose a multifaceted approach to elucidate and interpret genetic risk factors for Alzheimer's disease. First, we propose to perform a whole-genome sequencing meta-analysis of the Alzheimer's Disease Sequencing Project with the UK Biobank and All of Us to associate rare coding and non-coding variants with Alzheimer's disease and related dementias. We will explore a variety of case definitions in the UK Biobank and All of Us, including those based on ICD codes from electronic medical records (inpatient, primary care and/or death), self-report of Alzheimer's disease or Alzheimer's disease and related dementias, and/or family history of Alzheimer's disease or Alzheimer's disease and related dementias. We will perform single-variant, coding-variant burden, and non-coding variant burden tests using the REGENIE genome-wide association study toolkit. Second, we propose to develop statistical and machine learning models that can effectively infer (“fine-map”) the causal gene(s), variant(s), and cell type(s) underlying each association we find, as well as associations from existing genome-wide association studies and other Alzheimer's- and aging-related cohorts found in NIAGADS. In particular, we propose to improve causal gene identification by incorporating knowledge of gene function as a complement to functional genomics. For instance, we plan to develop improved methods for inferring biological networks, particularly from single-cell data, and integrate these networks with the results of the non-coding associations from our first aim to fine-map causal genes. To fine-map causal variants and cell types, we plan to integrate the associations from our first aim with single-nucleus chromatin accessibility data from postmortem brain cohorts to simultaneously infer which variant(s) are causal for each discovered locus and which cell type(s) they act through.
Non-Technical Research Use Statement:
We have a comprehensive plan to understand and explain the genetic factors that contribute to Alzheimer's disease. Our approach involves two main steps. First, we'll analyze genetic information from large research databases to identify rare genetic changes associated with Alzheimer's and related memory disorders. We'll look at both specific changes in genes and other parts of the genetic code. We'll use data from different studies and combine them to get a clearer picture. Second, we'll create advanced computer models that can help us figure out which specific genes, genetic changes, and cell types are responsible for these associations. This will help us pinpoint the most important factors contributing to Alzheimer's disease. We'll also analyze data from previous studies to build a more complete understanding of these genetic links.
Investigator:
Yang, Jingjing
Institution:
Emory University
Project Title:
Novel statistical methods for integrating transcriptomic and proteomic data in GWAS
Date of Approval:
December 2, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The objective of the proposed project is to derive novel statistical methods to integrate multi-omics data and pathology data in genome-wide association studies (GWAS) for studying complex phenotypes, with the goal of prioritizing genetic variants and identifying causal genes. First, we will develop novel statistical methods to integrate summary-level omics data and pathology data of diverse populations with GWAS data to prioritize risk genes. Second, we will apply our tools to publicly available xQTL data and the ADSP GWAS data. Third, we will also use the ADSP GWAS summary data to conduct causal analysis of other aging-related phenotypes and AD dementia. We will first develop novel statistical methods to integrate summary-level xQTL data of multiple populations with GWAS data to test gene associations with complex human diseases. We are interested in studying all complex phenotypes that were profiled for the ADSP samples, especially Alzheimer’s disease (AD) and AD-related complex phenotypes. In particular, our lab has access to the ROS/MAP multi-omics data shared by the Rush Alzheimer’s Disease Center (http://www.radc.rush.edu/), and to GTEx data. All samples in the ROS/MAP study are well-characterized with extensive complex phenotypes profiled, including clinical diagnosis of AD, AD-related complex phenotypes, and psychological phenotypes. GTEx provides transcriptomic data of multiple human tissues. We will leverage multiple omics data profiled from the ROS/MAP study and transcriptomics data profiled from GTEx to learn SNP-omics relations, and then integrate such learned relationships with ADSP data to identify risk genes of complex diseases. We will also validate our findings by using omics and pathology data in the requested data sets. The purpose of using ADSP data is to increase sample size for testing our derived methods for functional genetic association studies of complex phenotypes, studying the genetic etiology of AD and AD-related phenotypes, and validating our findings by using the omics data from Rush Alzheimer's Disease Center. We are not limited to studying AD only; we are open to studying any complex phenotypes that are profiled for ADSP samples.
Non-Technical Research Use Statement:
This proposed project is to develop novel statistical methods to integrate summary-level multi-omics data (such as transcriptomic, proteomic, and epigenetic data) and pathology data in genome-wide association studies (GWAS) of complex phenotypes, with the goal of identifying causal genes. i) We will develop novel statistical methods for integrating summary-level omics data and pathology data with GWAS data. ii) We will apply our tools to publicly available summary-level omics data, omics data from the ROS/MAP study, and ADSP GWAS data for studying AD and AD-related phenotypes. iii) We will conduct causal inference to test the causal relationship between AD and other aging-related phenotypes. We propose to test our methods on the applied genomic analysis data to study complex phenotypes that are profiled for ADSP, including AD, AD-related pathology traits, and related psychological disorders.
Investigator:
Zhao, Zhongming
Institution:
University of Texas Health Science Center at Houston
Project Title:
AIM-AI: an Actionable, Integrated and Multiscale genetic map of Alzheimer's disease via deep learning
Date of Approval:
March 27, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Objectives: The objective of our study is to advance our understanding of the genetic basis of Alzheimer’s Disease (AD) through the analysis of comprehensive genomic datasets such as Whole Exome Sequencing (WES), Whole Genome Sequencing (WGS), single-nuclei RNA sequencing, and Genome-Wide Association Studies (GWAS), as well as the related phenotype. We aim to identify genetic variants that are integral to the development and progression of AD. Study Design: Our approach involves a detailed multi-omics analysis focusing on both coding and non-coding regions within these datasets. We will develop new analytical variables from existing data, ensuring that our research adheres to the established data use limitations and contributes meaningfully to the field of genetic research in AD. Analysis Plan: The plan centers on investigating the correlation between genetic variants and AD, exploring how these variants influence the disease at a genetic level. We will employ cutting-edge computational methods to analyze interactions between these genetic markers and their potential role in AD pathogenesis. The integration of data from multiple sources will be carefully executed to maintain compliance with data use agreements, emphasizing the scientific exploration of AD.
Non-Technical Research Use Statement:
Our research is dedicated to unraveling the genetic components of Alzheimer’s Disease. By analyzing genetic sequences and variations through various genomic datasets, we seek to deepen the scientific understanding of how these genetic elements contribute to AD. The outcomes of this study will be shared with the public, enhancing general knowledge of Alzheimer’s Disease and supporting the global research community in its ongoing efforts to decode this complex condition.
Investigator:
Zhi, Degui
Institution:
University of Texas Health Science Center at Houston
Project Title:
Genetics of deep-learning-derived neuroimaging endophenotypes for Alzheimer's Disease
Date of Approval:
February 6, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Alzheimer’s disease (AD) affects 5.6 million Americans over the age of 65 and exacts tremendous and increasing demands on patients, caregivers, and healthcare resources. Our current understanding of the biology and pathophysiology of AD is still limited, hindering advances in the development of therapeutic and preventive strategies. Existing genetic studies of AD have had some success, but these explain only a fraction of the overall disease risk, suggesting opportunities for additional discoveries. The proposed project will leverage existing neuroimaging and genetic data resources from the UK Biobank, the Alzheimer’s Disease Sequencing Project (ADSP), the Alzheimer’s Disease Neuroimaging Initiative (ADNI), and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium, and will be conducted by a multidisciplinary team of investigators. We will derive AD endophenotypes from neuroimaging data in the UK Biobank using deep learning (DL). We will identify novel genetic loci associated with DL-derived imaging endophenotypes and optimize the co-heritability of these endophenotypes with AD-related phenotypes using UK Biobank genetic data. We will leverage resources and collaborations with AD Consortia and the power of DL-derived neuroimaging endophenotypes to identify novel genes for Alzheimer’s Disease and AD-related traits. Also, we will develop DL-based neuroimaging harmonization and imputation methods and distribute implementation software to the research community. We expect to discover new genes relevant to AD, which may lead to an understanding of the molecular basis of AD and to potential new treatments.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) exacts a tremendous burden on patients, caregivers, and healthcare resources. Our current understanding of the biology of AD is still limited, hindering advances in the development of treatment and prevention. Existing genetic studies of AD have had some success, but more studies are needed. The proposed project will leverage existing neuroimaging and genetic data resources from the UK Biobank, the Alzheimer’s Disease Sequencing Project (ADSP) and other consortia and will be conducted by a multidisciplinary team of investigators. We will derive new AD-relevant intermediate phenotypes from neuroimaging data using deep learning (DL), an AI approach. We will identify novel genetic loci associated with these phenotypes. Also, we will develop imaging harmonization and imputation methods and distribute implementation software to the research community. We expect to discover new genes relevant to AD, which may lead to an understanding of the molecular basis of AD and to potential new treatments.
Investigator:
Zhou, Weichen
Institution:
University of Michigan
Project Title:
Explore the functional impact of transposable elements in Alzheimer’s disease and related dementias
Date of Approval:
September 4, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Explore somatic transposable elements and their Alzheimer's disease-related patterns using genomic and phenotypic data from large cohorts: In order to explore the impact of transposable elements in Alzheimer's disease, we propose to conduct a systematic survey in the available large cohorts. The ADSP dataset in NIAGADS (Accession No. NG00067) includes 16,906 whole-genome sequences and 20,504 whole-exome sequences for case-control and family-based studies of Alzheimer's disease from diverse populations, which is a perfect resource to leverage in this project. With the support of the Michigan Alzheimer's Disease Center, we will request access to NIAGADS. To detect somatic transposable elements in the ADSP dataset, we will employ established computational pipelines to resolve the transposable elements in the sequencing data: MELT and xTEA for WGS, and SCRAMble for WES. Parameters in these tools, for instance the calling threshold of supporting reads, will be adjusted to accommodate the detection of somatic transposable elements present in cells at low frequency. To exclude potential germline transposable elements, we will leverage a master set of polymorphic transposable elements from diverse populations, based on our previous projects at the Human Genome Structural Variation Consortium, together with the case-control information provided by ADSP. We aim to summarize a spectrum of somatic transposable elements that would be Alzheimer's disease-relevant along with various clinical and phenotypic information. To build Alzheimer's disease-related genetic patterns, we will implement Mutect2 (GATK) and Strelka2 to discover SNVs from WGS and WES data and link them with transposable elements in the same haplotype. After obtaining this set of patterns, we will collect phenotypic information from the ADSP dataset to conduct family-based association analysis and gene-burden analysis. RegulomeDB will be used to annotate the non-coding functional impact and regulatory changes for these Alzheimer's disease-related patterns.
Non-Technical Research Use Statement:
This project seeks to explore the connection between somatic transposable elements in the human genome and Alzheimer’s disease and related dementias. It will leverage large-scale datasets to extensively explore genome-wide transposable elements and then stratify Alzheimer’s disease-relevant ones by using the rich clinical information from the cohorts. Further analysis pipelines will be built on the results of the proposed project to investigate the functional impact of these transposable elements on Alzheimer’s disease, which would improve the understanding of the genetic causes of Alzheimer’s disease and related dementias.

Total number of samples: 151

Female	70	53.4%
Male	61	46.6%
Unknown	0

Neurological and Psychiatric Brain Disorders
Control	48	31.8%
Case	83	55.0%
