NG00119 - Health and Retirement Study Genotype Data 2006-2012

Description

These data include a total of 18,916 subjects from the Health and Retirement Study genotyped on Illumina HumanOmni2.5 arrays. Data files also include data imputed using the 1000 Genomes and the Haplotype Reference Consortium (HRC) reference panels.

Respondents who consented to the saliva collection in 2006 (Phase 1), 2008 (Phase 2), 2010 (Phase 3), or 2012 (Phase 4) have been genotyped using Illumina Omni genotyping platforms. The Phase 1 and 2 participants were genotyped together, and were imputed together previously (see dbGaP accession number phs000428.v1.p1). The Phase 3 participants were subsequently genotyped, and were imputed together with Phases 1-2 (dbGaP accession number phs000428.v2.p2). An additional 3,303 Phase 4 participants were genotyped in 2015, and were imputed together with Phases 1-3, yielding a total of 18,923 unique HRS participants: 15,620 from Phases 1-3, and 3,303 from Phase 4. After QC, there were a total of 18,916 unique HRS participants included in this dataset.
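
The participant counts quoted above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the variable names are hypothetical, and the numbers are copied from the paragraph above rather than read from any released file.

```python
# Counts copied from the description above (not read from the data files).
phases_1_3 = 15_620   # unique participants from Phases 1-3
phase_4 = 3_303       # additional Phase 4 participants genotyped in 2015
after_qc = 18_916     # unique participants remaining after QC

# Total unique participants before QC, as stated in the description.
genotyped_total = phases_1_3 + phase_4

# Number of participants dropped by QC, implied by the two totals.
removed_in_qc = genotyped_total - after_qc

print(f"Before QC: {genotyped_total:,}")
print(f"Removed in QC: {removed_in_qc}")
```

Under these figures, the pre-QC total matches the stated 18,923, and the difference from the post-QC total implies that a small number of participants were excluded during QC.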

APOE phenotype data for HRS subjects is available at NG00132 – Health and Retirement Study (HRS) APOE and Serotonin Transporter Alleles, and DNA methylation data for HRS subjects is available at NG00153 – Health and Retirement Study (HRS) DNA Methylation. To obtain subject ID mapping between HRS datasets, please submit a Genetic Data Cross-Reference Request Form on the HRS website.

Additional information can be found on the HRS website: https://hrs.isr.umich.edu/data-products/genetic-data

Sample Summary per Data Type

Sample Set	Accession	Data Type	Number of Samples
HRS-All Phases	snd10027	GWAS, 1000G Imputation, HRC Imputation	19,004
HRS-Phase 4	snd10028	GWAS	3,475

Available Filesets

Fileset	Accession	Latest Release	Description
HRS GWAS	fsa000020	NG00119.v1	GWAS Illumina HumanOmni2.5
HRS Imputation	fsa000021	NG00119.v1	1000G Imputation data, HRC Imputation data

View the File Manifest for a full list of files released in this dataset.

Data Dictionary Files

The HRS is a nationally representative sample with oversamples of African-American and Hispanic populations. The target population for the original HRS cohort includes all adults in the contiguous United States born during the years 1931–1941 who reside in households. HRS was subsequently augmented with additional cohorts in 1993 and 1998 to represent the entire population 51 and older in 1998 (b. 1947 and earlier). Since then, the steady-state design calls for refreshment every six years with a new six-year birth cohort of 51–56 year olds. This was done in 2004 with the Early Baby Boomers (EBB) (b. 1948–53) and in 2010 with the Mid Boomers (MBB) (b. 1954–59).

Sample Set	Accession Number	Number of Subjects	Number of Samples
HRS - All Phases	snd10027	15,706	15,706
HRS - Phase4	snd10028	3,366	3,475

Consent Level	Number of Subjects
GRU-IRB-PUB-NPU	18,916

Total number of approved DARs: 38

Investigator:
Belloy, Michael
Institution:
Washington University in St Louis
Project Title:
Elucidating sex-specific risk for Alzheimer's disease through state-of-the-art genetics and multi-omics
Date of Approval:
March 31, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
• Objectives: In this project, we seek to holistically investigate the genetic and molecular drivers of sex dimorphism in Alzheimer’s disease across ancestries.
• Study design: This study integrates large-scale population genetics with multi-omics and endophenotype analyses. We are integrating all data available from ADGC and ADSP, together with other data from AMP-AD and biobanks such as UKB, FinnGen, and MVP to conduct large-scale multi-ancestry GWAS, rare-variant gene aggregation analyses, QTL studies, PWAS, TWAS, etc. We also particularly focus on X chromosome association studies. The study design also interrogates interactions with ancestry, hormone exposures, and with APOE*4, as well as comparisons to non-stratified GWAS/XWAS of Alzheimer’s disease. Further, we will also employ genetic correlation analyses, Mendelian randomization, colocalization, and pleiotropy analyses to interrogate overlap with other complex traits to better understand the mechanisms underlying sex dimorphism in Alzheimer’s disease.
• Analysis plan, including the phenotypic characteristics that will be evaluated in association with genetic variants: Our phenotypes will include Alzheimer’s disease risk, conversion risk, various endophenotypes (including amyloid/tau biomarkers, brain imaging metrics, etc.) as well as molecular traits. As noted above, we will conduct large-scale multi-ancestry GWAS, XWAS, rare-variant gene aggregation analyses, QTL studies, PWAS, TWAS, etc. Specific aims include interrogating these questions and analyses on (1) the autosomes, (2) the X chromosome, and (3) leveraging sex-stratified QTL studies to drive discovery of risk genes.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) manifests itself differently across men and women, but the genetic and molecular factors that drive this remain elusive. AD is the most common cause of dementia and to this day remains largely untreatable. It is thus crucial to study the genetics of AD in a sex-specific manner, as this will help the field gain important insights into disease pathophysiology, identify novel sex-specific risk factors relevant to personalized genetic medicine, and uncover potential new AD drug targets that may benefit both sexes. This project uses large-scale genomics and multi-omics to elucidate novel sex-agnostic and sex-specific AD risk genes. We will interrogate sex dimorphism for AD risk on the autosomes and the sex chromosomes. We will similarly interrogate sex dimorphism in the genetic regulation of gene expression and protein levels, which we will integrate with genetic risk for Alzheimer’s disease to further discover risk genes. Throughout, we will also interrogate how sex-specific risk for AD interacts with hormone exposures, ancestry, and the APOE*4 risk allele.
Investigator:
Benjamin, Daniel
Institution:
NBER and UCLA
Project Title:
How health-relevant outcomes are influenced by genetics.
Date of Approval:
May 16, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
We will use the HRS data to pursue two complementary strategies. One is the discovery of particular genetic polymorphisms associated with social-science outcomes. Because the effect of an individual genetic polymorphism on the outcome is likely to be very small, the HRS sample is too small, taken alone, to be used to discover new associations. Hence, we will pursue this strategy with HRS data in conjunction with other datasets that we have organized in the Social Science Genetic Association Consortium (SSGAC; www.thessgac.org). Our second strategy focuses on exploiting the uniquely rich social-science data in the HRS. We will conduct analyses that will shed light on the genetic architecture of a range of social-science outcomes. We will apply statistical methods that use the information contained in the dense SNP data taken as a whole and are thus well-powered in a sample size such as that of the HRS. Our specific aims are: 1. To incorporate data from the HRS into ongoing meta-GWAS efforts from the SSGAC for a range of social-science outcomes, such as educational attainment, and personality. 2. To continue to include HRS in the future releases of the Polygenic Index (PGI) Repository. PGIs (aka polygenic scores) are summaries of a person's genetic predisposition to a particular trait. HRS was included in the first release of the Repository, for which we created PGIs for 47 phenotypes in 11 datasets, which were returned to the datasets to be shared with users according to the datasets’ own data sharing procedures. We will regularly update the existing PGIs and add new phenotypes as larger GWAS and better methodologies become available. Details on the Repository can be found in Becker et al. (2021, Resource profile and user guide of the Polygenic Index Repository. Nat. Hum. Behav.). 3. To use the HRS genotype data to conduct polygenic prediction analyses for a range of social-science traits. 
Besides the direct interest in assessing the degree of predictive power in PGIs, we will examine how these PGIs interact with environmental factors to influence life outcomes. 4. To estimate heritability and genetic correlations for social science traits in an older population.
Non-Technical Research Use Statement:
We will use HRS data to explore the genetic architecture of social-science outcomes. To do so, we will either use HRS data together with other datasets to identify specific genetic variants associated with these outcomes, or analyze the aggregate effect of all genetic variants in HRS alone using heritability analyses and polygenic indexes (PGIs). PGIs are summaries of a person's known genetic predisposition to a particular trait. We will use PGIs to examine the pathways underlying the relationship between genetic variants and outcomes of interest, including analyses of how genes and environment interact. We will also include HRS in future releases of the PGI Repository, an initiative that makes PGIs for a wide range of traits available in a number of datasets that may be useful to social scientists (https://www.thessgac.org/pgi-repository). HRS was included in the first release of the Repository, and we wish to continue to update the HRS PGIs and add PGIs for new phenotypes as more data or better methodologies become available.
Investigator:
Blue, Elizabeth
Institution:
University of Washington
Project Title:
Genetic modifiers of Alzheimer's disease
Date of Approval:
July 15, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
The objective of the proposed research is to identify new genes involved in Alzheimer's disease (AD) by identifying alleles contributing to increased risk for or protection against the disease, providing insight as to why individuals with risk factors develop or escape from AD, and ultimately identifying potential avenues for therapeutic approaches and prevention of the disease. Our study design will use phenotypic (ex., AD diagnosis, age-at-onset, APOE genotype) and genomic data (ex., WGS, array, imputed genotypes) from NIAGADS studies to investigate genotype-phenotype associations. Strategies include association testing and haplotype- and family-based approaches, including estimates of relatedness and population genetics analyses as needed to perform the association testing (ex., control for population structure). NIAGADS data will not be used to investigate individual identity. Consent type and other Data Use Limitations (DUL) for each study will be respected in all analyses. Data from an individual with disease-specific consent will not be used in analyses outside of that restriction, including indirect uses such as imputation reference panels or variant summary statistics. When an individual’s DUL prohibits investigation of population genetics, population history or related issues, their data will be excluded from studies that address those issues. We intend to publish or otherwise broadly share any findings from this study with the scientific community. As such, genomic summary results from datasets with a “sensitive” designation will only be shared through publications to support the study’s conclusions and through NIH-funded data repositories which maintain restricted access (ex., NIAGADS). Data from NIAGADS may be combined with non-NIAGADS data from the same or other studies (obtained from dbGaP or other sources), to improve the power for novel genetic discoveries, while respecting the consent of all participants.
We expect that this activity creates no additional risks to participants. Data will be shared only among Internal Collaborators at the University of Washington. We do not plan to collaborate with External Collaborators at other institutions.
Non-Technical Research Use Statement:
We propose to identify new genes involved in Alzheimer's disease (AD) by identifying alleles contributing to increased risk for or protection against the disease, providing insight as to why individuals with risk factors develop or escape from AD, and ultimately identifying potential avenues for therapeutic approaches and prevention of the disease. We will combine phenotype and genotype data using association testing and haplotype- and family-based approaches to identify associations and refine those signals with fine-mapping tools and external data.
Investigator:
Brown, Rebecca
Institution:
University of Pennsylvania
Project Title:
Trajectories of Cognition in Middle Age: Implications for Alzheimer's Disease and Related Dementias in the U.S.
Date of Approval:
March 16, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Polygenic risk scores (PRS) for dementia and aging-related conditions are known to be associated with cognitive outcomes in older age, but little is known about their relationship to mid-life cognitive decline. We plan to use raw genetic data to derive novel PRS from GWAS sources (including Lambert Alzheimer’s disease PRS, with and without APOE; a PRS for coronary artery disease; a longevity PRS) and evaluate their predictive accuracy for cognitive outcomes in middle age relative to existing PRS. Specifically, we want to create a measure of genetic risk associated with three outcomes: age-related cognition; telomere shortening; and methylation/epigenetic clocks. To achieve this, we will combine the HRS Genotype data with other HRS datasets (Harmonized Cognitive Assessment Protocol (HCAP) (2016 Early V1.0); 2008 Telomere Data; Epigenetic Clocks; 2016 Venous Blood Study (VBS)) to which we already have access. Once we have approved NIAGADS genomics data access, we will additionally request access to the HRS-NIAGADS Cross-Reference File (Genotype Data v3, 2006-2012) to link the genomics and HRS datasets. In our ongoing analyses, we would like to update our PRS models by incorporating the most recent GWAS summary statistics. For Alzheimer's disease, this requires access to the full summary statistics from the Kunkle et al., 2019 GWAS. We also would like access to the full summary statistics from the Farrell et al., 2024 GWAS and the Rajabli et al., 2025 GWAS to identify genetic modifiers of tauopathy by comparing progressive supranuclear palsy GWAS results with cross-ancestry Alzheimer’s disease GWAS results.
Non-Technical Research Use Statement:
There is evidence to suggest that differences in people’s genetic code might contribute to differences in age-associated cognitive changes. For example, some people develop memory problems in middle age, and other people experience no changes in memory. Researchers think this may be partially explained by differences in people’s genetic code. We might be able to predict who could experience age-related cognitive changes based on their DNA sequence. If we know which people have experienced memory problems, we can see what their DNA has in common compared to the DNA of people who don’t have any memory problems. Then, we can test this by looking at the DNA of a different group of people; evaluating if their DNA has the same things in common as the group of people with memory problems (vs. no memory problems); and predicting whether they will develop memory problems. The long-term goal of this work is to help identify people who might be at risk for developing memory problems and help them access preventative care or interventions to minimize future cognitive impairment.
Investigator:
Chen, Jingchun
Institution:
University of Nevada, Las Vegas
Project Title:
Classification of Alzheimer’s disease with Genetic Data and Artificial Intelligence
Date of Approval:
November 14, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Alzheimer's disease (AD) is the most common cause of dementia, accounting for 60% to 80% of cases that affect over six million people in the United States. The disease gradually progresses from mild cognitive impairment (MCI) to dementia, which takes more than a decade. Identifying individuals who have a high risk of AD earlier is essential for AD prevention and intervention. As the heritability of AD is high (up to 79%), genetic data should be a powerful means of identifying individuals at high risk. Indeed, the polygenic risk score (PRS), designed to estimate individual genetic liability by integrating large GWAS summary statistics and individual genotype data, has been shown to be promising for AD risk prediction (AUCs up to 84%). However, the prediction accuracy using a single PRS is still not sufficient for MCI and AD classification in clinical practice. We hypothesize that convolutional neural network (CNN) models can improve the classification of AD and MCI by integrating multiple PRSs from multiple traits, multi-omics data (genotyping data, scRNA-seq), clinical data, and imaging data. The objective is to develop advanced AI algorithms and build data-driven models for disease risk assessment, identifying individuals at high risk for MCI and AD earlier. Our long-term goal is to develop and validate a prediction model that can be translated into clinical practice. Our CNN model has recently shown improved performance for AD with PRSs from multiple traits (AUC 92.4%). We want to extend our approach to predicting AD and MCI in different ethnic groups and validate the results with independent datasets. To this end, we would like to apply for multi-omics data in NG00067.v9 from https://dss.niagads.org/datasets/ng00067/. With extensive experience in genetic studies on complex disorders and disease modeling, we are confident that we will achieve the specified goals and promote the integration of genetic data with AI algorithms, facilitating data-driven, personalized care of AD.
We expect to finish this study within 2 years with publication and grant application. We have IRB approval and will follow the rules for data sharing and acknowledgment.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) is the most common form of dementia and usually develops from mild cognitive impairment to dementia. There is currently no treatment to slow the progression of this disorder, but earlier identification of individuals at higher risk may be critical to preventing the disease. We propose a new approach to create models for classification of AD and MCI with artificial intelligence and genetic data. This study will have significant value in personalized medicine for AD risk assessment, classification, and earlier intervention. We have no planned collaboration with researchers outside Cleveland Clinic in the current analytic plans.
Investigator:
Conley, Dalton
Institution:
Princeton University
Project Title:
The sociogenomics of human phenotypes: How social and biological factors jointly shape individual behaviors and outcomes related to socioeconomic attainment and demographic outcomes.
Date of Approval:
October 24, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Genetics has been increasingly integrated into research on sociological topics such as fertility, well-being, risk-taking, and longevity (Mills and Tropf, 2020). While the existing literature has established heritability of a set of sociopsychological, behavioral, and health outcomes, estimates from different cohorts or study designs (e.g., twin and family studies, GWAS, and SNP heritability) differ substantially. Missing/hidden heritability prompts discussion on the optimal methodological approach. We seek to understand how the use of family designs impacts the validity of genetic effect estimates. Specifically, we will compare the performance of classic and family-based GWAS and their downstream polygenic scores (PGSs) in predicting a rich set of sociopsychological, behavioral, and health outcomes. Additionally, we will explore to what extent family-based GWAS results yield increased portability to diverse and admixed ancestries. Another important area is gene-environment interactions (G×E). G×E research has employed diverse approaches (Miao et al., 2022), such as differential heritability/variance, genetic correlation, and mean/variance PGS (Johnson et al., 2022) analyses. Despite this, limitations remain; many (early) G×E studies fail to properly control for potential confounders (Keller, 2014). Moreover, which G×E mechanisms (e.g., outcome moderation (i.e., Domingue et al. 2020) vs. variability moderation) underlie the effects is poorly understood. In addition, little is known about the extent to which social changes serve to modify associations between genetic ancestry and self-identified/reviewer-classified race. We aim to apply recent methodological advances to the multiple research gaps described above. We will examine a rich set of variables, including SES, early-life experiences, physical development, mental health, medical conditions, and mortality. This work will be collaborative with Professor Sam Trejo, also of Princeton University (Sociology).
Non-Technical Research Use Statement:
This project aims to increase understanding of how genetic and socioenvironmental factors interactively affect social, behavioral, and health outcomes, with an eye towards gaps in the research literature. For one thing, existing efforts at quantifying the genetic effects on individual behaviors/outcomes have sometimes arrived at substantially different estimates. For another, many existing G×E studies have been improperly designed to answer their intended research question, and few of them have specifically examined which G×E mechanisms explain the observed patterns. This project can help us better understand the biosocial underpinnings of a rich set of individual outcomes and inform policies aimed at reducing social/health disparities. Our research improves the development of tools that identify individuals for early intervention, suggests how the DNA characteristics of a population may influence the effectiveness of health policies, and facilitates evidence-based policymaking that considers not only socioenvironmental factors but also their interactions with one’s genes.
Investigator:
Crimmins, Eileen
Institution:
University of Southern California
Project Title:
GWAS and Systems Biology Analyses for Aging-Related Conditions: Longevity and Disease
Date of Approval:
March 31, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Research Use Statement: Our project will rely on phenotype and genotype data from the Health and Retirement Study (HRS), a nationally representative longitudinal study of the older adult population in the U.S. This is an ongoing study. Data we have been using beginning in 2016 are from an approved application through dbGaP, from 15,507 HRS participants and include single nucleotide polymorphism (SNP) data on just under 2.5 million markers, imputed data on approximately 21 million DNA variants, and phenotype data on disease incidence and prevalence, functioning, biomarkers, mortality, and environmental and behavioral covariates. Our request from NIAGADS would provide us with a genetic sample additional to what we have been using, with data on a further 3,409 participants (yielding N=18,916 total with harmonized genetic data through NIAGADS). Data usage will not create additional risk to participants. Aims of the project are to (1) Identify genetic networks and pathways that influence human aging, disease, functioning, and longevity; (2) Develop predictive models of aging-related health outcomes using information from gene networks; and (3) Examine how social and environmental conditions interact with genes within these aging-related gene networks. We will implement statistical models to test for associations between genetic variants and the same phenotype data. In moving forward with the additional samples, we will use the HRS genome-wide data to examine genetic signatures of healthspan, lifespan, and cognitive aging. Using these genetic signatures, we plan to (i) run pathway enrichment analysis to identify influential biological pathways, (ii) use them for predictive modeling of morbidity/mortality risk and cognitive aging, and (iii) incorporate information from social and behavioral data to examine GxE interactions. The overall goal of the project is to identify mechanistic gene and environment networks that contribute to aging acceleration or deceleration.
Non-Technical Research Use Statement:
Non-Technical Summary: Aging is the largest risk factor for morbidity and mortality. Previous research using animal models or case-control studies of centenarians has suggested that variations in the pace of aging may be partially explained by genetic and genomic differences. However, few genetic regulators of human lifespan and healthspan have been identified. Furthermore, there is reason to suggest that the pace of aging may be a polygenic trait, for which multiple genes form complex networks that collectively influence aging and longevity phenotypes. These complex genetic networks may further interact with exogenous factors, causing variation to arise in health outcomes under diverse environments. The goal of this project is to use advanced statistical modeling techniques to understand how gene-gene and gene-environment interactions influence longevity and aging-related conditions.
Investigator:
Cruchaga, Carlos
Institution:
Washington University School of Medicine
Project Title:
The Familial Alzheimer Sequencing (FASe) Project
Date of Approval:
January 21, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The goal of this study is to identify new genes and mutations that cause or increase risk for Alzheimer disease (AD), as well as protective factors. Individuals and families were selected from the Knight-ADRC (Washington University) and the NIA-LOAD study. Only families with at least three first-degree affected individuals were included. Families with pathogenic variants in the known AD or FTD genes, or in which APOE4 segregated with disease, were excluded. At least two cases and one control were selected per family. Cases had an age at onset (AAO) after 65 years and controls had a later age at last assessment than the latest AAO within the family. Whole exome sequencing (WES) and whole genome sequencing (WGS) data were generated for 1,235 individuals (285 families), which together with data from our collaborators and the ADSP family-based cohort (3,449 individuals and 757 families) will provide enough statistical power to identify new genes for AD. Dr. Tanzi (Harvard Medical School) will provide WGS from 400 families from the NIMH Alzheimer disease genetics initiative study. We will perform single variant and gene-based analyses to identify genes and variants that increase risk for disease in AD families. Single variant analysis will consist of a combination of association and segregation analyses. We will run family-based gene-based methods to identify genes that show an overall enrichment of variants in AD cases. We will also look for protective and modifier variants. To do this we will identify families loaded with AD cases that also include individuals with a high burden of known risk variants but who do not develop the disease (escapees). We will use the sequence data and the family structure to identify variants that segregate with the escapee phenotype. The most promising variants and genes will be replicated in independent datasets (ADSP case-control, ADNI, Knight-ADRC, NIA-LOAD).
We will perform single variant and gene-based analyses to replicate the initial findings, and survival analysis to replicate the protective variants. We will select the most promising variants/genes for functional studies.
Non-Technical Research Use Statement:
Family-based approaches led to the identification of disease-causing Alzheimer’s Disease (AD) variants in the genes encoding APP, PSEN1 and PSEN2. The identification of these genes led to the Aβ-cascade hypothesis and to the development of drugs that target this pathway. Recently, we have identified rare coding variants in TREM2, ABCA7, PLD3 and SORL1 with large effect sizes for risk for AD, confirming that rare coding variants play a role in the etiology of AD. In this proposal, we will identify rare risk and protective alleles using sequence data from families densely affected by AD. We hypothesize that these families are enriched for genetic risk factors. We already have sequence data from 695 families (2,462 individuals), that combined with the ADSP and the NIMH dataset will lead to a dataset of more than 1,042 families (4,684 individuals). Our preliminary results support the flexibility of this approach and strongly suggest that protective and risk variants with large effect size will be found, which will lead to a better understanding of the biology of the disease.
Investigator:
Fan, Maoyong
Institution:
Ball State University
Project Title:
How does stock market fluctuations affect senior citizens' portfolio choices?
Date of Approval:
December 18, 2024
Request status:
Closed
Research use statements:
Technical Research Use Statement:
Objectives: exploring the causal effect of stock market fluctuations on senior citizens' portfolio choices using the Health and Retirement Survey (HRS). Study design: We investigate how stock market ups and downs affect people's investment decisions. Then, we examine how the relationship between individual portfolio choices and stock market returns is affected by socioeconomic determinants and genetic markers associated with risk-taking behavior. Our goal is to analyze genetic information, stock market fluctuations, and portfolio choices and determine how genetic markers impact seniors' financial decision-making under different market conditions. Analysis plan: We use HRS to construct variables that reflect individuals' financial assets, including stocks, bonds, and other investments. We collect data on stock market returns from CRSP and COMPUSTAT, and create national-level and state-level market returns. By comparing individuals' portfolio choices in different years (corresponding to different market returns) or comparing people's portfolio choices across states (corresponding to different state-level returns), we examine how each individual's portfolio choices change as the stock market fluctuates. We then examine how education or cognition (represented by education- and cognition-related genetic variants) and risk preferences (risky behavior-related genetic variants) affect seniors' financial decision-making under different market conditions. For example, we use an instrumental variable (IV) approach to isolate random variation in financial literacy and education and estimate causal effects of education on portfolio choices among older adults. The IVs are constructed from individuals’ genetic variants, either key single nucleotide polymorphisms (SNPs) or the polygenic score (PGS).
Non-Technical Research Use Statement:
The objective of this study is to link genetic information, stock market shifts, and portfolio choices to understand how genetic markers affect senior citizens' financial decisions under varying market circumstances using the Health and Retirement Surveys (HRS). The study is designed to scrutinize the effect of market volatility on investment choices, and how this connection is further impacted by socioeconomic factors and genetic markers linked to financial literacy and risk-taking behavior. The findings can be used to inform policy and financial education initiatives that target senior citizens and promote healthy financial decision-making. Additionally, the study can highlight the importance of considering genetics in financial decision-making and its potential implications for financial advisors and investment managers.
Investigator:
Farrer, Lindsay
Institution:
Boston University
Project Title:
ADSP Data Analysis
Date of Approval:
June 25, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
As part of the Consortium for Alzheimer's Sequence Analysis (CASA: NIA grant UF1-AG047133), we plan to analyze whole exome and whole genome sequence data generated from subjects with Alzheimer's disease (AD) and elderly normal controls. These data will be generated by the National Human Genome Research Institute Large-Scale Sequence Program. The goal of the planned analyses is to identify genes that have alleles that protect against or increase susceptibility to AD. We will evaluate variants detected in the sequence data for association with AD to identify protective and susceptibility alleles using the whole exome case-control data. We will also evaluate sequence data from multiplex AD families to identify variants associated with AD risk and protection, and evaluate variant co-segregation with AD. The family data will be whole genome data. The family-based data will be used to inform the case-control analysis and vice versa. We will also focus on structural variants (insertion-deletions, copy number variants, and chromosomal rearrangements). Evaluation of structural variants will involve both whole genome and whole exome data. Structural variants will be analyzed with single nucleotide variants detected and analyzed in the case-control and family-based data.
Non-Technical Research Use Statement:
We are attempting to identify all the inherited elements that contribute to Alzheimer's disease risk. To do this we will analyze DNA sequence data from subjects with Alzheimer's disease and elderly subjects who are cognitively normal. The sequence data from these two groups will be compared to identify differences that contribute to the risk of developing Alzheimer's disease or that protect against Alzheimer's disease. These DNA differences can be at a single site in the genetic code, or can span multiple sites, changing the copy number of DNA sequences. Both types of genetic variants will be examined.
Investigator:
Greicius, Michael
Institution:
Stanford University School of Medicine
Project Title:
Examining Genetic Associations in Neurodegenerative Diseases
Date of Approval:
March 31, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We are studying the effects of rare (minor allele frequency < 5%) genetic variants on the risk of developing late-onset Alzheimer’s Disease (AD). We are interested in variants that have a protective effect in subjects who are at an increased genetic risk, or variants that lead to multiple dementias. Our aim is to identify any genetic variants that are present in the “case” group but not the “AD control” groups for both types of variants. The raw data we receive will be annotated to identify SNP locations and frequencies using existing databases such as 1000 Genomes. We will filter the data based on genetic models such as compound heterozygosity, recessive and dominant models to identify different types of variants.
Non-Technical Research Use Statement:
Current genetic understanding of Alzheimer’s Disease (AD) does not fully explain its heritability. The APOE4 allele is a well-established risk factor for the development of AD. However, some individuals who carry APOE4 remain cognitively healthy until advanced ages. Additionally, the cause of mixed dementia pathology development in individuals remains largely unexplained. We aim to identify genetic factors associated with these “protected” and mixed pathology phenotypes.
Investigator:
Hu, William
Institution:
Rutgers Biomedical and Health Sciences
Project Title:
Genomic and social determinants of cognitive decline and resilience in the Health and Retirement Study
Date of Approval:
June 1, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Both positive (social engagement, leisure activities) and negative (isolation, widowhood) social determinants have been identified to influence cognitive decline in the Health and Retirement Study, but known genetic risk factors for Alzheimer’s disease, cardiovascular disease, and frailty are often not taken into account. Leveraging the expertise of the Asian Resource Center for Minority Aging Research, we propose to examine the impact of introducing genomic markers into our current models linking social determinants and cognitive decline, and to identify interactions predictive of vulnerability as well as resilience to cognitive decline.
Non-Technical Research Use Statement:
People’s behaviors can contribute to or compensate for genetic risks for age-related conditions such as dementia and frailty, and we will identify positive and negative behaviors associated with genetic risks for Alzheimer’s disease and related conditions.
Investigator:
Jiang, Rong
Institution:
Duke Health
Project Title:
Gene and Stress on Hearing Loss in Older Adults
Date of Approval:
December 4, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The objective of this research is to investigate the genetic and epigenetic mechanisms underlying hearing loss (HL), with a specific focus on how psychosocial stress and related environmental exposures interact with genetic/epigenetic markers to affect HL risk and progression, especially in older adults. We will conduct a cohort-based study leveraging the HRS dataset, integrating genomic, epigenomic (e.g. DNA methylation), and environmental data to evaluate both main effects and interactions. Our analysis plan will include (1) genome-wide association studies (GWAS) and epigenome-wide association studies (EWAS) of HL; (2) joint tests of genetic, epigenetic, and environmental factors, with a particular focus on psychosocial stress; and (3) integrative approaches to develop a predictive model to improve HL prediction and identify at-risk subgroups. The primary phenotypes of interest include self-rated hearing difficulties and auditory measures from pure tone audiometry tests. Psychosocial stress includes measures from Leave Behind Questionnaires, with sociodemographic, lifestyle and health variables. This project will improve our understanding of how genetic susceptibility and stress exposures jointly impact HL risk, with the long-term goal of identifying biomarkers for early detection and potential intervention targets. At this stage, no external collaborators are planned, though we anticipate future opportunities for collaboration to validate findings across institutions or cohorts.
Non-Technical Research Use Statement:
We plan to study how genes and epigenetic changes (such as DNA methylation) interact with stress to affect hearing loss (HL) in older adults. Using Health and Retirement Study (HRS) data, we will investigate whether stress and genetics together contribute to higher risk of hearing problems. This research could help identify people at greater risk and provide insights into strategies to prevent or reduce HL through both medical and public health approaches.
Investigator:
Kim, Jong Hun
Institution:
Korea University Research and Business Foundation
Project Title:
Discovery of APOE-Interacting Genes Through Trans-Ancestry and Sex-Stratified Analysis to Elucidate Alzheimer's Disease Risk Mechanisms and Stratify ARIA Risk Using Proxy Outcomes
Date of Approval:
July 20, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Objectives: This project identifies ancestry- and sex-specific APOE ε4 modifier genes—variants that amplify or attenuate APOE ε4’s effect on AD risk and ARIA susceptibility from anti-amyloid immunotherapy. Aim 1: Trans-ancestry sex-stratified GWIS to construct an APOE-Wide Epistasis Map. Aim 2: Mechanistic validation via eQTL/pQTL colocalization and epistasis network. Aim 3: Explainable AI (XAI) integrating modifier SNPs, multi-omics subtypes, and ARIA proxy outcomes to stratify pre-treatment ARIA risk. Study Design: Multi-cohort secondary analysis using NIAGADS-controlled ADSP data exclusively. Individual-level data from all 15 ADC cohorts (NG00022–NG00151) and multi-ancestry ADSP WGS (NG00067, NG00166) span European, African American, Hispanic/Latino, and South/East Asian ancestries. Functional datasets (eQTL/pQTL: NG00102, NG00118, NG00120, NG00130) support Aim 2; imaging and neuropathology datasets (NG00103, NG00147, NG00175) enable Aim 3 ARIA proxy development. No prospective recruitment. Multi-dataset rationale: GWIS requires 4–8× more samples than standard GWAS (Gauderman 2002); no single cohort is independently powered—all 15 ADC cohorts must be pooled. Trans-ancestry GWIS requires ancestry-matched datasets (NG00100/African, NG00106/South Asian, NG00141/Hispanic) because population-specific LD cannot be imputed from summary statistics. Functional datasets (eQTL, pQTL, methylation) are non-redundant—each covers a distinct regulatory layer for Aim 2. All datasets are AD-specific; non-AD neurodegeneration data are excluded. Analysis Plan: Phenotypes: AD case/control (primary); APOE ε4 × SNP interaction; lobar microbleed count (ARIA-H proxy); SVD score (WMH, lacunar infarcts, perivascular spaces); longitudinal cognitive decline. Covariates: age, sex, top 20 ancestry PCs, stratum. Methods: logistic GWIS; trans-ancestry meta-analysis (METAL/MR-MEGA); sex-stratified/X-chromosome analyses; eQTL/pQTL colocalization (COLOC2/SMR); XGBoost XAI with 5-fold CV and SHAP.
Non-Technical Research Use Statement:
Alzheimer’s disease affects tens of millions worldwide. Lecanemab, approved in 2024, slows Alzheimer’s progression by removing amyloid plaques, but causes dangerous brain side effects (ARIA: Amyloid-Related Imaging Abnormalities), especially in APOE ε4 carriers, who also most need treatment. Currently, doctors cannot predict which APOE ε4 carriers will benefit versus be harmed. Our research identifies modifier genes controlling how dangerous APOE ε4 is. We leverage the ADSP’s diverse dataset spanning 15+ cohorts across European, African American, Hispanic/Latino, and Asian ancestries, a scale statistically necessary because detecting gene–gene interactions requires 4–8× more samples than standard genetic studies. Population-specific patterns allow high-confidence modifier identification. MRI-based brain bleeds and vascular markers serve as validated ARIA surrogates available at scale. The result is an explainable AI tool that predicts, before treatment begins, which APOE ε4 patients face high ARIA risk and which will benefit from lecanemab, enabling precision Alzheimer’s therapy.
Investigator:
Konermann, Silvana
Institution:
Arc Institute
Project Title:
Modeling Alzheimer’s disease risk and associated molecular phenotypes
Date of Approval:
August 8, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The objective of the proposed research is to determine the relationship between Alzheimer’s disease (AD) genetic risk and associated molecular phenotypes. Genotype data will be used to compute a polygenic risk score (PRS) for disease-affected and control (non-disease-affected) participants. Statistical regression and mediation analyses will be used to model variation of molecular phenotypes with respect to PRS and, where available, pathology stage or cognitive impairment. Molecular phenotypes to be analyzed include bulk/single-cell/single-nucleus transcriptome, epigenome, proteome, metabolome, lipidome, amyloid, and tau. Molecular phenotypes of participants, including controls, will be matched with molecular phenotypes of in vitro cellular models, informing the design of in vitro perturbation experiments that recapitulate the genetic drivers of AD risk.
Non-Technical Research Use Statement:
Our goal is to determine the relationship between human genetic profiles associated with Alzheimer’s disease (AD) risk and specific measurable characteristics of human cells. Using multiple statistical analysis methods, we will build quantitative models that describe how those characteristics vary as a function of AD genetic risk. The models we build will help us design in vitro cellular systems that reflect different levels of AD risk, enabling experiments that inform new strategies for treating or preventing AD.
Investigator:
Lee, Brian
Institution:
Drexel University
Project Title:
LEGENNDS
Date of Approval:
August 12, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The primary outcome variable of interest is the presence of dementia as determined from the core HRS interview that includes a Telephone Interview for Cognitive Status (TICS). Secondary outcomes of interest include cognitive scores and change in scores. The primary predictor variables of interest are polygenic risk scores for autism, ADHD, and intelligence. The covariates considered in this study include age at study entry, sex, race/ethnicity, socioeconomic status at baseline, education, and genetic ancestry. Cox proportional hazards models will evaluate time to event, operationalized as the time from study entry to the first instance of dementia status. Those who do not receive a status of dementia by the end of the follow-up period will be censored. Similarly, individuals who die before receiving a diagnosis of dementia will also be censored, as will those who are lost to follow-up. For each censored individual, the time to event will run from study entry to the time of censoring. Statistical models will estimate the association of polygenic risk scores for autism, ADHD, and intelligence with dementia, cognitive scores, and change in scores. Our collaboration includes researchers at U Pitt (Andrea Rosso, Yicheng Cai) and the University of Haifa (Stephen Levine).
Non-Technical Research Use Statement:
Emerging evidence suggests that certain neurodevelopmental disorders – autism, attention-deficit/hyperactivity disorder (ADHD), and intellectual disability – may increase the risk of later-life neurodegenerative disorders such as Alzheimer’s disease or related dementia (ADRD). The goal of this proposed study is to elucidate the genetic link between autism, ADHD, and intellectual disability and ADRD. As part of this, we will examine the relationship between genetic susceptibility to neurodevelopmental disorders and future risk of ADRD. This study will make use of genetic data from over 18,000 participants in the Health and Retirement Study.
Investigator:
Lee, James
Institution:
University of Minnesota
Project Title:
Recent Selection for Behavioral Traits
Date of Approval:
June 6, 2024
Request status:
Closed
Research use statements:
Technical Research Use Statement:
Objectives of the proposed research: 1) To test for and measure secular trends for a range of traits in humans, with a focus on behavioral and health traits. 2) To test whether the strength and direction of these trends change between generations. 3) To test hypothesized mediators, including age at first birth and SES. Study design and analysis plan: Using cohorts that have completed their fertility, we will run regressions with the fertility rate as the dependent variable and polygenic scores as the independent variable. From this we will calculate the selection differential and the strength of selection, after adjusting for the missing variance of the polygenic scores. We will test the role of different moderators by splitting the sample according to the moderators. Analyses will be done and reported with and without the use of sampling weights. We intend to study selection of a range of behavioral and health-related traits including the Big Five personality traits, occupational status, ADHD, BMI, educational attainment, cognitive performance, smoking cessation, smoking initiation, height, schizophrenia, depression and autism. We will derive our own polygenic scores from available summary statistics, not limiting ourselves to what is available in the Polygenic Score Data provided by HRS. Secondary analyses will include: 1) measuring change in the polygenic scores between cohorts, with a focus on the difference between those born before, during and after the Second World War; 2) estimating genotypic change using phenotypes available in the HRS that are closest to our genotypic traits, which include the Big Five personality traits, occupational status, ADHD, BMI, educational attainment, cognitive performance, smoking cessation, smoking initiation, height, schizophrenia, depression and autism. We will not collaborate with researchers from other institutions.
Non-Technical Research Use Statement:
Many traits affect and are associated with the number of children we have. Illnesses and education can get in the way of reproduction, for example. This results in our culture, society and environment selecting for certain traits in future generations. Although the speed of this process is extremely slow, its direction and exact strength is unclear for many traits. We would like to measure this effect.
Investigator:
Lu, Qiongshi
Institution:
University of Wisconsin-Madison
Project Title:
Dissect the genetic architecture for sociological traits through integrative analysis of GWAS and functional annotations
Date of Approval:
February 29, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Genome-wide association studies (GWAS) have identified tens of thousands of associations for numerous complex traits. However, despite the identification of associated genetic variants, interpretation of GWAS findings remains challenging. The complex structure of linkage disequilibrium in the human genome, coupled with weak effect sizes of common genetic variants, hinders our ability to identify biologically functional genetic variants and understand their functional mechanism. Recent advances in epigenetic and transcriptomic functional annotations have accelerated discoveries in a variety of human genetics applications including GWAS downstream analysis. In this project, we leverage integrative genomic functional annotations in GWAS data to dissect the genetic architecture of complex traits. Specifically, we will integrate the requested GWAS data with epigenetic and transcriptomic annotation data in public repositories (e.g. Roadmap Epigenomics Project, ENCODE, and GTEx) to explore the underlying genetic architecture of various sociogenomic traits available in the HRS, examine shared genetic components among these traits, leverage pleiotropy and functional annotation information to prioritize genetic variants affecting these phenotypes, and produce robust and interpretable genetic prediction models. We think that integrating functional annotation information can effectively reduce noise and spurious associations in the non-functional regions of the human genome. More importantly, the tissue-specific nature of epigenetic and transcriptomic data would provide novel insights into the genetic basis and functional pathways of sociogenomic phenotypes. Finally, using better prioritized variants and annotation-informed effect size estimates can improve the prediction accuracy of polygenic risk scores, which enhances the statistical power in studying the genetic relationship among multiple phenotypes.
Non-Technical Research Use Statement:
Overwhelming evidence indicates that common genetic variants account for a substantial proportion of phenotypic variance in many complex behavioral phenotypes. As a systematic and robust approach, GWAS can effectively identify genetic variants associated with human traits. In this study, we employ genetic data from HRS to identify genetic variants associated with a variety of sociological phenotypes. Then, we will apply state-of-the-art statistical and computational methods to help interpret our findings. Specifically, we will integrate external annotation information of the human genome to fine-map causal variants at identified genetic loci, identify related tissue and cell types for sociological traits, and identify candidate risk genes. Further, by jointly modeling multiple traits, we dissect the shared and distinct genetic architecture among related sociological traits.
Investigator:
Mather, Karen
Institution:
UNSW Sydney
Project Title:
Investigating the relationships between polygenic risk scores and dementia and cognition across and within populations of different ancestry
Date of Approval:
December 19, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Background: The Cohort Studies of Memory in an International Consortium (COSMIC) studies the factors linked to cognitive decline and dementia in a diverse range of populations from around the world. COSMIC is headed by the Centre for Healthy Brain Ageing (CHeBA), University of New South Wales (UNSW), Australia (see https://cheba.unsw.edu.au/consortia/cosmic). At present, there is limited knowledge available regarding the genetic factors associated with ageing-related complex phenotypes and diseases in non-European based cohorts, particularly in low- and middle-income countries, and whether specific ancestry-based genetic association results are generalizable to populations of other ancestries. In this study, we aim to study the genetic factors associated with dementia and related phenotypes to appraise if they can be used to predict age-related cognitive performance and decline and dementia in a wide range of diverse cohorts. We will use data collected by COSMIC Consortium studies but also from external studies wherever possible. Hence this application to access NIAGADS data, so that we can include as many ancestry-diverse studies as possible in this work. The data from these cohorts/studies will be analyzed by meta-analysis. Objectives: To assess if dementia/cognitive and other polygenic risk scores (PRS) generated from different ancestries (European, non-European and trans-ethnic) predict age-related cognitive performance and decline and dementia across populations of different ancestries, including studies from low- and middle-income countries. Study Design and Analysis Plan: Participants will be adults aged 45 and above without dementia at baseline. Different PRS (constructed using different methods, e.g. SBayesRC, and using different GWAS p-value thresholds in PLINK) will be generated using available GWAS summary statistics. Cognitive data, both cross-sectional and longitudinal, will be used where available, with priority given to tests of memory.
PRS-cognitive analyses will be performed using appropriate mixed models. Covariates will include age, sex, years of education and any study-specific covariates (e.g. PCs for population stratification).
Non-Technical Research Use Statement:
Most human genetic association studies have been undertaken in populations of European ancestry, despite >75% of the world’s population being of Asian or African ancestry. To date, most genetic variants for dementia have been identified using populations mainly of European ancestry and from high-income countries, despite more than 60% of dementia cases living in low- and middle-income countries. In addition, many of the non-white genetic studies have had small sample sizes and lack replication. We need to increase our understanding of the genetic risk for dementia and its related traits in under-represented populations, such as the multi-ethnic cohorts of the Cohort Studies of Memory in an International Consortium (COSMIC). The current project aims to examine if polygenic risk scores for dementia, cognitive and other related phenotypes generated from populations of different ancestries predict performance and decline on cognitive tests and incident dementia in older adults from multi-ethnic populations using the COSMIC Consortium and external studies.
Investigator:
Monti, Stefano
Institution:
Boston University
Project Title:
Longevity Consortium
Date of Approval:
January 30, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We study human exceptional longevity and healthy aging and we have generated many results that connect genetic variants to these traits, to intermediate molecular profiles, including gene expression, serum proteomics and metabolomics, and to nutrition. We would like to use the genetic, methylation and phenotypic data generated in the HRS for replication of our results linking genetics, multiomics profiles of cognitive aging, and nutrition. We will use these data to investigate the associations between genetic variants, multi-omics profiles, longevity and healthy cognitive aging, and nutrition using mixed effect models adjusted by genome-wide principal components and including random effects with variance covariance matrix that depends on the genetic relation matrix. We will investigate the relations between genetic variants and multi-omics profile using our pipeline for yQTL that uses mixed effect models as above. We will integrate the various results using mediation analysis. We will ask for access to deidentified data and not biological specimens.
Non-Technical Research Use Statement:
Our research focuses on understanding why some people live longer and age more healthily, especially when it comes to brain function and nutrition. The study has found many links between genes and these healthy aging traits by looking at different biological factors like gene activity and blood proteins. We aim to leverage the HRS data to confirm our findings. We will analyze this information to explore how genes, biological markers, and nutrition work together to influence long life and healthy brain aging. The data used will be anonymous, ensuring privacy, and no physical samples will be needed.
Investigator:
Pan, Wei
Institution:
University of Minnesota
Project Title:
Powerful and novel statistical methods to detect genetic variants associated with or putative causal to Alzheimer’s disease
Date of Approval:
March 25, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
We have been developing more powerful statistical methods to detect common variant (CV)- or rare variant (RV)-complex trait associations and/or putative causal relationships for GWAS and DNA sequencing data. Here we propose applying our new methods, along with other suitable existing methods, to the existing ADSP sequencing data and other AD GWAS data provided by NIA, hence requesting approval for accessing the ADSP sequencing and other related GWAS/genetic data. We have the following two specific Aims: Aim 1. Association testing under genetic heterogeneity: For complex traits, genetic heterogeneity, especially of RVs, is ubiquitous, as is well acknowledged in the literature; however, there is barely any existing methodology to explicitly account for genetic heterogeneity in association analysis of RVs based on a single sample/cohort. We propose using secondary and other omic data, such as transcriptomic or metabolomic data, to stratify the given sample, then apply a weighted test to the resulting strata, explicitly accounting for genetic heterogeneity that causal RVs may be different (with varying effect sizes) across unknown and hidden subpopulations. Some preliminary analyses have confirmed power gains of the proposed approach over the standard analysis. Aim 2. Meta-analysis of RV tests: Although it has been well appreciated that it is necessary to account for varying association effect sizes and directions in meta-analysis of RVs for multi-ethnic cohorts, existing tests are not highly adaptive to varying association patterns across the cohorts and across the RVs, leading to power loss. We propose a highly adaptive test based on a family of SPU tests, which cover many existing meta-analysis tests as special cases. Our preliminary results demonstrated possibly substantial power gains.
Non-Technical Research Use Statement:
We propose applying our newly developed statistical analysis methods, along with other suitable existing methods, to the existing ADSP sequencing data and other AD GWAS data to detect common or rare genetic variants associated with Alzheimer’s disease (AD). The novelty and power of our new methods are in two aspects: first, we consider and account for possible genetic heterogeneity with several subcategories of AD; second, we apply powerful meta-analysis methods to combine the association analyses across multiple subcategories of AD. The proposed research is feasible, promising and potentially significant to AD research. In addition, our proposed analyses of the existing large amount of ADSP sequencing data and other AD GWAS data with our developed new methods are novel, powerful and cost-effective.
Investigator:
Pathak, Gita
Institution:
Institute for Genomic Health, Genetics and Genomic Sciences at Mount Sinai
Project Title:
Multi-modal analysis of psychiatric and dementia outcomes
Date of Approval:
June 15, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
a. Objectives of the Proposed Research: This study aims to investigate the relationship between psychiatric traits and age-related cognitive decline, addressing a critical knowledge gap in understanding how mental health influences aging outcomes. b. Study Design: The study employs a multi-level investigative approach combining epidemiological, genetic, and molecular methodologies. The design incorporates three complementary components: first, identification of phenotypic associations between psychiatric traits and MCI/AD through comprehensive clinical assessment; second, investigation of genetic architecture through analysis of coding and non-coding variants, genetic correlation assessments, polygenic scoring, and Mendelian randomization for causal inference; and third, examination of molecular mechanisms through genetically regulated epigenetic and proteomic processes. The study design enables stratified analyses by sex and ethnicity while controlling for demographic and lifestyle confounders, providing a comprehensive framework for understanding the psychiatric-cognitive decline relationship across multiple biological levels. c. Analytical Plan: The analytical approach will proceed in sequential phases, beginning with statistical modeling to identify psychiatric traits significantly associated with MCI and AD outcomes while adjusting for demographic and lifestyle factors. Genetic analyses will employ polygenic risk scores and Mendelian randomization techniques to establish causal relationships between psychiatric conditions (particularly depression and alcohol use disorder) and cognitive outcomes. Molecular analyses will focus on identifying shared genetic loci between psychiatric and cognitive phenotypes, followed by investigation of genetically regulated methylation and proteomic markers as potential mediators.
The analysis plan includes development of molecular weights to aid causal inference analyses and determination of effect directionality, with stratified results reported by sex and ethnicity to identify population-specific risk patterns and potential intervention targets.
Non-Technical Research Use Statement:
This research examines how mental health conditions like depression and anxiety may increase the risk of memory problems and Alzheimer's disease as people age. Using genetic data and biological markers, we'll study whether psychiatric conditions directly cause cognitive decline or if they share common underlying causes. The study will identify which mental health factors pose the greatest risk for dementia, particularly looking at differences between men and women and various ethnic groups. Results could help better predict and prevent cognitive decline by addressing mental health early in life, potentially improving outcomes for millions facing both psychiatric and age-related brain conditions.
Investigator:
Rose, Evan
Institution:
University of Chicago
Project Title:
Colorism
Date of Approval:
March 14, 2023
Request status:
Closed
Research use statements:
Technical Research Use Statement:
Our study will investigate the impact of skin color-based discrimination (colorism) on socioeconomic and health outcomes. We will do so by measuring how genetic variants that increase melanin production are associated with surveyed outcomes in the Health and Retirement Study (HRS), including employment, income, education, and medical history. The results will add quantitative evidence to the large body of qualitative and critical literature on colorism and enhance our understanding of how colorism contributes to structural inequality. To measure the causal impact of colorism on life outcomes, we will use genetic variant data from the Genotype Data Version 3 (2006-2012 Samples) available via NIAGADS and outcomes from HRS survey questions, such as income and years of education. We will assemble a list of SNPs from prior studies that have been shown to cause darker skin, and study the impacts of these SNPs on individuals in the HRS. The specific SNPs of interest include rs16891982, rs1426654, and rs1800404, which according to correspondence with Amanda Kuzma are available in the Genotype Data (at least via HRC imputation). To estimate effects, we will fit regression models that relate life outcomes to the presence of SNPs while controlling for any confounding variables, including genetic principal components. If colorism leads to worse life outcomes, we would expect to see a negative slope between the effect size of the SNP and the predicted outcomes. Our analysis can be considered a version of Mendelian randomization (MR). We will perform the work in Python and R. Our study will bring new statistical evidence to a large body of work demonstrating that colorism is a widespread form of discrimination in America. Using Mendelian randomization with variants that modulate genetically predisposed skin color, we can directly isolate the causal effect of colorism on social inequalities. 
Our findings have the potential to support the lived experiences of people who experience skin color-based discrimination and improve public knowledge that colorism is an important contributor to inequality in our society, and one that we can better address through policy.
Non-Technical Research Use Statement:
Our study will investigate the impact of skin color-based discrimination (colorism) on socioeconomic and health outcomes. We will do so by measuring how genetic variants that increase melanin production are associated with surveyed outcomes in the Health and Retirement Study (HRS), including employment, income, education, and medical history. The results will add quantitative evidence to the large body of qualitative and critical literature on colorism and enhance our understanding of how colorism contributes to structural inequality. Our findings have the potential to support the lived experiences of people who experience skin color-based discrimination and improve public knowledge that colorism is an important contributor to inequality in our society, and one that we can better address through policy.
Investigator:
Rosso, Andrea
Institution:
University of Pittsburgh
Project Title:
LEGENNDS
Date of Approval:
November 26, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The primary outcome variable of interest is the presence of dementia as determined from the core HRS interview that includes a Telephone Interview for Cognitive Status (TICS). Secondary outcomes of interest include cognitive scores and change in scores. The primary predictor variables of interest are polygenic risk scores for autism, ADHD, and intelligence. The covariates considered in this study include age at study entry, sex, race/ethnicity, socioeconomic status at baseline, education, and genetic ancestry. Cox proportional hazards models will evaluate time to event, operationalized as the time from study entry to the first instance of dementia status. Participants who do not receive a status of dementia by the end of the follow-up period will be censored, as will those who die before receiving a diagnosis of dementia and those who are lost to follow-up. For each censored individual, time to event will run from study entry to the time of censoring. Statistical models will estimate the association of polygenic risk scores for autism, ADHD, and intelligence with dementia, cognitive scores, and change in scores. This is a collaboration with Drexel University (PI: Brian K Lee).
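The time-to-event and censoring rules described in the statement above can be sketched as follows. This is an illustrative sketch only, not the investigators' code: the `Participant` record, its field names, and the use of calendar years as the time scale are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # All field names are hypothetical; they are not HRS variable names.
    entry_year: float               # year of study entry
    dementia_year: Optional[float]  # first wave with dementia status, if any
    death_year: Optional[float]     # year of death, if any
    last_seen_year: float           # last completed interview


def time_to_event(p: Participant, study_end: float) -> tuple[float, int]:
    """Return (follow-up time in years, event indicator).

    The indicator is 1 when dementia status is observed by study_end,
    and 0 when the participant is censored.
    """
    if p.dementia_year is not None and p.dementia_year <= study_end:
        return p.dementia_year - p.entry_year, 1
    # Censored at the earliest of death, last contact, or end of follow-up.
    candidates = [study_end, p.last_seen_year]
    if p.death_year is not None:
        candidates.append(p.death_year)
    return min(candidates) - p.entry_year, 0
```

The resulting (time, indicator) pairs are the usual input to a Cox proportional hazards fit in a survival analysis package; the fitting step itself is left out here.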
Non-Technical Research Use Statement:
Emerging evidence suggests that certain neurodevelopmental disorders – autism, attention-deficit/hyperactivity disorder (ADHD), and intellectual disability – may increase the risk of later life neurodegenerative disorders such as Alzheimer’s disease or related dementia (ADRD). The goal of this proposed study is to elucidate the genetic link between autism, ADHD, and intellectual disability and ADRD. As part of this, we will examine the relationship between genetic susceptibility to neurodevelopmental disorders and future risk of ADRD. This study will make use of genetic data from over 18,000 participants in the Health and Retirement Study.
Investigator:
Roussos, Panagiotis
Institution:
Icahn School of Medicine at Mount Sinai
Project Title:
Higher Order Chromatin and Genetic Risk for Alzheimer's Disease
Date of Approval:
November 21, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Alzheimer's disease (AD) is the most common form of dementia and is characterized by cognitive impairment and progressive neurodegeneration. Genome-wide association studies of AD have identified more than 70 risk loci; however, a major challenge in the field is that the majority of these risk factors are harbored within non-coding regions where their impact on AD pathogenesis has been difficult to establish. Therefore, the molecular basis of AD development and progression remains elusive and, so far, reliable treatments have not been found. The overarching goal of this proposal is to examine and validate AD-related changes on chromatin accessibility and the 3D genome at the single cell level. Based on recent data from our group and others, we hypothesize that genotype-phenotype associations in AD are causally mediated by cell type-specific alterations in the regulatory mechanisms of gene expression. To test our hypothesis, we propose the following Specific Aims: (1) perform multimodal (i.e., within cell) profiling of the chromatin accessibility and transcriptome at the single cell level to identify cell type-specific AD-related changes on the 3D genome; (2) fine-map AD risk loci to identify causal variants, regulatory regions and genes; (3) functionally validate putative causal variants and regulatory sequences using novel approaches that combine massively parallel reporter assays, CRISPR and single cell assays in neurons and microglia derived from induced pluripotent stem cells; and (4) develop and maintain a community workspace that provides for the rapid dissemination and open evaluation of data, analyses, and outcomes. Overall, our multidisciplinary computational and experimental approach will provide a compendium of functionally and causally validated AD risk loci that has the potential to lead to new insights and avenues for therapeutic development.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) affects half the US population over the age of 85 and despite decades of research, reliable treatments for AD have not been found. The overarching goal of our proposal is to generate multiscale genomics (gene expression and epigenome regulation) data at the single cell level and perform fine mapping to detect and validate causal variants, transcripts and regulatory sequences in AD. The proposed work will bridge the gap in understanding the link among the effects of risk variants on enhancer activity and transcript expression, thus illuminating AD molecular mechanisms and providing new targets for future therapeutic development.
Investigator:
Safo, Sandra
Institution:
University of Minnesota
Project Title:
Innovative Machine and Deep Learning Analyses of Alzheimer's Disease Omics and Phenotypic Data
Date of Approval:
October 27, 2023
Request status:
Expired
Research use statements:
Technical Research Use Statement:
AD is the most common cause of dementia and presents a substantial and increasing economic and social burden. Our ability to distinguish AD from cognitively normal (CN) individuals, or to discriminate among individuals with AD, early mild cognitive impairment (EMCI), or late mild cognitive impairment (LMCI), is essential for the prevention, diagnosis, and treatment of AD. Since individuals with MCI have a high chance of converting to AD, effectively discriminating between those who convert to AD (MCI-C) and those who do not convert (MCI-NC) is important for early diagnosis of AD. The heterogeneity of AD has motivated attempts to classify distinct subgroups of AD to better inform the underlying physiology. There is evidence to suggest that using data across multiple modalities (e.g. genetics, imaging, metabolomics) has the potential to classify AD subgroups better than using a single modality. We will apply machine and deep learning methods to gain deeper insight into AD and ADRD pathobiology. We will use datasets that include genomics, genetics, metabolomics, and phenotypic data for this purpose. Data will be divided into discovery and validation sets. On the discovery set, state-of-the-art ML and DL methods for integrative analysis that we and others have developed will be coupled with resampling techniques to determine candidate molecular signatures and pathways discriminating the AD groups considered. Molecular scores will be developed from these candidate biomarkers. The clinical utility of the scores beyond well-known clinical risk factors for AD will be ascertained. We will validate our findings using the validation data. We will visually and quantitatively compare the risk scores across several clinical variables and outcomes. We will use (un)supervised clustering methods to identify molecular clusters, and we will investigate molecular clusters differentiating MCI to AD converters from non-converters. We may explore differences across ethnic subgroups. 
We will also innovatively apply our multimodal molecular subtyping methods to discover, reproduce, and characterize novel molecular subgroups of AD, which will allow for better risk stratification.
Non-Technical Research Use Statement:
We have been developing novel machine learning (ML) and deep learning (DL) methods that leverage genomics, other omics (including proteomics and metabolomics), clinical and epidemiology data to better understand the pathogenesis of complex diseases. By integrating data from different sources, we have identified molecular signatures contributing to the risk of the development of complex diseases beyond established risk factors. We are proposing to innovatively apply these, and other existing, methods, to data pertaining to Alzheimer’s disease (AD) and Alzheimer’s disease related dementias (ADRD). A deeper understanding of the genes, genetic pathways, and other molecular signatures of AD is essential and could facilitate the identification of potential therapeutic targets for the disease.
Investigator:
Schwaba, Ted
Institution:
Michigan State University (MSU)
Project Title:
A Genomic Window Into Lifespan Personality Development
Date of Approval:
July 15, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Objectives: Across the lifespan, a person's genetics are associated with their personality traits. We will capitalize on recent advancements in personality genomics to test whether and how common genetic variants relevant to personality (i.e., polygenic indices (PGIs) for personality traits) are associated with trajectories of phenotypic personality development, both with age and in the context of life events. By linking genetics to personality through environment, we will substantiate mechanisms by which lifespan development occurs. Study design: Observational; we will use existing data (the Health and Retirement Study; HRS) to estimate longitudinal correlations between genetic profiles (PGIs) and personality trait development over time. Analysis plan: 1) Using genomic information from HRS participants, we will apply polygenic index (PGI) weights from the forthcoming Revived Genomics of Personality Consortium (ReGPC) Genome-Wide Association Study (Schwaba et al., 2025) to assign each participant a PGI value for each of the Big Five personality traits (extraversion, agreeableness, conscientiousness, neuroticism, openness to experience). 2) Using multilevel models, we will estimate stable levels and change over time in the phenotypic Big Five personality traits (measured bi-yearly from 2006 onwards using the Midlife Development Inventory). Change will be operationalized as yearly development in the years before/after a stressful life event (measured with a life events questionnaire) and yearly age-graded development (measured in terms of change with age). 3) We will correlate PGI values with levels and change in the Big Five personality traits, to examine whether a person's personality-relevant genetics differentiate trajectories of development with age and in the context of life events.
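Step 1 of the analysis plan above, assigning each participant a PGI value from per-variant weights, is commonly computed as a weighted sum of allele dosages. The sketch below illustrates that idea under stated assumptions: the function name, the dictionary layout, and the SNP identifiers in the usage note are hypothetical, and the ReGPC weights themselves are not reproduced here.

```python
def polygenic_index(dosages: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages over SNPs present in both inputs.

    dosages: SNP id -> effect-allele dosage for one participant (0 to 2)
    weights: SNP id -> per-allele weight from GWAS summary statistics
    SNPs missing from either input are skipped (an assumption of this sketch).
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)
```

For example, with hypothetical weights `{"snpA": 0.5, "snpB": -0.2}` and dosages `{"snpA": 2, "snpB": 1}`, the index is 0.5 × 2 − 0.2 × 1. In practice the raw scores are usually standardized within ancestry group before being correlated with trait levels and slopes.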
Non-Technical Research Use Statement:
Across the lifespan, a person's genetics are associated with their personality traits. We will test whether and how genetic variants relevant to personality are associated with trajectories of personality trait development, both with age and in the context of life events. By linking genetics to personality through environment, we will substantiate mechanisms through which lifespan development occurs.
Investigator:
Seshadri, Sudha
Institution:
Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, University of Texas Health Sciences Center, San Antonio, TX
Project Title:
Therapeutic target discovery in ADSP data via comprehensive whole-genome analysis incorporating ethnic diversity and systems approaches
Date of Approval:
August 12, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Objective: Utilize ADSP data sets to identify genes and specific genetic variants that confer risk for or protection from Alzheimer disease. Aim 1: Using combined WGS/WES across the ADSP Discovery, Disc-Ext, and FUS Phases, including single nucleotide variants, small insertion/deletions, and structural variants, we will: Aim 1a. Perform whole genome single variant and rare variant case/control association analyses of AD using ADSP and other available data; Aim 1b. Target protective variant identification via association analysis using selected controls within the ADSP data and performing meta-analysis across association results based on selected controls from non-ADSP data sets; Aim 1c. Perform endophenotype analyses, including cognitive function measures, hippocampal volume, and circulating beta-amyloid, using ADSP data in subjects for which these measures are available. Meta-analysis will be conducted across ADSP and non-ADSP analysis results. Aim 2: To leverage ethnically diverse and admixed populations to identify AD variants, we will: Aim 2a. Estimate and account for global and local ancestry in all analyses; Aim 2b. Perform admixture mapping in samples of admixed ancestry; and Aim 2c. Perform ethnicity-specific and trans-ethnic meta-analyses. Aim 3: To identify putative therapeutic targets through functional characterization of genes and networks via bioinformatics and integrative “omics” analyses, we will: Aim 3a. Annotate variants with their functional consequences using bioinformatic tools and publicly available “omics” data. Aim 3b. Prioritize results, group variants with shared function, and identify key genes functionally related to AD via weighted association analyses and network approaches. Analyses will be performed in coordination with the following PIs. Coordination will involve sharing expertise, analysis plans or analysis results. No individual level data will be shared across institutions. 
Philip De Jager, Columbia University; Eric Boerwinkle & Myriam Fornage, U of Texas Health Science Center, Houston; Sudha Seshadri, U of Texas, San Antonio; Ellen Wijsman, U of Washington; William Salerno, Baylor College of Medicine.
Non-Technical Research Use Statement:
This proposal seeks to analyze existing genetic sequencing data generated as part of the Alzheimer’s Disease Sequencing Project (ADSP) including the ADSP Follow-up Study (FUS) with the goal of identifying genes and specific changes within those genes that either confer risk for Alzheimer’s Disease or provide protection from Alzheimer’s Disease. Analytic challenges include analysis of whole genome sequencing data, appropriately accounting for population structure across European ancestry, Hispanic, and African American participants, and interpreting results in the context of other genomic data available.
Investigator:
Singleton, Andrew
Institution:
National Institute on Aging
Project Title:
Genetic Characterization of Movement Disorders and Dementias
Date of Approval:
January 28, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
The goal of this project is to utilize standard genetics tools and ensemble/deep learning methods to predict/classify the etiological aspects of Alzheimer's disease and other neurodegenerative diseases based on genetic data and genomic data (including individual level data e.g. genotype and sequencing data, transcriptomic, and epigenomics data, and also by the use of summary statistics). Our primary phenotypes of interest include case:control status, age at onset, survival time (in terms of disease duration from diagnosis to loss to follow-up) and related biomarker data, although there may be other phenotypes of interest that are derived later based on available data.
Non-Technical Research Use Statement:
We are attempting to identify and predict risk of Alzheimer's disease and other neurodegenerative diseases based on genetic and genomic data using standard tools and advanced machine-learning methods.
Investigator:
Thyagarajan, Bharat
Institution:
University of Minnesota
Project Title:
Omics-based Machine Learning Model to Predict AD dementia
Date of Approval:
January 22, 2024
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Alzheimer’s disease (AD) dementia is a heterogeneous neurodegenerative disease among older adults. Early detection of AD dementia remains challenging due to heterogeneity in disease onset and progression. Our goal is to develop a genetic variants-based variational autoencoder (VAE) model to predict AD dementia. We will use GWAS data collected in the Health and Retirement Study (HRS). HRS has genotyped saliva DNA samples collected since 2006 at multiple time points during field visits, yielding a total of 19,004 unique participants. We will use the most recent version of genotype data from 2006 to 2015 for those samples that have both epigenetic and transcriptomic data available. The genotyping was performed by the NIH Center for Inherited Disease Research, using the Illumina HumanOmni2.5-4v1/8v1 array, and genotyping QC analysis was performed at the University of Michigan using HumanOmni2.5-4v1 H for SNP annotation. We will use the quality and minor allele frequency (MAF) filters specified in the HRS QC report for genotypic data to filter out poor quality SNPs. We will use cognition measures collected in the HRS 2016 survey to classify participants into 'Dementia' and 'Normal' using the Langa Weir Classification algorithm. We will employ two main feature selection processes: 1. Based on association with dementia, we will select the top 50% of associated SNPs as input to the VAE model and filter out low-frequency SNPs. 2. We will also train a VAE model with a more comprehensive list of SNPs. We will regularize the model by incorporating biological knowledge as constraints, using the gene-gene interaction network from REACTOME/STRING. We will also evaluate the biological interpretability of latent features that are representative of input genetic variants. We will evaluate the distribution of weights of all encoded features to select high positive and high negative features, defined as those with weights more than 2 SD above or below the mean weight. 
These selected features will be input for the pathway analysis to identify pathways associated with AD. The candidate genes identified can be used to develop blood-based biomarkers for early identification of AD.
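The weight-based selection rule in the statement above (features whose weights lie more than 2 SD above or below the mean) can be illustrated with a short sketch. The function name and the feature-to-weight dictionary are assumptions for the example, and using the sample standard deviation is a choice the statement does not specify.

```python
from statistics import mean, stdev


def select_extreme_features(weights: dict[str, float], n_sd: float = 2.0):
    """Split features into high-positive and high-negative groups.

    A feature is selected when its weight lies more than n_sd sample
    standard deviations above or below the mean of all weights.
    """
    values = list(weights.values())
    mu = mean(values)
    sd = stdev(values)  # sample SD; requires at least two weights
    high_positive = [f for f, w in weights.items() if w > mu + n_sd * sd]
    high_negative = [f for f, w in weights.items() if w < mu - n_sd * sd]
    return high_positive, high_negative
```

The two returned lists would then feed the pathway analysis described above. Note that with few features a single outlier can never exceed 2 SD, so the rule is only meaningful when the weight vector is reasonably long.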
Non-Technical Research Use Statement:
We will develop a genetic variants-based VAE model to predict dementia. We will employ various feature selection processes based on the complexity of data to evaluate the VAE model performance and to identify a representative list of genes. In addition, we will evaluate the biological interpretability of latent features obtained from the VAE encoder layer by extracting their decoder weights that captures the input feature contribution to the learned latent feature. This will also allow us to evaluate if the VAE model has learned novel features known to be associated with AD dementia. We will then utilize the learned features to identify the biological pathways associated with AD dementia.
Investigator:
Tucker-Drob, Elliot
Institution:
University of Texas at Austin
Project Title:
Genetics of Multisystem Aging
Date of Approval:
June 1, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
The present study aims to address how gene-environment interactions across lifestyle, socioeconomic and health-related factors predict levels and rates of decline across diverse domains of health, including multiple measures of cognitive function, functional ability, chronic diseases, and wellbeing. First, we will assess and characterize individual patterns of change across health domains. Second, we will search for common and specific sources of senescent change over time across health domains. Third, we will assess how genetic liabilities for medical traits and environmental exposures determine broad and specific sources of variation in intra- and inter-individual change across health domains. We will use phenotypic data from all the waves of the Health and Retirement Study (HRS) (1992–2024) covering the following domains: 1) demographic and socioeconomic variables, 2) lifestyle behaviors, 3) cognitive function, 4) functional ability, 5) chronic diseases, and 6) wellbeing. We will use genetic data from the HRS to compute polygenic risk scores (PRS) across socioeconomic and health domains, using Genomic Structural Equation Modelling (Genomic SEM) to integrate the most recent and largest GWAS summary data available from published meta-analyses and biobanks. To analyze the data, we will first fit univariate latent growth curve models (LGCMs) for each individual indicator embedded in health domains 3 to 5. We will specify up to three latent factors: a) an intercept factor, representing levels, b) a linear change factor, and c) a quadratic change factor. Secondly, we will fit a series of multivariate LGCMs. We will start by fitting an unconstrained associative LGCM to estimate correlations among the factors of levels and slopes across health indicators, identifying clusters of health indicators with shared underlying change processes. 
Thirdly, we will expand the multivariate LGCMs to include additive and non-additive exogenous covariates, covering genetic and environmental exposures and their interaction.
Non-Technical Research Use Statement:
The present study aims to uncover how health-related genetic factors and environmental exposures interact, shaping heterogeneous trajectories of health in old age. We investigate pathways of aging across diverse domains of health, including multiple measures of cognitive function, functional ability, chronic diseases, and well-being, identifying common and specific sources of senescent change across health-related outcomes. A better understanding of genetic and environmental determinants of health trajectories is key to promoting health and well-being in old age.
Investigator:
Wainberg, Michael
Institution:
Sinai Health System
Project Title:
Uncovering the causal genetic variants, genes and cell types underlying brain disorders
Date of Approval:
February 3, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We propose a multifaceted approach to elucidate and interpret genetic risk factors for Alzheimer's disease. First, we propose to perform a whole-genome sequencing meta-analysis of the Alzheimer's Disease Sequencing Project with the UK Biobank and All of Us to associate rare coding and non-coding variants with Alzheimer's disease and related dementias. We will explore a variety of case definitions in the UK Biobank and All of Us, including those based on ICD codes from electronic medical records (inpatient, primary care and/or death), self-report of Alzheimer's disease or Alzheimer's disease and related dementias, and/or family history of Alzheimer's disease or Alzheimer's disease and related dementias. We will perform single-variant, coding-variant burden, and non-coding variant burden tests using the REGENIE genome-wide association study toolkit. Second, we propose to develop statistical and machine learning models that can effectively infer (“fine-map”) the causal gene(s), variant(s), and cell type(s) underlying each association we find, as well as associations from existing genome-wide association studies and other Alzheimer's- and aging-related cohorts found in NIAGADS. In particular, we propose to improve causal gene identification by incorporating knowledge of gene function as a complement to functional genomics. For instance, we plan to develop improved methods for inferring biological networks, particularly from single-cell data, and integrate these networks with the results of the non-coding associations from our first aim to fine-map causal genes. To fine-map causal variants and cell types, we plan to integrate the associations from our first aim with single-nucleus chromatin accessibility data from postmortem brain cohorts to simultaneously infer which variant(s) are causal for each discovered locus and which cell type(s) they act through.
Non-Technical Research Use Statement:
We have a comprehensive plan to understand and explain the genetic factors that contribute to Alzheimer's disease. Our approach involves two main steps. First, we'll analyze genetic information from large research databases to identify rare genetic changes associated with Alzheimer's and related memory disorders. We'll look at both specific changes in genes and other parts of the genetic code. We'll use data from different studies and combine them to get a clearer picture. Second, we'll create advanced computer models that can help us figure out which specific genes, genetic changes, and cell types are responsible for these associations. This will help us pinpoint the most important factors contributing to Alzheimer's disease. We'll also analyze data from previous studies to build a more complete understanding of these genetic links.
Investigator:
Ware, Erin
Institution:
University of Michigan
Project Title:
Alzheimer's disease polygenic scores and cognition
Date of Approval:
September 2, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Our goal is to investigate the roles of cumulative genetic variation, disparity-related factors, and their interactions on late-onset Alzheimer’s disease (LOAD) and dementia phenotypes, extending precision public health to environmental susceptibility across ancestries. LOAD is the leading terminal form of dementia affecting a growing number of aging U.S. adults. As LOAD risk is disproportionately high among minorities, women, rural inhabitants, and people with lower education, disparities in LOAD risk represent a critical knowledge gap. Novel approaches characterizing the multifaceted etiology of LOAD disparities are needed to identify the genetic underpinnings, biological pathways, and potentially modifiable environmental factors that lead to sustained LOAD disparities. We propose whole genome estimations of polygenic risk of cognition, LOAD, and LOAD risk factors to be examined for their effect on dementia phenotypes among individuals >70, independently and in concert; potential interactions between PGS and factors with disparities in LOAD; and application of our methods in European and African ancestry groups (Fig. 1). AIM 1. Determine the cumulative genetic risk of LOAD by estimating the effect of cognitive polygenic scores on dementia phenotypes in individuals of European and African ancestry. AIM 2. Determine the association between polygenic scores for a) behavioral, b) physiological, and c) social/psychosocial domains and dementia phenotypes in individuals of European and African ancestry. We will consider Mendelian Randomization approaches for this aim. AIM 3. For the relationships between polygenic scores and dementia phenotypes (AIMS 1 and 2), test for effect modification by LOAD disparity-related factors (sex, educational attainment, urban/rural), in individuals of European and African ancestry.
Non-Technical Research Use Statement:
The overall purpose of this proposal is to establish the relevance of polygenic risk in susceptibility to dementia, particularly among groups at increased risk of disease, including women, minorities, rural inhabitants, and those with low educational attainment. Because an individual’s susceptibility to dementia is likely a combination of genetics and environmental risk factors, we will jointly test the effects of cumulative genetic risk and dementia risk factors in our analysis. The proposal provides an opportunity to identify a genetic etiologic component in vulnerable groups that could lead to mechanistic understanding or targeted interventions to substantially benefit public health in the US.
Investigator:
Wedow, Robbee
Institution:
Purdue University
Project Title:
Unpacking the Emergence of Dementia Etiology Across the Life Course
Date of Approval:
August 29, 2025
Request status:
Approved
Research use statements:
Technical Research Use Statement:
Moderate to severe impairments in cognitive functioning are a primary hallmark of Alzheimer's disease (AD) and Alzheimer's Disease Related Dementias (ADRD), a class of disorders affecting ~30% of the population by age 90. Currently, scientists hypothesize that the AD/ADRD disease process begins decades prior to the low functioning observed at the time of diagnosis. The ideal study design to gain insight into liability in prodromal and preclinical stages of AD/ADRD would involve collecting data on a wide range of measures from a large group of participants across the entirety of the life course. However, this data collection strategy includes major pragmatic barriers. Longitudinal study designs that might identify prospective risk factors of later life disease onset carry high participant and financial costs and take decades to produce conclusive results. Because of these limitations, much of the literature has been left to speculate in a piecemeal fashion on what characterizes the AD/ADRD prodromal period. However, research into these prodromal and preclinical periods holds significant promise for improving prevention and intervention efforts by identifying at-risk individuals and those who are at an earlier, and likely more intervenable, stage of disease. We will analyze the links between genotypes and phenotypes to investigate the onset times of risk factors for AD/ADRD across the life course using data from the HRS. Our insights will focus on pinpointing specific periods when these outcomes manifest as individuals age. We hypothesize that genetic data and structural equation modeling can help identify the specific times when Alzheimer's risk factors emerge as individuals age throughout their lives. Our team proposes to leverage the Genomic Structural Equation Modeling (Genomic SEM) framework to identify genetic risk pathways to AD/ADRD across the life course using existing data from large epidemiological studies that index different age ranges. 
Results from this study will add previously unseen levels of precision to our understanding of when genetic risk for AD/ADRD emerges across the life course and which specific risk factors index its onset.
Non-Technical Research Use Statement:
We will analyze the links between genotypes and phenotypes to investigate the onset times of risk factors for Alzheimer's and Dementia (AD) across the life course using data from the Health and Retirement Study (HRS). Our insights will focus on pinpointing specific periods when these outcomes manifest as individuals age. We hypothesize that genetic data and structural equation modeling can help identify the specific times when Alzheimer's risk factors emerge as individuals age throughout their lives.
Investigator:
Wingo, Thomas
Institution:
University of California Davis
Project Title:
Identifying Alzheimer's Disease Genetic Risk Factors By Integrated Genomic and Proteomic Analysis
Date of Approval:
October 2, 2023
Request status:
Closed
Research use statements:
Technical Research Use Statement:
We aim to uncover new genetic risk variants for Alzheimer’s disease (AD) through an integrated analysis of proteomics and genetic sequencing performed at Emory University. Results of these analyses will be used to weight analysis of whole-genome sequencing (WGS), whole-genome genotyping (WGG), and whole-exome sequencing (WES) data from dbGaP and ADSP. We plan to publish our findings, so they are shared with the scientific community. Outcomes that will be tested include: (1) clinical disease status, (2) pathologic characterization (e.g., measures of beta-amyloid, tau, etc.), and (3) cognitive decline. For sequencing data, we will perform joint calling from samples previously mapped by ADSP using PECaller with default settings. Variant annotation will be performed using Bystro and quality control will follow Wingo et al., 2017. For rare variants, we will use burden- and variance-based tests to estimate association between genetic variants and each outcome for every gene in the genome. External weights from proteomic analyses will be optionally used, as well as measures of genomic conservation for each site. For common variants, we plan to test for differences in allele frequencies using maximum likelihood tests. For all analyses, we plan to control for population structure by deriving principal components from the underlying sequencing or genotyping data.
Non-Technical Research Use Statement:
Our aim is to identify genetic variants that are associated with Alzheimer's Disease (AD) using either genomic data (from dbGaP or from Emory University) or brain protein sequencing data (from Emory University) as a starting point. Each center’s data will be analyzed separately, and we will determine whether the findings are consistent among the centers. Additionally, we will use protein data from brain or cerebrospinal fluid of individuals with or without AD to guide the analysis of the genomic data to identify genetic variants that influence AD risk. Our overarching aim is to use genetic discoveries to identify mechanisms of AD pathogenesis and to create more meaningful models of the disease.
Investigator:
Wingo, Thomas
Institution:
University of California Davis
Project Title:
Identifying Alzheimer's Disease Genetic Risk Factors By Integrated Genomic and Proteomic Analysis
Date of Approval:
January 21, 2026
Request status:
Approved
Research use statements:
Technical Research Use Statement:
We aim to uncover new genetic risk variants for Alzheimer’s disease (AD), AD-related dementia (ADRD), and behavioral and psychiatric symptoms (BPS) associated with AD/ADRD. We expect to use whole-genome sequencing (WGS), whole-genome genotyping (WGG), and whole-exome sequencing (WES) data. Additionally, we will use the results of brain proteomic analysis to nominate genes and pathways for AD, ADRD, and dementia BPS. We plan to publish our findings to share them with the scientific community. Outcomes that will be tested include: (1) clinical disease status, (2) pathologic characterization (e.g., measures of beta-amyloid, tau, etc.), (3) cognitive decline, (4) BPSD, and (5) outcomes related to AD/ADRD severity. For sequencing data, we will extract raw sequencing reads from CRAM/BAM (or equivalent encrypted files) and re-map those to the hg38 build of the human genome using PEMapper. Base calling will be performed using PECaller with default settings. Variant annotation will use Bystro and quality control will follow approaches to assess completeness and account for ancestry as is customary in our lab. For rare variants, we will use a variety of kernel-based approaches, and for common variants, we will use standard statistical modeling. For all analyses, we plan to control for population structure by deriving principal components from the underlying sequencing or genotyping data.
Non-Technical Research Use Statement:
Our aim is to identify genetic variants that are associated with Alzheimer's Disease (AD) to uncover new genetic associations. We will examine the role of important risk factors for AD (e.g., age and sex) in our analyses. Separately, we will perform integration of genetic findings for AD with information about how genetic variants influence or are associated with gene expression in the brain, cerebrospinal fluid, or blood to uncover new pathways of disease. Our overarching aim is to use genetic discoveries to identify mechanisms of AD pathogenesis to help nominate new treatment targets.
Investigator:
Zhao, Jinying
Institution:
University of Florida
Project Title:
Identifying novel biomarkers for human complex diseases using an integrated multi-omics approach
Date of Approval:
November 21, 2023
Request status:
Closed
Research use statements:
Technical Research Use Statement:
GWAS, WES and WGS have identified many genes associated with Alzheimer’s Dementia (AD) and its related traits. However, the genes identified thus far collectively explain only a small proportion of disease heritability, suggesting that more genes remain to be identified. Moreover, there is a clear gender and ethnic disparity in AD susceptibility, but little research has been done to identify gender- and ethnic-specific variants associated with AD. Of the many challenges in deciphering AD pathology, the lack of efficient and powerful statistical methods for genetic association mapping and causal inference represents a major bottleneck. To tackle this challenge, we have developed a set of novel statistical and bioinformatics approaches for genetic association mapping and multi-omics causation inference in large-scale ethnicity-specific epidemiological studies. The goal of this project is to leverage the multi-omics and clinical data archived by the ADSP, ADNI, ADGC as well as other AD-related data repositories to identify novel genes and molecular markers for AD. Specifically, we will (1) validate our novel methods for identifying novel risk and protective genomic variants and multi-omics causal pathways of AD; (2) identify novel ethnicity- and gender-specific genes and molecular causal pathways of AD. We will share our results, statistical methods and computational software with the scientific community.
Non-Technical Research Use Statement:
Although many genes have been associated with Alzheimer’s Dementia (AD), these genes altogether explain only a small fraction of disease etiology, suggesting more genes remain to be identified. Of the many challenges in deciphering AD pathology, the lack of powerful statistical methods represents a major bottleneck. To tackle this challenge, we have developed a set of novel statistical and bioinformatics approaches for genetic association mapping and multi-omics causation inference in large-scale ethnicity-specific epidemiological studies. The goal of this project is to leverage the rich genetic and other omic data along with clinical data archived by the ADSP, ADNI, ADGC as well as other AD-related data repositories to identify novel genes and molecular markers for AD. Such results will enhance our understanding of AD pathogenesis and may also serve as biomarkers for early diagnosis and therapeutic targets.
Investigator:
Zhi, Degui
Institution:
University of Texas Health Science Center at Houston
Project Title:
Genetics of deep-learning-derived neuroimaging endophenotypes for Alzheimer's Disease
Date of Approval:
February 6, 2025
Request status:
Expired
Research use statements:
Technical Research Use Statement:
Alzheimer’s disease (AD) affects 5.6 million Americans over the age of 65 and exacts tremendous and increasing demands on patients, caregivers, and healthcare resources. Our current understanding of the biology and pathophysiology of AD is still limited, hindering advances in the development of therapeutic and preventive strategies. Existing genetic studies of AD have had some success, but these explain only a fraction of the overall disease risk, suggesting opportunities for additional discoveries. The proposed project will leverage existing neuroimaging and genetic data resources from the UK Biobank, the Alzheimer’s Disease Sequencing Project (ADSP), the Alzheimer’s Disease Neuroimaging Initiative (ADNI), and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium, and will be conducted by a multidisciplinary team of investigators. We will derive AD endophenotypes from neuroimaging data in the UK Biobank using deep learning (DL). We will identify novel genetic loci associated with DL-derived imaging endophenotypes and optimize the co-heritability of these endophenotypes with AD-related phenotypes using UK Biobank genetic data. We will leverage resources and collaborations with AD Consortia and the power of DL-derived neuroimaging endophenotypes to identify novel genes for Alzheimer’s Disease and AD-related traits. Also, we will develop DL-based neuroimaging harmonization and imputation methods and distribute implementation software to the research community. We expect to discover new genes relevant to AD, which may lead to an understanding of the molecular basis of AD and potential new treatments.
Non-Technical Research Use Statement:
Alzheimer’s disease (AD) exacts a tremendous burden on patients, caregivers, and healthcare resources. Our current understanding of the biology of AD is still limited, hindering advances in the development of treatment and prevention. Existing genetic studies of AD have had some success, but more studies are needed. The proposed project will leverage existing neuroimaging and genetic data resources from the UK Biobank, the Alzheimer’s Disease Sequencing Project (ADSP) and other consortia and will be conducted by a multidisciplinary team of investigators. We will derive new AD-relevant intermediate phenotypes from neuroimaging data using deep learning (DL), an AI approach. We will identify novel genetic loci associated with these phenotypes. Also, we will develop imaging harmonization and imputation methods and distribute implementation software to the research community. We expect to discover new genes relevant to AD, which may lead to an understanding of the molecular basis of AD and potential new treatments.

Total number of subjects: 19,042

Female 11,011 57.8 %

Male 8,031 42.2 %
