The summary statistics are available in the “Open Access Dataset” tab.
To access the full data, please log into DSS and submit an application.
Within the application, add this dataset (accession NG00105) in the “Choose a Dataset” section.
Once approved, you will be able to log in and access the data within the DARM portal.

Description

Microglia Genomic Atlas – MiGA (NG00105.v1):
Post-mortem brain samples were obtained from the Netherlands Brain Bank (NBB). Permission to collect human brain material was obtained from the Ethical Committee of the VU University Medical Center, Amsterdam, The Netherlands. Informed consent for autopsy and for the use of brain tissue and accompanying clinical information for research purposes was obtained from each donor ante-mortem. Neuropathological assessments were performed by the NBB. Detailed information per donor is provided, including tissue type, age, sex, postmortem interval, pH of cerebrospinal fluid, cause of death, diagnosis, use of medication and neuropathological information.

DNA Genotyping
Samples were genotyped using the Illumina Infinium Global Screening Array (GSA). Genotype imputation was performed for the 90 donors through the Michigan Imputation Server v1.4.1 (Minimac 4) using the 1000 Genomes (Phase 3) v5 (GRCh37) European panel and Eagle v2.4 phasing in quality control and imputation mode with the rsq filter set to 0.3. Following imputation, variants were lifted over to the GRCh38 reference to match the RNA-seq data using Picard LiftoverVcf and the “b37ToHg38.over.chain.gz” liftover chain file.

RNA extraction and sequencing
RNA was isolated using the RNeasy Mini kit (Qiagen) with the optional DNase I step, or as described in detail before (Melief J, et al., 2016). Library preparation was performed at Genewiz using the Ultra-low input system, which uses Poly-A selection. The SMART-Seq v4 Ultra Low Input RNA Kit was used for library construction using 100 ng of RNA. The libraries were sequenced as 150 bp paired-end fragments on the Illumina HiSeq 2500, with an average read depth of 29 million read pairs (range 14 to 82 million).

RNA-seq data processing
RNA-seq data were processed using the RAPiD pipeline (Wang YC, et al., 2015). RAPiD aligns samples to the hg38 genome build using STAR (Dobin A, et al., 2013) with the GENCODE v30 transcriptome reference and calculates quality control metrics using Picard. RNA-seq quality control was performed by applying four filters to remove samples: 1) samples with fewer than 10M reads aligned by STAR; 2) samples with more than 20% of the reads aligned to ribosomal regions; 3) samples with less than 10% of the reads mapping to coding regions; 4) samples from brain regions with fewer than 20 donors. Estimated transcript abundance was obtained using RSEM (Li B and Dewey CN, 2011) and transcripts were summed to the gene level with tximport (Love MI, et al., 2017). Genes with more than 1 read count per million (CPM) in 30% of the samples were kept for downstream analysis. Gene level read counts were normalized as transcripts per million mapped reads (TPM) to adjust for sequencing library size differences.
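The four sample-level filters above can be sketched in a few lines of Python. This is only an illustration, not the RAPiD implementation: the field names (aligned_reads, pct_ribosomal, pct_coding, region, donor) are hypothetical, and applying the per-region donor count after the three read-based filters is an assumption.

```python
from collections import defaultdict

def passes_read_filters(sample):
    """Filters 1-3 on one sample's QC metrics (field names are hypothetical)."""
    return (
        sample["aligned_reads"] >= 10_000_000  # 1) at least 10M STAR-aligned reads
        and sample["pct_ribosomal"] <= 20.0    # 2) at most 20% ribosomal reads
        and sample["pct_coding"] >= 10.0       # 3) at least 10% coding reads
    )

def filter_samples(samples, min_donors_per_region=20):
    """Apply filters 1-3, then drop regions with too few donors (filter 4).

    Assumption: donors are counted after the read-based filters.
    """
    kept = [s for s in samples if passes_read_filters(s)]
    donors = defaultdict(set)
    for s in kept:
        donors[s["region"]].add(s["donor"])
    return [s for s in kept if len(donors[s["region"]]) >= min_donors_per_region]
```

The default of 20 donors per region matches the text; a smaller value can be passed when trying the function on toy data.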

Quantitative Trait Loci mapping
To perform expression QTL (eQTL) mapping, we followed the latest pipeline created by the GTEx consortium (Aguet et al. 2019). This used a normalization and filtering procedure separate from the one described above. Gene expression matrices were created from the RSEM output using tximport (Love, Soneson, and Robinson 2017). Matrices were then converted to GCT format, TMM normalized, and filtered to remove lowly expressed genes, keeping only genes with at least 0.1 TPM in at least 20% of samples and at least 6 counts in at least 20% of samples. Each gene was then inverse-normal transformed across samples. After filtering, we tested a total of 18,430 genes. Then, PEER (Stegle et al. 2012) factors were calculated to estimate hidden confounders within our expression data. We created a combined covariate matrix that included the PEER factors and the first 4 genotyping ancestry MDS values as input to the analysis. We tested numbers of PEER factors from 0 to 20 and found that between 5 and 10 factors produced the largest number of eGenes in each region.
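As an illustration of the per-gene inverse-normal transform mentioned above, the following is a minimal rank-based sketch using only the Python standard library. The rank / (n + 1) offset and the averaging of tied ranks are assumptions; the text does not state which convention the pipeline uses.

```python
from statistics import NormalDist

def inverse_normal_transform(values):
    """Map one gene's expression values across samples to normal quantiles.

    Assumptions: ties receive their average rank, and ranks are converted
    to probabilities as rank / (n + 1).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf(r / (n + 1)) for r in ranks]
```

The transform preserves the ordering of samples for a gene while forcing the values onto a standard normal scale, which limits the influence of outlier samples in the linear regression.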
To test for cis-eQTLs, linear regression was performed using the tensorQTL (Taylor-Weiner et al. 2019) cis_nominal mode for each SNP-gene pair, using a 1 megabase window around the transcription start site (TSS) of each gene. To test for association between gene expression and the top variant in cis, we used the tensorQTL cis permutation pass per gene with 1000 permutations. To identify eGenes, we performed q-value correction of the permutation P-values for the top association per gene (Storey 2003) at a threshold of 0.05.
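The cis window can be illustrated with a short sketch that enumerates the SNP-gene pairs falling within 1 Mb of a gene's TSS. This is not how tensorQTL selects variants internally; the tuple layouts and the inclusive boundary are assumptions made for the example.

```python
def cis_pairs(variants, genes, window=1_000_000):
    """Return (variant_id, gene_id) pairs within `window` bp of each TSS.

    variants: iterable of (variant_id, chrom, position)   # hypothetical layout
    genes:    iterable of (gene_id, chrom, tss_position)  # hypothetical layout
    Assumption: the window is symmetric around the TSS and inclusive.
    """
    variants = list(variants)
    pairs = []
    for gene_id, gene_chrom, tss in genes:
        for variant_id, variant_chrom, position in variants:
            if variant_chrom == gene_chrom and abs(position - tss) <= window:
                pairs.append((variant_id, gene_id))
    return pairs
```

Passing window=100_000 gives the narrower window used around intron cluster centers in the sQTL analysis.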
We performed splicing quantitative trait loci (sQTL) analysis using the splice junction read counts generated by regtools (Feng et al. 2018). Junctions were clustered using Leafcutter (Li et al. 2018), specifying for each junction in a cluster a maximum length of 100kb. Following the GTEx pipeline, introns without read counts in at least 50% of samples or with fewer than 10 read counts in at least 10% of samples were removed. Introns with insufficient variability across samples were removed. Filtered counts were then quantile normalized using prepare_phenotype_table.py from Leafcutter, merged, and converted to BED format, using the coordinates from the middle of the intron cluster. We created a combined covariate matrix that included the PEER factors and the first 4 genotyping ancestry MDS values as input to the analysis. We mapped sQTLs with between 0 and 20 PEER factors as covariates in our QTL model and determined 5 to be optimal in MFG, STG and THA. 0 PEER factors were used for SVZ.

To test for cis-sQTLs, linear regression was performed using the tensorQTL nominal pass for each SNP-junction pair, using a 100kb window from the center of each intron cluster. Although junctions were initially grouped together into clusters, we tested each SNP-junction pair separately, which is the standard approach (Li et al. 2018; Aguet et al. 2019). To test for association between intronic ratio and the top variant in cis, we used the tensorQTL permutation pass, grouping junctions by their cluster using the –grp option. To identify significant clusters, we performed q-value correction using a threshold of 0.05.

Microglia Genomic Atlas after Stimulation – MiGASTi (NG00105.v2):

Microglia Isolation
Brain tissue was stored in Hibernate media (Gibco) at 4 °C until processing, which took place within 24 hours after autopsy. Microglia were isolated from six regions: medial frontal gyrus (MFG; 43 samples), superior frontal gyrus (SFG; 1 sample), superior temporal gyrus (STG; 34 samples), thalamus (THA; 24 samples), subventricular zone (SVZ; 32 samples) and corpus callosum (CC; 16 samples). In brief, brain tissue was first mechanically dissociated through a metal sieve in a glucose-potassium-sodium buffer (GKN-BSA; 8.0 g/L NaCl, 0.4 g/L KCl, 1.77 g/L Na2HPO4.2H2O, 0.69 g/L NaH2PO4.H2O, 2.0 g/L D-(+)-glucose, 0.3% bovine serum albumin (BSA, Merck, Darmstadt, Germany); pH 7.4) and supplemented with collagenase Type I (3700 units/mL; Worthington, USA) and DNase I (200 µg/mL; Roche, Switzerland) or 2% of Trypsin (Invitrogen) at 37 °C for 30 min or 60 min while shaking. The suspension was passed through a 100 µm cell strainer and washed with GKN-BSA buffer in the centrifuge (1800 rpm, slow brake, 4 °C, 10 min) before the pellet was resuspended in 20 mL GKN-BSA buffer. 10 mL of Percoll (Merck, Darmstadt, Germany) was added dropwise and the tissue homogenate was centrifuged at 4000 rpm (fast acceleration, slow brake, 4 °C, 30 min). The middle layer was collected and washed with GKN-BSA buffer, followed by resuspension and centrifugation in a magnetic-activated cell sorting (MACS) buffer (PBS, 1% heat-inactivated fetal calf serum (FCS), 2 mM EDTA; 1500 rpm, 10 °C, 10 min). Microglia were positively selected with CD11b-conjugated magnetic microbeads (Miltenyi Biotec, Germany) according to the manufacturer’s protocol. This protocol has been validated and the resulting cell viability was between 70 and 98%.

Culturing and stimulation of microglial cells
Microglia were cultured in a poly-L-lysine (PLL; Merck, Germany) coated 96-well flat-bottom plate (Greiner Bio-One, Austria) at a density of 1.0 × 10^5 cells in a total volume of 200 μL Roswell Park Memorial Institute medium (RPMI; Gibco Life Technologies, USA) supplemented with 10% FCS, 2 mM L-glutamine (Gibco Life Technologies, USA), 1% penicillin–streptomycin (Gibco Life Technologies, USA) and 100 ng/mL IL-34 (Miltenyi Biotec, Germany). After overnight incubation, primary microglia (pMG) were stimulated with 100 ng/mL lipopolysaccharide (LPS) from Escherichia coli 0111:B4 (Merck, Germany), or with 50 ng/mL interferon gamma (IFN-γ; Peprotech, London, UK), Resiquimod (R848), 50 ng/mL tumor necrosis factor alpha (Sigma-Aldrich, The Netherlands), 40 ng/mL IL-4 (Peprotech, London, UK), 1000 nM dexamethasone (Sigma-Aldrich, The Netherlands), or 1 mM ATP (Sigma-Aldrich, The Netherlands) for 6 h. The concentrations and incubation times for LPS and IFN-γ are based on dose-response curves, in which 6 h showed the most robust inflammatory effects. The other stimulations were added for comparison. The cells were harvested with TRIzol reagent and stored at −80 °C for further analysis.

RNA extraction and sequencing protocol
Microglial RNA was isolated using the TRIzol method and cDNA libraries were generated by Genewiz (Azenta) using the Ultra-low input system, which uses Poly-A selection. The SMART-Seq v4 Ultra Low Input RNA Kit was used for library construction using 100 ng of RNA. The libraries were sequenced as 150 bp paired-end fragments on the Illumina HiSeq 2500, with an average read depth of 28 million read pairs (range 0.06 to 128 million).

RNA-seq data processing
RNA-seq data were processed using the RAPiD pipeline. RAPiD aligns samples to the hg38 genome build using STAR with the GENCODE v30 transcriptome reference and calculates quality control metrics using Picard. RNA-seq quality control was performed by applying four filters to remove samples: 1) samples with fewer than 1M reads aligned by STAR; 2) samples with more than 20% of the reads aligned to ribosomal regions; 3) samples with less than 5% of the reads mapping to mRNA; 4) samples with a high chance of RNA degradation based on the gene coverage plot. Estimated transcript abundance was obtained using RSEM and transcripts were summed to the gene level with tximport. Genes with more than 1 read count per million (CPM) in 50% of the samples were kept for downstream analysis. Gene level read counts were normalized as transcripts per million mapped reads (TPM) to adjust for sequencing library size differences. We used several approaches to test the quality of the included samples: 1) the percentage of mitochondrial genes was only 0.87% on average; 2) the expression of several apoptotic markers, such as CASP3 and BTG1, was low across all stimulations; 3) the expression levels of several well-known stimulation-specific markers showed the expected up- and downregulation patterns.
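The gene-level CPM filter can be sketched as follows. This is a minimal illustration, assuming raw counts are held in a dict mapping each gene to a list of per-sample counts (a hypothetical layout) and reading "in 50% of the samples" as "in at least 50% of the samples".

```python
def filter_genes_by_cpm(counts, min_cpm=1.0, min_fraction=0.5):
    """Keep genes with CPM above `min_cpm` in at least `min_fraction` of samples.

    counts: dict of gene -> list of raw read counts, one per sample
            (hypothetical layout; all lists must have the same length).
    """
    n_samples = len(next(iter(counts.values())))
    # Library size of each sample = total counts across all genes.
    library_sizes = [
        sum(gene_counts[s] for gene_counts in counts.values())
        for s in range(n_samples)
    ]
    kept = []
    for gene, gene_counts in counts.items():
        n_expressed = sum(
            1
            for s in range(n_samples)
            if library_sizes[s] > 0
            and gene_counts[s] / library_sizes[s] * 1e6 > min_cpm
        )
        if n_expressed / n_samples >= min_fraction:
            kept.append(gene)
    return kept
```

Setting min_fraction=0.3 reproduces the 30% threshold described for the MiGA (NG00105.v1) processing.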

Sample Summary per Data Type

Sample Set | Accession | Data Type | Number of Samples
MiGA – Microglia Genomic Atlas | snd10022 | 1000Genomes Imputed GWAS | n = 90
MiGA – Microglia Genomic Atlas | snd10022 | Bulk RNA Sequencing | n = 255
MiGASTi – Microglia Genomic Atlas after Stimulation | snd10082 | Short-Read RNA Sequencing | n = 533

Available Filesets

Name | Accession | Latest Release | Description
MiGA – Microglia Genomic Atlas – GWAS Data | fsa000008 | NG00105.v1 | 1000Genomes Imputed GWAS
MiGA – Microglia Genomic Atlas – QTL Summary Statistics (open access) | fsa000009 | NG00105.v1 | QTL Summary Statistics
MiGA – Microglia Genomic Atlas – RNASeq Data | fsa000010 | NG00105.v1 | RNASeq BAM files
MiGASTi – Microglia Genomic Atlas after Stimulation – Short-Read RNASeq Data | fsa000105 | NG00105.v2 | Short-Read RNA FASTQ files

View the File Manifest for a full list of files released in this dataset.