The purpose of this study is to identify new Alzheimer-related variants and genes by combining exome data from healthy controls and Alzheimer patients across different studies.

CNV calling followed a workflow centered on CANOES, a tool based on the distribution of depth of coverage across WES samples. It includes a correction based on the GC content of each target to reduce the background variability often observed in NGS data. The complete workflow is described in a previous paper (Quenez et al.) and is available on GitHub.

To summarize, we calculated the read depth of each sample on each target using Bedtools. We then grouped samples by the genome build used for alignment and by capture kit and, where this information was available, by study and/or sequencing batch. Each resulting group of samples is called a “callingBatch”.
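
The grouping step can be illustrated with a minimal Python sketch. It is not taken from the published workflow: the record keys (`id`, `build`, `kit`, `study`, `seq_batch`) and the function name are hypothetical, and missing study or batch information is simply treated as `None`.

```python
from collections import defaultdict


def assign_calling_batches(samples):
    """Group sample records into callingBatches.

    Each sample is a dict with hypothetical keys 'id', 'build' and 'kit',
    plus optional 'study' and 'seq_batch' keys. Samples sharing the same
    combination of values end up in the same callingBatch.
    """
    batches = defaultdict(list)
    for sample in samples:
        # Missing study / sequencing batch information becomes None.
        key = (
            sample["build"],
            sample["kit"],
            sample.get("study"),
            sample.get("seq_batch"),
        )
        batches[key].append(sample["id"])
    return dict(batches)


# Toy records for illustration only (not real study data).
example_samples = [
    {"id": "S1", "build": "hg19", "kit": "kitA", "study": "study1"},
    {"id": "S2", "build": "hg19", "kit": "kitA", "study": "study1"},
    {"id": "S3", "build": "hg38", "kit": "kitA"},
    {"id": "S4", "build": "hg19", "kit": "kitB", "study": "study1", "seq_batch": "b2"},
]
calling_batches = assign_calling_batches(example_samples)
```

In this toy input, S1 and S2 share a callingBatch, while S3 and S4 each form their own because they differ in genome build or capture kit.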

The next step was to remove uninformative regions, i.e. targets for which more than 90% of the samples in the callingBatch have fewer than 10 reads. Finally, for each callingBatch, we called CNVs using CANOES and excluded any sample in which CANOES detected more than 50 CNVs (the threshold suggested by the CANOES development team).
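
The two filters above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code of the published workflow: the function names, the dictionary-based input layout and the toy values are hypothetical, and the CNV counts are assumed to have been tallied from CANOES output beforehand.

```python
def informative_targets(depth_by_target, min_reads=10, max_low_fraction=0.9):
    """Return targets that are kept for CNV calling in one callingBatch.

    depth_by_target maps a target name to the list of read counts observed
    for every sample of the callingBatch. A target is removed when more than
    max_low_fraction of the samples have fewer than min_reads reads.
    """
    kept = []
    for target, counts in depth_by_target.items():
        n_low = sum(1 for count in counts if count < min_reads)
        if n_low / len(counts) <= max_low_fraction:
            kept.append(target)
    return kept


def samples_passing_cnv_count(cnv_count_by_sample, max_cnvs=50):
    """Return samples with at most max_cnvs CNV calls (others are excluded)."""
    return [s for s, n in cnv_count_by_sample.items() if n <= max_cnvs]


# Toy values for illustration only (not real study data).
example_depth = {
    "target_1": [0, 2, 5, 1, 3, 0, 4, 2, 1, 6],       # all samples below 10 reads
    "target_2": [40, 55, 3, 60, 8, 72, 1, 90, 4, 5],  # half of the samples below 10
}
example_cnv_counts = {"S1": 12, "S2": 51, "S3": 50}

kept_targets = informative_targets(example_depth)
kept_samples = samples_passing_cnv_count(example_cnv_counts)
```

With these toy inputs, `target_1` is dropped as uninformative, and sample S2 is excluded because it carries more than 50 CNV calls, whereas S3, with exactly 50, is retained.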