The African continent is home to 1.2 billion people, 17% of the world’s population, and a large proportion of human genetic diversity. Yet African populations remain relatively under-studied: only 3% of Genome-Wide Association Study (GWAS) samples come from individuals of African ancestry (1), and most of those are African Americans.
A global opportunity exists to profile disease risk in African populations and deploy Polygenic Risk Scores (PRS) to focus healthcare resources on high-risk individuals. Getting there requires much more than recruiting study participants. Dr. Pedro Fernandez, PhD, is Hayes Chair of Research at the Faculty of Medicine and Health Sciences of Stellenbosch University in South Africa and a Principal Investigator in the MADCaP Study. He noted, “Research capacity has to be developed in Africa to better understand the genetic landscape and environmental make-up specific to the populations we serve”.
Prostate cancer is an excellent example of the research opportunity in Africa: it is the leading cause of cancer-related deaths in African men and, globally, has higher prevalence and worse outcomes in men of African ancestry. To address this opportunity, the Men of African Descent and Carcinoma of the Prostate (MADCaP, www.madcapnetwork.org) Network was formed by investigators in the US, the UK, and Africa, led by Professor Timothy Rebbeck of Harvard TH Chan School of Public Health and the Dana-Farber Cancer Institute.
Polygenic Risk Scores for prostate cancer have been developed from studies of European and Asian populations, but they do not work as well in African populations for several reasons. For example, European and Asian studies miss many alleles that are present in African populations because genetic diversity is much greater in Africa. Also, a risk locus in one population may confer no risk in another. “The MADCaP Network provides a unique opportunity to understand the African genome and its role in establishing prostate cancer risk in African men. It is critical that genetic association studies be undertaken in African populations to expand our understanding of genomic diversity in all populations. With this knowledge we can better understand and act on genetic susceptibility to many diseases including cancer”, Dr. Rebbeck told us.
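To make the first reason concrete, here is a minimal sketch of how a polygenic risk score is commonly computed: a weighted sum of the risk alleles a person carries. It is an illustration only, not the MADCaP analysis. The variant names, weights, and the function `polygenic_risk_score` are hypothetical.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant).

    A variant that has no weight in the score contributes nothing,
    even if the person carries it.
    """
    return sum(weights.get(variant, 0.0) * dosage
               for variant, dosage in dosages.items())


# Hypothetical effect-size weights from a study that never observed "var_C".
weights_from_other_population = {"var_A": 0.20, "var_B": 0.10}

# Hypothetical individual carrying two copies of the unobserved allele.
individual = {"var_A": 1, "var_B": 0, "var_C": 2}

score = polygenic_risk_score(individual, weights_from_other_population)
print(f"score = {score:.2f}")
```

In this toy case the two copies of `var_C` add nothing to the score, because the study that produced the weights had no way to estimate an effect for it.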
With this in mind, the MADCaP team designed a GWAS strategy to genotype the maximum number of African variants across all 6,000 available samples within its budget. Whole-genome NGS was ruled out due to cost, while whole-exome NGS offered insufficient coverage. This led the consortium to choose a genotyping microarray technology that offered high marker densities, high data quality, and the opportunity to customise the content to the study’s needs.
It’s useful to think about whole-genome genotyping arrays as a combination of content “modules” that, together, form the total set of variants genotyped by the array. An array should include content modules closely aligned to a study’s objectives, for example: genome-wide coverage of the target population; variants associated with specific diseases or traits of interest; functional variants; and operational content such as sample tracking markers. Sometimes, a study’s needs can be met by an existing genotyping array design. Otherwise, customising each content module to fully support study aims is often straightforward and cost-effective.
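One way to picture the module idea is as a union of marker sets. The sketch below is purely illustrative: the module names follow the examples in the paragraph above, and the marker IDs are made up.

```python
from typing import Dict, Set

# Hypothetical content modules, each a set of (made-up) marker IDs.
modules: Dict[str, Set[str]] = {
    "genome_wide_coverage": {"m1", "m2", "m3", "m4"},
    "disease_associated": {"m3", "m5"},
    "functional": {"m6"},
    "sample_tracking": {"m7", "m8"},
}

# The array's total content is the union of its modules, so a marker
# requested by more than one module is only counted once.
array_content: Set[str] = set().union(*modules.values())
total_listed = sum(len(markers) for markers in modules.values())

print(f"{total_listed} markers listed across modules, "
      f"{len(array_content)} unique markers on the array")
```

Because modules can overlap, the number of unique markers on an array can be smaller than the sum of the module sizes.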
Dr. Joseph Lachance, of Georgia Institute of Technology, USA, is co-chair of the MADCaP Genomic Working Group. He noted, “It was clear a new array was needed for our GWAS because existing arrays either provided good coverage of African genomic variation or cancer loci, but not both. We needed an array that supported two primary goals: firstly, to fine-map existing prostate cancer loci and, secondly, to map novel prostate cancer loci”. The consortium decided to partner with the design team at Thermo Fisher Scientific, leveraging its Applied Biosystems™ Axiom™ Genotyping Solution and the expertise and bioinformatics pipeline that have created many optimised array designs for large and small genotyping studies.
For the fine-mapping goal, the partnership designed content modules targeting specific genomic regions with genomic markers known to be present in African populations. Firstly, genomic regions adjacent to all known loci for prostate cancer or any other cancer type were saturated with markers. Next, a module was added to cover GWAS loci associated with any other trait. Content was also designed to tag variants known to be associated with RNA expression in the prostate.
To meet the second goal of mapping novel loci, a genome-wide content module was designed to impute common African population variation with a minor allele frequency of 5% or greater. Rarer alleles were excluded because the MADCaP GWAS of 6,000 samples has insufficient statistical power to detect associations with them. This decision also meant that the final array design could be optimised to cover almost all common variation found in Africa, even for alleles that were common in only one target population. Ancestry Informative Markers were also added to study population structure. The MADCaP team has since published more detail of the array design (2).
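The 5% cut-off can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the consortium's pipeline: the variant names, population labels, and frequencies are hypothetical, and the copy count is back-of-envelope arithmetic that assumes a single pooled sample, not a formal power calculation.

```python
from typing import Dict

MAF_THRESHOLD = 0.05  # minor allele frequency cut-off stated in the text
N_SAMPLES = 6000      # MADCaP GWAS sample size stated in the text


def minor_allele_frequency(allele_freq: float) -> float:
    """Frequency of the less common of the two alleles."""
    return min(allele_freq, 1.0 - allele_freq)


def is_common_somewhere(freq_by_population: Dict[str, float],
                        threshold: float = MAF_THRESHOLD) -> bool:
    """True if the variant reaches the threshold in at least one population."""
    return any(minor_allele_frequency(freq) >= threshold
               for freq in freq_by_population.values())


def expected_minor_allele_copies(maf: float, n_samples: int = N_SAMPLES) -> float:
    """Each diploid sample carries two alleles, so expect 2 * N * MAF copies."""
    return 2 * n_samples * maf


# Hypothetical allele frequencies in two hypothetical target populations.
candidates = {
    "var_X": {"pop_1": 0.12, "pop_2": 0.01},  # common in pop_1 only: kept
    "var_Y": {"pop_1": 0.02, "pop_2": 0.03},  # rare everywhere: dropped
    "var_Z": {"pop_1": 0.97, "pop_2": 0.90},  # MAF is 1 - freq; 0.10 in pop_2: kept
}

selected = sorted(name for name, freqs in candidates.items()
                  if is_common_somewhere(freqs))
print(selected)
print(round(expected_minor_allele_copies(0.05)), "copies at MAF 5%")
print(round(expected_minor_allele_copies(0.005)), "copies at MAF 0.5%")
```

The last two lines hint at why rarer alleles were left out: at a frequency ten times lower, a study of the same size sees roughly ten times fewer copies of the allele to test.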
A pilot of over 800 African samples has now been run on the Axiom MADCaP Array at the Center for Inherited Disease Research (CIDR), Johns Hopkins University, Baltimore, MD, USA. High call rates on over 1.5 million markers confirm that it works well in each of the study populations. Derived allele frequencies are similar for each continental population, indicating that SNP selection was relatively unbiased. Already, clustering analysis has revealed insights into population structure, and regions of homozygosity give clues to evolutionary history. The data also suggest that existing polygenic risk scores may be less effective in African populations (2).
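For readers unfamiliar with the term, a marker's call rate is the fraction of samples for which a genotype could be assigned. A minimal sketch, using made-up genotypes and a hypothetical `call_rate` helper, not the pilot data:

```python
from typing import Dict, List, Optional

Genotype = Optional[int]  # 0, 1 or 2 copies of an allele; None means "no call"


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of samples with a successful genotype call for one marker."""
    if not calls:
        raise ValueError("no genotype calls supplied")
    return sum(call is not None for call in calls) / len(calls)


# Hypothetical genotype calls for two markers across five samples.
genotypes_by_marker: Dict[str, List[Genotype]] = {
    "marker_1": [0, 1, 2, 1, 0],
    "marker_2": [0, None, 1, 1, 2],
}

rates = {marker: call_rate(calls) for marker, calls in genotypes_by_marker.items()}
for marker, rate in rates.items():
    print(f"{marker}: {rate:.0%}")
```

The same calculation can be run per sample instead of per marker to flag poorly performing samples.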
The MADCaP investigators share common interests in studying health disparities, underserved populations, and the role that genetic and non-genetic factors play in both. Having led the way by developing the Axiom MADCaP Array, they have now made it available for anyone to order from Thermo Fisher Scientific, or as a service at the Centre for Proteomic and Genomic Research (CPGR) in South Africa, where the majority of the MADCaP samples are being genotyped. The intent is that interested research groups use the array, or customise the design for their own studies, to accelerate research progress in African populations.
In the meantime, the MADCaP Network has focused on genotyping its entire 6,000-sample resource at CPGR in South Africa, led by Dr. Lindsay Petersen, PhD, Genomics Manager. These data will be pooled with 3,000 additional Pan-African samples to enable a combined analysis of 9,000 samples. The team is also developing plans to implement a multi-omic approach integrating gene expression and whole-genome tumor sequencing into the study.
1. Popejoy AB & Fullerton SM (2016) Nature 538(7624):161-164. doi: 10.1038/538161a.
2. Harlemon M (2019) Cancer Research (in press); also available at biorxiv.org/content/biorxiv/early/2019/07/15/702910.full.pdf
Have a question for our technical specialists? Contact us at thermofisher.com/genotyping-microarray-contact
For information on genotyping strategies adopted by other researchers, visit thermofisher.com/scientistspotlight
Learn more about Thermo Fisher’s genotyping solutions at https://www.thermofisher.com/genotyping
For research use only. Not for use in diagnostic procedures.