Epigenetic analysis of cardio-metabolic health in an African population
INTRODUCTION AND AIM: Eighty-five percent of the 41 million annual non-communicable disease (NCD) deaths worldwide occur in low- and middle-income countries (LMICs). A large proportion of these deaths are caused by cardio-metabolic diseases (CMDs). The prevalence of CMDs continues to increase, in part owing to the rapid urbanisation experienced by these countries. Evidence has shown that epigenetic mechanisms, such as DNA methylation (DNAm), are associated with CMDs and CMD risk factors. These mechanisms potentially mediate the relationship between genetic and environmental exposures (such as the behaviour and lifestyle changes related to urbanisation) and disease. Valuable insights have so far come from investigations of DNAm in the context of CMD through epigenome-wide association studies (EWASs), methylation-derived white blood cell (WBC) count ratios and DNAm clocks. Although these investigations could be of great benefit to CMD prevention and treatment in LMICs, data have thus far largely been collected from individuals of European descent, mostly living in urbanised, high-income countries. Data on populations of other ancestries living in LMICs, including continental Africans, are scarce. Because there are known genetic and epigenetic differences between ancestral groups, it is unknown whether the current epigenetic literature, which derives mostly from European cohorts, generalises to understudied African populations. This thesis reports the first investigation into the relationship between DNAm and cardio-metabolic health in black South Africans. First, a review described how the urban-rural divide experienced in developing countries such as South Africa can serve as an epidemiological approach to investigating the role of DNAm in the association between urbanisation and NCD risk. This review formed part of the literature required to understand and interpret the experimental data.
Empirically, DNAm was investigated using EWASs and analysis of methylation-derived WBC ratios and DNAm clocks, in relation to a range of CMD-related phenotypes including chronological and biological age, alcohol consumption, smoking status, body composition, biochemical indicators of metabolic health and inflammation, as well as markers of cardiovascular function (CVF) and risk.
METHODS: A sub-sample of 120 apparently healthy Batswana men, aged 45 to 88 years, who participated in the 2015 arm of the Prospective Urban and Rural Epidemiology study in the North West province of South Africa (PURE-SA-NW) was investigated. Genome-wide DNAm data were generated from whole-blood DNA using the Illumina® Infinium HumanMethylationEPIC BeadChip (EPIC array). Multiple CMD-related EWASs were performed and compared to previously published EWASs conducted in different ethnicities, to evaluate the reproducibility of the current literature and to contribute novel findings from the PURE-SA-NW cohort. Next, methylation-derived WBC ratios were investigated and compared to protein-based inflammatory markers in terms of their associations with CVF markers and their literature-based portrayal of cardiovascular disease (CVD) risk. Lastly, DNAm ages were estimated using five widely used DNAm clocks. Age estimates from the Horvath, Hannum, and skin and blood clocks were compared in terms of how accurately they estimated chronological age, and those from the PhenoAge and GrimAge clocks were compared according to their ability to characterise biophysiological decline.
RESULTS: Up to 86% of previously identified epigenome-wide associations overlapped with the findings from the PURE-SA-NW study, and a further 13% were directionally consistent. Only 1% of the replicated associations presented with effects opposite to findings in other ancestral groups, and these were largely explained by population-specific genomic variance.
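The replication comparison above can be illustrated with a minimal sketch. The abstract does not state the exact criteria used, so the three categories below (significant and same direction, same direction only, opposite direction) are an assumed reading of the reported 86%/13%/1% split; the function names, the probe IDs, the effect sizes and the 0.05 threshold are all hypothetical.

```python
from collections import Counter

def classify_replication(published_effect, cohort_effect, cohort_p, alpha=0.05):
    """Assumed scheme: compare the sign of a published effect with the cohort's effect."""
    same_direction = (published_effect > 0) == (cohort_effect > 0)
    if not same_direction:
        return "opposite"
    if cohort_p < alpha:
        return "replicated"
    return "directionally consistent"

def summarise(associations, alpha=0.05):
    """Return the percentage of associations falling in each category."""
    counts = Counter(
        classify_replication(pub, coh, p, alpha) for _, pub, coh, p in associations
    )
    total = len(associations)
    return {label: 100.0 * n / total for label, n in counts.items()}

# Hypothetical records: (probe ID, published effect, cohort effect, cohort p-value).
# These are illustrative values, not data from the PURE-SA-NW study.
associations = [
    ("cg_example_1", -0.020, -0.015, 0.001),
    ("cg_example_2", 0.010, 0.004, 0.200),
    ("cg_example_3", 0.030, -0.012, 0.030),
    ("cg_example_4", -0.050, -0.041, 0.0004),
]

percentages = summarise(associations)
```

In practice the significance threshold would need to account for multiple testing, which this sketch leaves out.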
Nineteen novel CpG associations with alcohol consumption (11 EPIC probes and eight 450K probes also present on the EPIC array) and one with high-density lipoprotein (450K probe) were observed. The WBC ratio estimates of the PURE-SA-NW group were comparable to those of previously investigated, ostensibly healthy ethnic groups. The CVD risk portrayed by these markers was also similar to that of conventionally used risk markers, including C-reactive protein. The methylation-derived WBC ratio indicators performed better than the protein-based inflammatory markers in explaining variance in CVF. The most CVF variance was explained when the methylation-derived and protein-based markers were used in tandem. The skin and blood clock had a stronger correlation with chronological age and less variation in age acceleration than the Horvath and Hannum clocks. All three of these clocks, however, tended to underestimate the chronological age of the cohort, and this underestimation became more pronounced with increasing chronological age. GrimAge characterised biophysiological decline better than PhenoAge, partly because it incorporates smoking-related effects, which were not captured by the PhenoAge estimate or any of its constituents. This was of particular importance in this study population, given that more than half of the participants were current smokers.
CONCLUSION: This thesis demonstrates that the methylation associations observed in this black South African population are largely in agreement with the epigenetic data published on other ethnicities, with some differences related to genomic variance, highlighting the need for population-specific data. The enhanced coverage of the EPIC array proved useful in expanding the current epigenetic literature.
Methylation-derived WBC ratio markers added value to conventionally used inflammatory markers in elucidating the role of inflammation in CVF, even in population-based research without overt inflammatory diseases. The DNAm clocks require further optimisation for use in older populations, as shown by their systematic underestimation of chronological age in the PURE-SA-NW data. That GrimAge incorporates, for the first time, lifestyle-related exposures such as smoking appeared to add to its accuracy in characterising biophysiological decline. Empirically, this thesis shows that investigations of diverse populations are valuable and can reveal new associations. The critical narrative literature review highlights the need for epidemiological studies of DNAm across urban-rural divides where suitable data sets exist. Future studies can replicate the findings reported here and further investigate causal pathways and utility in disease prediction.
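The clock comparison summarised above rests on two simple quantities, which a short sketch may make concrete: the error of a clock (DNAm age minus chronological age) and age acceleration. Age acceleration is taken here as the residual from regressing DNAm age on chronological age, which is one common definition and an assumption about this thesis, since the abstract does not define it. The function names and the four data points are hypothetical and chosen only to mimic an underestimation that grows with age.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def age_acceleration(chron_age, dnam_age):
    """Residual of DNAm age after regressing it on chronological age (assumed definition)."""
    slope, intercept = fit_line(chron_age, dnam_age)
    return [d - (intercept + slope * c) for c, d in zip(chron_age, dnam_age)]

def mean_error(chron_age, dnam_age):
    """Mean of (DNAm age - chronological age); negative values indicate underestimation."""
    return mean(d - c for c, d in zip(chron_age, dnam_age))

# Hypothetical ages in years, not data from the PURE-SA-NW study.
chron_age = [50, 60, 70, 80]
dnam_age = [48, 56, 63, 70]

slope, intercept = fit_line(chron_age, dnam_age)
residuals = age_acceleration(chron_age, dnam_age)
error = mean_error(chron_age, dnam_age)
```

With these illustrative values the mean error is negative and the fitted slope is below 1, which is the pattern expected when a clock underestimates age more strongly in older participants.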
- Health Sciences