A map of metabolic phenotypes in patients with myalgic encephalomyelitis/chronic fatigue syndrome

Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a debilitating disease usually presenting after infection. Emerging evidence supports that energy metabolism is affected in ME/CFS, but a unifying metabolic phenotype has not been firmly established. We performed global metabolomics, lipidomics, and hormone measurements, and we used exploratory data analyses to compare serum from 83 patients with ME/CFS and 35 healthy controls. Some changes were common in the patient group, and these were compatible with effects of elevated energy strain and altered utilization of fatty acids and amino acids as catabolic fuels. In addition, a set of heterogeneous effects reflected specific changes in 3 subsets of patients, and 2 of these expressed characteristic contexts of deregulated energy metabolism. The biological relevance of these metabolic phenotypes (metabotypes) was supported by clinical data and independent blood analyses. In summary, we report a map of common and context-dependent metabolic changes in ME/CFS, and some of them presented possible associations with clinical patient profiles. We suggest that elevated energy strain may result from exertion-triggered tissue hypoxia and lead to systemic metabolic adaptation and compensation. Through various mechanisms, such metabolic dysfunction represents a likely mediator of key symptoms in ME/CFS and possibly a target for supportive intervention.

The data were preprocessed using conventional procedures before we identified metabolites that differed significantly between ME/CFS patients and healthy subjects and clustered subjects with similar patterns of metabolite concentration shifts. We chose an exploratory strategy for our data analysis and therefore applied t-test statistics as a filter method to select relevant features for multivariate statistics. This approach is known to be susceptible to overfitting the PCA results, so the results should be interpreted with caution. We addressed this potential bias by performing a PCA on all 610 metabolites within the dataset, which revealed that the proposed metabotypes remain highly relevant for explaining the observed variance within both sexes for the entire dataset (Section 2, below). We concluded that our EDA strategy returned statistically and biologically meaningful results, supported by patient data, lipidomics data and supplementary laboratory data. The three ME/CFS subsets with different metabotypes emerged when we focused on compounds of known identity (excluding molecules that were only partially characterized and xenobiotic substances). Although we cannot exclude some influence, the subsets were not explained by differences in age, sex, dietary supplements or medication (Sections 3-5, below). Nor did we find that fasting state influenced the ME/CFS subsets, supporting the use of non-fasting blood draws as recently recommended for metabolic profiling (3,4).
Previous broad-spectrum metabolomics studies primarily aimed to find potential biomarkers, and therefore identified metabolites with the highest statistical probability of predicting ME/CFS (5-9). Since our aim was to gain broad insight into pathways that might be involved, we used a relatively low stringency level for statistical significance when comparing single molecule concentrations in ME/CFS patients and HC subjects (p<0.05). Such differences in study design, as well as differences in analytical technologies, cohort size and patient characteristics, represent plausible reasons why the statistically significant metabolites differ somewhat between the various studies. In addition, the presence of different metabolic phenotypes may explain why the effects vary more for some metabolites than others within the ME/CFS group, and when comparing different cohorts and studies. For example, our findings may indicate that BMI influences context-dependent metabolic adaptations. Notably, we found that the two main ME/CFS metabotype subsets sometimes expressed opposing effects, and this neutralized the potential effect at the group level for some metabolites.

Possible impact of statistical method
Missing value imputation: We excluded metabolites with more than 25% missing values and applied the half-minimum (HM) method for imputation. To evaluate whether missing value imputation could have skewed the results, we performed Fisher's exact and chi-squared analyses between different groups (ME vs HC, sex, metabotype) (Supplemental Data Set 3). This indicated that around 2-7% of the 610 metabolites had a non-random distribution of missing values, depending on the type of comparison. For the 159 significant variables from the univariate analyses, a similar percentage of metabolites displayed a non-random distribution of missing values. These results indicate that missing value imputation had negligible impact on the overall results of the multivariate analyses.
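The exclusion and imputation rule described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: the function name `filter_and_impute`, the dictionary layout, and the toy values are assumptions made for the example, and missing values are represented as `None`.

```python
def filter_and_impute(table, max_missing=0.25):
    """Drop sparse metabolites, then apply half-minimum imputation.

    table maps a metabolite name to a list of values (None = missing).
    The layout and names are hypothetical, chosen only for illustration.
    """
    result = {}
    for name, values in table.items():
        observed = [v for v in values if v is not None]
        missing_fraction = (len(values) - len(observed)) / len(values)
        # Exclude metabolites with more than 25% missing values.
        if missing_fraction > max_missing:
            continue
        # Half-minimum: each missing value becomes half the smallest observed value.
        fill = min(observed) / 2
        result[name] = [fill if v is None else v for v in values]
    return result


# Toy data (made up): one metabolite at exactly 25% missing, one at 50%.
example = {
    "metabolite_a": [4.0, None, 2.0, 6.0],   # kept; the gap is filled with 2.0 / 2
    "metabolite_b": [None, None, 3.0, 5.0],  # excluded by the 25% rule
}
cleaned = filter_and_impute(example)
```

Because the fill value depends only on the smallest observed value, a metabolite whose missingness differs between groups can shift group means, which is why the distribution of missing values across groups was checked.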
Filter method: The K-means algorithm that we applied for EDA purposes does not incorporate feature selection when applied outside a wrapper/embedded ML model, so a meaningful result depends on a prior feature selection step. We used t-test statistics as the filter method for this step, providing the 159 variables used for the multivariate analyses. To evaluate whether this caused overfitting of the results, we performed PCA on all 610 variables. The PCA on all 610 metabolites with an overlay of the proposed metabotypes (Figure 1 below) reproduced the influence of metabotypes in the total cohort, as well as in females and males separately. This supported that the choice of filter method returned adequate data. However, the limitations of univariate feature selection for clustering and identifying patient subtypes in high-dimensional datasets are a well-known issue in the statistical community (10,11). The variable selection approach applied in this study could exclude subtle discriminating features that display a stronger joint effect with other related variables. There are also apparent drawbacks in identifying the exact number of patient subtypes with conventional clustering methods: these methods can introduce bias from the investigator, and techniques for identifying the number of clusters show between-method variation in the estimation of k.
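The filter-then-cluster idea, and one of its limitations, can be illustrated with a small sketch. It is not the analysis code of the study: the data are made up, the helper names are hypothetical, a cutoff on the absolute Welch t statistic stands in for the p < 0.05 criterion, and the K-means routine is a bare Lloyd's algorithm with caller-supplied starting centroids.

```python
import math


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def select_features(patients, controls, t_cutoff):
    """Filter step: keep feature indices whose |t| exceeds the cutoff (assumed criterion)."""
    keep = []
    for j in range(len(patients[0])):
        t = welch_t([row[j] for row in patients], [row[j] for row in controls])
        if abs(t) > t_cutoff:
            keep.append(j)
    return keep


def kmeans(points, start_centroids, n_iter=20):
    """Lloyd's algorithm; returns one cluster label per point."""
    centroids = [list(c) for c in start_centroids]
    labels = []
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (made up): rows are subjects, columns are three features.
patients = [[5.0, 1.0, 9.0], [5.2, 1.2, 9.1], [8.0, 0.9, 2.0], [8.1, 1.1, 2.2]]
controls = [[1.0, 1.0, 5.0], [1.2, 1.1, 5.2], [0.9, 0.9, 5.1], [1.1, 1.2, 4.9]]

selected = select_features(patients, controls, t_cutoff=2.0)
reduced = [[row[j] for j in selected] for row in patients]
labels = kmeans(reduced, start_centroids=[reduced[0], reduced[-1]])
```

In this toy example the third feature separates the two patient subsets in opposite directions relative to controls, so its group-level difference is small and the univariate filter drops it. The clustering still recovers the two subsets from the first feature, but the example shows how a filter of this kind can discard subset-specific signal.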
The statistical method was supported by biologically relevant results. The different metabolic profiles found in the ME/CFS patient subsets were validated by multiple independent measurements. Further, the findings were consistent with expected effects of relevant biological contexts, based on available literature.

Possible impact of sex
The possible influence of sex on the clustering results was evaluated at several levels. First, there was no significant difference in sex composition between 1) the overall HC and ME/CFS groups (Fisher's exact test; Table 1 in the article), or 2) the HC group and the ME/CFS metabotype subsets (ME-M1, ME-M2 and ME-M3) (chi-squared test and Fisher's exact test; Table 1 in the article). Next, separate multivariate analyses using the 159 significant metabolites from the original univariate analysis confirmed that the K-means clustering patterns were largely maintained in both the male and female subsets compared to the mixed group (Figure 2 below). Further, the metabotypes represented different subsets in both females and males in separate PCA plots, when overlaying the original metabotype annotations (Figure 2 below). We also performed separate univariate comparisons of the metabolomics data (Supplemental Data Set 4) and laboratory analyses (Supplemental Data 2) in females and males. Our conclusion is that the presented ME/CFS metabotypes are not driven by sex, and are adequately expressed in both females and males.
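For readers unfamiliar with the test used for the group composition comparisons, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table. The function name and the counts are hypothetical and do not come from Table 1; the p-value is computed by summing the hypergeometric probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


# Hypothetical counts: rows are two groups, columns are female / male.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With small cell counts, as in some of the metabotype subsets, an exact test of this kind is generally preferred over the chi-squared approximation.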

Possible impact of BMI
There was no significant difference in mean BMI between the HC and ME/CFS groups. However, mean BMI was significantly higher in the ME-M2 patients compared to the ME-M1 patients (25.7 vs 23.1) (Table 1 in the article). In order to perform categorical testing, we divided the patients into those with BMI < 25 (n=48) and BMI > 25 (n=35), and found that ME-M2 had a significantly higher proportion of subjects with BMI > 25 compared to HC (Fisher's exact test, p<0.05). To evaluate the influence of BMI on the results of the multivariate analysis, Pearson correlation analysis was performed between BMI and the principal components of the PCA (Figure 3, below). There was a significant, yet weak, association between BMI and PC2 (r = 0.32, R² = 0.102, P < 0.05). To further investigate whether the results were driven by high BMI, K-means clustering and PCA were repeated after exclusion of subjects with BMI > 25 (Figure 4, below). This was performed both at the group level and separately for females and males. The results were largely consistent with the findings in the total cohort, and when dividing by sex. Hence, although we cannot exclude the possibility that there is some influence of BMI, our evaluation concludes that BMI is unlikely to be a primary determinant of the observed patterns.
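The relationship between the reported correlation coefficient and the explained variance can be made concrete with a short sketch. The BMI and PC2 values below are made up for illustration and the function name is hypothetical; only the arithmetic linking r and R² reflects the numbers reported above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical BMI values and PC2 scores for five subjects.
bmi = [21.0, 23.5, 25.0, 27.5, 30.0]
pc2 = [-0.8, 0.1, -0.2, 0.6, 0.3]

r = pearson_r(bmi, pc2)
r_squared = r ** 2  # share of PC2 variance linearly associated with BMI

# The reported r = 0.32 corresponds to R² = 0.1024, i.e. about 10% of the variance.
reported_r_squared = 0.32 ** 2
```

An R² of about 0.10 means that roughly 90% of the variance in PC2 is not linearly associated with BMI, which is the basis for describing the association as weak.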

Possible impact of fasting, diet and drugs
Fasting: The 12 patients who fasted overnight before sampling were distributed randomly between the three subsets (Table 1 in the article). This supports that fasting state has only a minor influence on the proposed ME/CFS metabotypes.

Diet and drugs:
To evaluate whether the suggested metabotypes were associated with systematic differences in diet and medication, we took advantage of the data on 185 xenobiotic molecules that were available in the global metabolomics dataset (Supplemental Data Set 1, sheet 11). To assess the use of specific drug classes, we counted the subjects having detectable levels of the associated drug derivatives. If a drug had several metabolites, we used the metabolite with the highest occurrence (Drug Table 1, below). The drug metabolite data were also compared with the drugs documented in the patient journal (Drug Table 2, below). As expected, the findings generally agreed with larger consumption of supportive drugs in ME/CFS patients relative to the HC group. However, the data indicated no significant differences in drug use between the ME/CFS metabotype subsets regarding analgesics, gastroesophageal reflux drugs, allergy medication and anti-epileptics. Yet, it appears that our strategy based on drug metabolites may have overestimated the number of users of certain drugs. For instance, metoprolol (or its metabolites) was detected in 22.9% of the patients and 8.6% of the HC subjects, whereas only 3.6% of the patients had metoprolol documented in their medical files. Further, paracetamol derivatives were detected in 84.4% and 84.2% of ME-M1 and ME-M2 patients, respectively, and in 97.1% of the HC group. For all other analgesics there were proportionally fewer observations in the HC group. Zolpidem (a sedative/hypnotic drug used for anxiety/sleep) was detected in a larger proportion of ME-M2 patients (15.8%) compared to ME-M1 patients (3.1%), but this was not statistically significant due to few observations. Evaluation of xenobiotic compounds originating from foods revealed lower levels of metabolites related to peppers (such as piperine and 2-piperidinone) and caffeine in ME/CFS patients compared to HC subjects (Figure 4, below).
There were also lower levels of a metabolite related to consumption of cereal and milk products (methyl glucopyranoside) in the patients. The general tendency toward low levels of dietary xenobiotics may reflect a low total dietary load, as would be expected with limited caloric expenditure due to the debilitating condition. Of importance for the present study, the dietary xenobiotic patterns were similar among the ME-M1 and ME-M2 subsets, minimizing the probability that a systematic difference in xenobiotics causes the observed metabolic phenotypes.

Figure 4: Food components: Heatmap with fold change values relative to healthy controls of xenobiotics (excluding drug components) originating from food consumption. Significant differences relative to healthy controls are color coded with red (increased), blue (decreased), with light (p < 0.05) or dark (FDR < 0.05) shades, and gray (no significant change).
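The counting rule used for the drug classes (one representative metabolite per drug, chosen by highest occurrence) can be sketched as follows. The function name, the data layout, and the metabolite names are hypothetical and serve only to make the rule explicit; detection is represented as a boolean per subject.

```python
def drug_user_fraction(detections, drug_metabolites):
    """Estimate the fraction of subjects using each drug.

    detections: metabolite name -> list of booleans, one per subject (True = detected).
    drug_metabolites: drug name -> list of metabolite names belonging to that drug.
    Both structures are assumptions made for this illustration.
    """
    fractions = {}
    for drug, metabolites in drug_metabolites.items():
        # Count subjects with a detectable level of each metabolite of the drug.
        counts = {m: sum(detections[m]) for m in metabolites}
        # Use the metabolite with the highest occurrence as the proxy for drug use.
        best = max(counts, key=counts.get)
        fractions[drug] = (best, counts[best] / len(detections[best]))
    return fractions


# Made-up example: one drug with two measured derivatives in four subjects.
detections = {
    "drug_x_parent": [True, False, False, True],
    "drug_x_glucuronide": [True, True, False, True],
}
drug_metabolites = {"drug_x": ["drug_x_parent", "drug_x_glucuronide"]}
result = drug_user_fraction(detections, drug_metabolites)
```

Choosing the most frequently detected metabolite maximizes sensitivity, which may contribute to the overestimation of users noted for some drugs when compared with the medical files.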