A computational framework to integrate high-throughput ‘-omics’ datasets for the identification of potential mechanistic links

Helle Krogh Pedersen, Sofia K. Forslund, Valborg Gudmundsdottir, Anders Østergaard Petersen, Falk Hildebrand, Tuulia Hyötyläinen, Trine Nielsen, Torben Hansen, Peer Bork, S. Dusko Ehrlich, Søren Brunak, Matej Oresic, Oluf Pedersen*, Henrik Bjørn Nielsen

*Corresponding author for this work

Research output: Contribution to journal; Article; peer-review

55 Citations (Scopus)

Abstract

We recently presented a three-pronged association study that integrated human intestinal microbiome data derived from shotgun-based sequencing with untargeted serum metabolome data and measures of host physiology. Metabolome and microbiome data are high dimensional, posing a major challenge for data integration. Here, we present a step-by-step computational protocol that details and discusses the dimensionality-reduction techniques used and methods for subsequent integration and interpretation of such heterogeneous types of data. Dimensionality reduction was achieved through a combination of data normalization approaches, binning of co-abundant genes and metabolites, and integration of prior biological knowledge. The use of prior knowledge to overcome functional redundancy across microbiome species is one central advance of our method over available alternative approaches. Applying this framework, other investigators can integrate various ‘-omics’ readouts with variables of host physiology or any other phenotype of interest (e.g., connecting host and microbiome readouts to disease severity or treatment outcome in a clinical cohort) in a three-pronged association analysis to identify potential mechanistic links to be tested in experimental settings. Although we originally developed the framework for a human metabolome–microbiome study, it is generalizable to other organisms and environmental metagenomes, as well as to studies including other -omics domains such as transcriptomics and proteomics. The provided R code runs in ~1 h on a standard PC.
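The binning and association ideas summarized in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' R code and does not reproduce their protocol: the greedy binning rule, the 0.8 correlation threshold, the median summary, the feature names, and all data values below are assumptions chosen purely for illustration. It groups co-abundant features by Spearman correlation, collapses each bin to one summary profile, and correlates that profile with a phenotype.

```python
from statistics import mean, median


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of sorted positions i..j
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def bin_coabundant(features, threshold=0.8):
    """Greedy binning (an assumption, not the published method):
    a feature joins the first bin whose seed it correlates with."""
    bins = []  # each bin is a list of feature names; the first is the seed
    for name, profile in features.items():
        for members in bins:
            if spearman(features[members[0]], profile) >= threshold:
                members.append(name)
                break
        else:
            bins.append([name])
    return bins


def summarize(features, members):
    """Collapse a bin to one profile: the per-sample median of its members."""
    return [median(col) for col in zip(*(features[m] for m in members))]


# Hypothetical toy data: 4 features measured in 6 samples.
features = {
    "gene_a": [1, 2, 3, 4, 5, 6],
    "gene_b": [2, 3, 5, 6, 8, 9],
    "gene_c": [6, 5, 4, 3, 2, 1],
    "gene_d": [9, 7, 6, 4, 3, 1],
}
phenotype = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5]  # hypothetical host measurement

bins = bin_coabundant(features)
associations = {
    tuple(b): spearman(summarize(features, b), phenotype) for b in bins
}
```

Here four features collapse into two bins, so only two association tests are needed instead of four. A real analysis would also require normalization, a more robust clustering method, and multiple-testing correction, none of which this sketch attempts.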

Original language: English
Pages (from-to): 2781-2800
Number of pages: 20
Journal: Nature Protocols
Volume: 13
Issue number: 12
Publication status: Published - 1 Dec 2018

Bibliographical note

Funding Information:
This research received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013): MetaHIT, grant agreement HEALTH-F4-2007-201052 and MetaCardis, grant agreement HEALTH-2012-305312. The Department of Bio and Health Informatics, Technical University of Denmark, and the Novo Nordisk Foundation Center for Basic Metabolic Research have in addition received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement no. 115317 (DIRECT), the resources of which are composed of financial contributions from the European Union’s Seventh Framework Programme (FP7/2007–2013) and EFPIA companies’ in kind contribution. The Novo Nordisk Foundation Center for Protein Research received funding from the Novo Nordisk Foundation (grant agreement NNF14CC0001). The Novo Nordisk Foundation Center for Basic Metabolic Research is an independent research center at the University of Copenhagen partially funded by an unrestricted donation from the Novo Nordisk Foundation (http://www.metabol.ku.dk). A.Ø.P. received funding from the Lundbeck Foundation (grant R218-2016-1367) and S.D.E. received funding from Agence Nationale de la Recherche MetaGenoPolis grant ‘Investissements d’avenir’ ANR-11-DPBS-0001.

Publisher Copyright:
© 2018, Springer Nature Limited.
