Computational Systems Biology of Pathogen-Host Interactions

Edited by: Saliha Durmuş, Tunahan Çakır and Reinhard Guthke
Published in: Frontiers in Microbiology
May 2016 | Computational Systems Biology of Pathogen-Host Interactions | Frontiers in Microbiology

Frontiers Copyright Statement
© Copyright 2007-2016 Frontiers Media SA. All rights reserved. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers. Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.
For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714
ISBN 978-2-88919-821-4
DOI 10.3389/978-2-88919-821-4
Topic Editors:
Saliha Durmuş, Gebze Technical University, Turkey
Tunahan Çakır, Gebze Technical University, Turkey
Reinhard Guthke, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knoell-Institute, Germany

Visualization of the 3D cuboid environment of the agent-based model that corresponds to 1 μl of the whole-blood infection assay, containing 5000 polymorphonuclear neutrophils, 500 monocytes, and 1000 Candida albicans cells. Image taken from: Lehnert T, Timme S, Pollmächer J, Hünniger K, Kurzai O and Figge MT (2015) Bottom-up modeling approach for the quantitative estimation of parameters in pathogen-host interactions. Front. Microbiol. 6:608. doi: 10.3389/fmicb.2015.00608

A thorough understanding of pathogenic microorganisms and their interactions with host organisms is crucial to prevent infectious threats due to the fact that Pathogen-Host Interactions (PHIs) have critical roles in initiating and sustaining infections.
Therefore, the analysis of infection mechanisms through PHIs is indispensable for identifying diagnostic biomarkers and next-generation drug targets, and then for developing novel strategies against drug resistance and for personalized therapy. Traditional approaches are limited in capturing mechanisms of infection since they investigate hosts or pathogens individually. The systems biology approach, in contrast, focuses on the whole PHI system and is more promising in capturing infection mechanisms. Here, we bring together studies in the sections listed below to present the current picture of research on Computational Systems Biology of Pathogen-Host Interactions:

- Computational Inference of PHI Networks using Omics Data
- Computational Prediction of PHIs
- Text Mining of PHI Data from the Literature
- Mathematical Modeling and Bioinformatic Analysis of PHIs

Computational Inference of PHI Networks using Omics Data

Gene regulatory, metabolic and protein-protein networks of PHI systems are crucial for a thorough understanding of infection mechanisms. Great advances in molecular biology and biotechnology have made it possible to produce the related omics data experimentally. Many computational methods are emerging to infer molecular interaction networks of PHI systems from the corresponding omics data.

Computational Prediction of PHIs

Because experimentally determined PHI data are scarce, many computational methods have been developed for the prediction of pathogen-host protein-protein interactions. Although growing, the currently available experimental PHI data are far from complete for a systems view of infection mechanisms through PHIs. Computational methods are therefore the main tools to predict new PHIs, and the development of new computational methods is of great interest.
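As a rough illustration of what such a prediction method can look like, the sketch below transfers known interactions by homology (often called the interolog approach): if two template proteins are known to interact, their orthologs in the pathogen and host of interest are proposed as a candidate PHI. This is only one family of methods and is not taken from any specific article in this e-book; the function name `predict_interologs` and all protein identifiers are hypothetical and chosen for illustration.

```python
def predict_interologs(template_ppis, pathogen_orthologs, host_orthologs):
    """Propose candidate pathogen-host pairs from known template interactions.

    template_ppis:      iterable of (protein_a, protein_b) known interactions
    pathogen_orthologs: dict mapping a template protein to its orthologs
                        in the pathogen of interest
    host_orthologs:     dict mapping a template protein to its orthologs
                        in the host of interest
    All inputs are hypothetical illustration data, not a real database schema.
    """
    predicted = set()
    for a, b in template_ppis:
        # Template pairs are undirected, so try both orientations.
        for first, second in ((a, b), (b, a)):
            for pathogen_protein in pathogen_orthologs.get(first, ()):
                for host_protein in host_orthologs.get(second, ()):
                    predicted.add((pathogen_protein, host_protein))
    return predicted


# Toy example with made-up identifiers.
template_ppis = [("tP1", "tH1"), ("tH2", "tP2"), ("tP3", "tH3")]
pathogen_orthologs = {"tP1": ["pathA"], "tP2": ["pathB", "pathC"]}
host_orthologs = {"tH1": ["hostX"], "tH2": ["hostY"]}

predicted = predict_interologs(template_ppis, pathogen_orthologs, host_orthologs)
for pair in sorted(predicted):
    print(pair)
```

Predictions of this kind are hypotheses only; in practice they are usually filtered with further evidence such as gene function or cellular localization before being used.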
Text Mining of PHI Data from the Literature

Despite the recent development of many PHI-specific databases, most data relevant to PHIs are still buried in the biomedical literature, which demands the use of text mining techniques to unravel them. Only a few efforts have been made toward this aim. Therefore, the development of novel text mining methods specific to PHI data retrieval is of key importance for efficient use of the available literature.

Mathematical Modeling and Bioinformatic Analysis of PHIs

After PHI networks have been reconstructed experimentally and/or computationally, their mathematical modeling and detailed computational analysis using bioinformatics tools are required to gain insights into infection mechanisms. Bioinformatics methods are increasingly applied to analyze the growing amount of experimentally determined and computationally predicted PHI data.

Acknowledgements

We, the editors of this e-book, acknowledge Emrah Nikerel (Yeditepe University, Turkey) and Arzucan Özgür (Boğaziçi University, Turkey) for their contributions during the initiation of the Research Topic.

Citation: Durmuş, S., Çakır, T., Guthke, R., eds. (2016). Computational Systems Biology of Pathogen-Host Interactions. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-821-4

Table of Contents

- Editorial: Computational Systems Biology of Pathogen-Host Interactions
  Saliha Durmuş, Tunahan Çakır and Reinhard Guthke
- A review on computational systems biology of pathogen–host interactions
  Saliha Durmuş, Tunahan Çakır, Arzucan Özgür and Reinhard Guthke
- Computational prediction of molecular pathogen-host interactions based on dual transcriptome data
  Sylvie Schulze, Sebastian G. Henkel, Dominik Driesch, Reinhard Guthke and Jörg Linde
- Reconstruction of the temporal signaling network in Salmonella-infected human cells
  Gungor Budak, Oyku Eren Ozsoy, Yesim Aydin Son, Tolga Can and Nurcan Tuncbag
- Computational approaches for prediction of pathogen-host protein-protein interactions
  Esmaeil Nourani, Farshad Khunjush and Saliha Durmuş
- Integrated inference and evaluation of host–fungi interaction networks
  Christian W. Remmele, Christian H. Luther, Johannes Balkenhol, Thomas Dandekar, Tobias Müller and Marcus T. Dittrich
- Literature Mining and Ontology based Analysis of Host-Brucella Gene–Gene Interaction Network
  İlknur Karadeniz, Junguk Hur, Yongqun He and Arzucan Özgür
- Cell scale host-pathogen modeling: another branch in the evolution of constraint-based methods
  Neema Jamshidi and Anu Raghunathan
- Host-pathogen interactions between the human innate immune system and Candida albicans: understanding and modeling defense and evasion strategies
  Sybille Dühring, Sebastian Germerodt, Christine Skerka, Peter F. Zipfel, Thomas Dandekar and Stefan Schuster
- Ebola virus infection modeling and identifiability problems
  Van Kinh Nguyen, Sebastian C. Binder, Alessandro Boianelli, Michael Meyer-Hermann and Esteban A. Hernandez-Vargas
- Biomarker-based classification of bacterial and fungal whole-blood infections in a genome-wide expression study
  Andreas Dix, Kerstin Hünniger, Michael Weber, Reinhard Guthke, Oliver Kurzai and Jörg Linde
- Bioinformatic and mass spectrometry identification of Anaplasma phagocytophilum proteins translocated into host cell nuclei
  Sara H. G. Sinclair, Jose C. Garcia-Garcia and J. Stephen Dumler
- Bottom-up modeling approach for the quantitative estimation of parameters in pathogen-host interactions
  Teresa Lehnert, Sandra Timme, Johannes Pollmächer, Kerstin Hünniger, Oliver Kurzai and Marc Thilo Figge
- Deciphering chemokine properties by a hybrid agent-based model of Aspergillus fumigatus infection in human alveoli
  Johannes Pollmächer and Marc Thilo Figge
- Automated quantification of the phagocytosis of Aspergillus fumigatus conidia by a novel image analysis algorithm
  Kaswara Kraibooj, Hanno Schoeler, Carl-Magnus Svensson, Axel A. Brakhage and Marc Thilo Figge

EDITORIAL
published: 04 February 2016
doi: 10.3389/fmicb.2016.00021
Frontiers in Microbiology | www.frontiersin.org | February 2016 | Volume 7 | Article 21

Edited by: Rustam Aminov, Technical University of Denmark, Denmark
Reviewed by: Chuang Ma, Northwest Agricultural and Forestry University, China
*Correspondence: Saliha Durmuş, salihadurmus@gtu.edu.tr
Specialty section: This article was submitted to Infectious Diseases, a section of the journal Frontiers in Microbiology
Received: 10 December 2015
Accepted: 11 January 2016
Published: 04 February 2016
Citation: Durmuş S, Çakır T and Guthke R (2016) Editorial: Computational Systems Biology of Pathogen-Host Interactions. Front. Microbiol. 7:21.
doi: 10.3389/fmicb.2016.00021

Editorial: Computational Systems Biology of Pathogen-Host Interactions

Saliha Durmuş 1*, Tunahan Çakır 1 and Reinhard Guthke 2

1 Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
2 Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany

Keywords: pathogen-host interaction, computational systems biology, bioinformatics, omics data, network inference, text mining, constraint-based modeling, image-based systems biology

The Editorial on the Research Topic
Computational Systems Biology of Pathogen-Host Interactions

Pathogen-Host Interactions (PHIs) play a significant role in the mechanisms of infections. Therefore, the investigation of infection mechanisms through PHIs is a crucial step toward developing novel and more effective solutions against drug resistance and for personalized therapy. To this end, the systems biology approach considers the whole PHI system instead of focusing on hosts or pathogens individually. Computational modeling and analysis have a vital place within the whole systems biology workflow (the cyclic operation of experimental and modeling work). Multi-scale modeling provides the holistic view needed in the investigation of pathogen-host molecular interactions. However, it is usually very difficult to identify the model structure and parameters for complex multi-scale models. On the other hand, focused modeling types require more stringent and advanced feature selection approaches.

This research topic aims to provide examples from the current picture of the research on computational systems biology of PHIs. The papers included here review recent studies or present original research on computational inference of PHI networks, computational prediction of PHIs, text mining of PHI data from the literature, and mathematical modeling and computational analysis of PHI networks.
This research topic presents three review papers, 10 original research articles, and one technology report.

Opening this research topic, we provide a comprehensive review of the studies on computational systems biology of PHIs (Durmuş et al.). We focus on the computational methods for the inference of molecular interaction networks of PHI systems, bioinformatic analysis of PHI networks, the Web-based PHI databases, and text-mining efforts to extract PHI data hidden in the literature. In this sense, this review provides a systems perspective on which the other articles covered in this research topic are based.

PHI NETWORK INFERENCE USING OMICS DATA

Schulze et al. deal with the challenge of inferring inter-species gene regulatory networks from dual transcriptomic data. They use an extended version of NetGenerator, a tool based on ordinary differential equations (ODEs) that predicts gene regulatory networks from gene expression time series data (Guthke et al., 2005; Tierney et al., 2012).

Budak et al. use a temporal phosphoproteomic dataset of Salmonella-infected human cells (Rogers et al., 2011) to reconstruct the temporal signaling network of the human host by integrating protein-protein interaction (PPI) data with the phosphoproteomic data. The Prize-collecting Steiner Forest approach and an Integer Linear Programming based edge inference approach are employed. The complementary use of both methods leads to a network that conserves the information about temporality and the direction of interactions, while revealing hidden entities in the signaling.

COMPUTATIONAL PREDICTION OF PHIs

Despite recent advances, experimentally determined PHI data are still scarce, and computational prediction is currently a valuable source of PHI data. Computational prediction primarily exploits sequence information, protein structure and known interactions.
Machine learning techniques are used when enough known interactions are available to serve as training data. In the opposite case, transfer and multitask learning methods are preferred. Nourani et al. provide an overview of these approaches for predicting PHIs.

Experimentally verified data on fungi-host interactions are rare in the literature and in the PHI databases. Remmele et al. reconstruct large-scale PHI networks for the fungal pathogens Aspergillus fumigatus and Candida albicans and their human and mouse hosts. For this purpose, they developed and used a computational PHI prediction method based on protein orthology, PPI data, and data on gene functions and cellular localization.

TEXT MINING OF PHI DATA

The emergence of large-scale experimental PHI data has led to the development of PHI databases such as VirusMentha (Calderone et al., 2015), VirHostNet (Guirimand et al., 2015), PATRIC (Wattam et al., 2014), HPIDB (Kumar and Nanduri, 2010), and PHISTO (Durmuş Tekir et al., 2013). Nevertheless, most data regarding PHIs are still buried in articles and have not been stored in databases. Karadeniz et al. extend the text mining tool SciMiner, originally developed for extracting intra-species molecular interactions, to inter-species PHIs. They use SciMiner to extract host-Brucella gene-gene interactions, which are further analyzed by ontology modeling.

MATHEMATICAL MODELING AND BIOINFORMATIC ANALYSIS OF PHIs

A few examples of constraint-based PHI models are currently available in the literature. However, a definitive description of the methodology required for the functional integration of genome-scale metabolic models into PHI models is lacking. Jamshidi and Raghunathan outline a systematic procedure to produce functional PHI models, highlighting steps that require debugging and iterative revisions in order to successfully build a functional model.
The construction of such models will enable the exploration of PHIs by leveraging the growing wealth of omics data in order to better understand mechanisms of infection and identify novel therapeutic strategies.

Dühring et al. describe the cross-talk between the fungal pathogen C. albicans and the human innate immune system. They review computational systems biology approaches to model and investigate these complex interactions, with a special focus on fungal immune evasion and on game-theoretical and agent-based models.

Nguyen et al. use ODEs to represent the basic interactions between Ebola virus and wild-type Vero cells, i.e., epithelial cells of green monkeys, in vitro. The parameters of the viral kinetics are estimated, leading to a first mathematical model for Ebola virus infection.

Dix et al. examine the transcriptional footprint of the host in response to the bacterial pathogens Staphylococcus aureus and Escherichia coli and the fungal pathogens C. albicans and A. fumigatus in a human whole-blood model. Expression data are exploited to build a random forest classifier that determines whether a sample contains a bacterial, fungal, or mock infection.

Sinclair et al. develop a method combining in silico prediction of bacterial nucleomodulins, i.e., proteins targeted to the host cell nucleus, and iTRAQ protein profiling (a mass spectrometric technique in which two protein expression profiles are compared) to identify potential bacterial-derived nuclear-translocated proteins that could impact transcriptional programming in host cells. This approach was applied to intracellular bacteria such as Anaplasma phagocytophilum, Mycobacterium tuberculosis, and Chlamydia trachomatis.

Finally, the research topic includes articles focusing on image-based systems biology of PHIs. While advances in omics techniques drive the progress of systems biology at the molecular level, there is also significant progress at the cellular level based on spatio-temporal data, e.g., microscopy images.
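To make the ODE-based viral kinetics modeling mentioned above for Nguyen et al. more concrete, the following is a minimal sketch of a target-cell-limited model, a form commonly used for in vitro infection kinetics. The editorial does not spell out the equations, so the model structure here (uninfected cells U, infected cells I, virus V, with dU/dt = -βUV, dI/dt = βUV - δI, dV/dt = pI - cV) is an assumption, and all parameter values are made up for illustration rather than estimates from the article. The function name `simulate_tcl` is likewise hypothetical.

```python
def simulate_tcl(beta, delta, p, c, u0, i0, v0, t_end, dt=0.01):
    """Integrate an assumed target-cell-limited model with classical RK4.

    dU/dt = -beta*U*V          (uninfected cells become infected)
    dI/dt =  beta*U*V - delta*I (infected cells die at rate delta)
    dV/dt =  p*I - c*V          (virus is produced and cleared)
    Returns a list of (t, U, I, V) tuples.
    """
    def rhs(state):
        u, i, v = state
        new_infections = beta * u * v
        return (-new_infections, new_infections - delta * i, p * i - c * v)

    def shifted(state, deriv, h):
        # State advanced by h along a given derivative (RK4 helper).
        return tuple(s + h * d for s, d in zip(state, deriv))

    state = (u0, i0, v0)
    trajectory = [(0.0,) + state]
    n_steps = int(round(t_end / dt))
    for k in range(1, n_steps + 1):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, dt / 2))
        k3 = rhs(shifted(state, k2, dt / 2))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(
            s + dt / 6 * (a + 2 * b + 2 * m + d)
            for s, a, b, m, d in zip(state, k1, k2, k3, k4)
        )
        trajectory.append((k * dt,) + state)
    return trajectory


# Illustrative, normalized parameter values (not fitted to any data).
traj = simulate_tcl(beta=2.0, delta=1.0, p=10.0, c=3.0,
                    u0=1.0, i0=0.0, v0=0.01, t_end=20.0)
peak_time, _, _, peak_virus = max(traj, key=lambda row: row[3])
print(f"viral load peaks near t = {peak_time:.2f}")
```

In a parameter estimation setting, a simulation like this would be wrapped in an objective function that compares the simulated viral load with measured titers; whether all parameters can be recovered from such data is the identifiability question raised in the article's title.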
Lehnert et al. apply non-spatial state-based modeling and agent-based modeling approaches to simulate an experimental assay for C. albicans infection of human blood. They predict cell migration parameters in 3D space, where monocytes, granulocytes, and C. albicans cells are treated as migrating and interacting agents.

Pollmächer and Figge implement a hybrid agent-based spatio-temporal modeling approach for A. fumigatus infection in human alveoli to decipher chemokine properties. Their model simulations show that the ratio of the chemokine secretion rate to the diffusion coefficient is the main indicator for the success of pathogen detection by alveolar macrophages.

Kraibooj et al. propose a novel image analysis algorithm for the automated quantification of the phagocytosis of two wild-type A. fumigatus strains. The strains were compared in terms of the phagocytosis process when the fungal conidia interact with alveolar macrophages.

The computational modeling of PHI networks of interacting genes, transcripts, proteins, and metabolites is crucial to elucidate the molecular mechanisms of infection. The experimental measurement of biomolecule levels via omics approaches, as well as the detection of PHIs via high-throughput experiments, has started to generate comprehensive datasets. The modeling of these large-scale data will not only elucidate the mechanisms of infection, but will also help in the discovery of biomarkers for novel diagnostic tools and of therapeutic drug targets through the identification of molecules essential for the pathogen. Despite recent efforts, the use of systems biology approaches to investigate PHI systems is still in its infancy, mostly because of data scarcity (Durmuş et al.). Ongoing studies in the field will certainly produce more large-scale PHI data in the near future.
Heterogeneous data sets (clinical, microbiological, chemical, and molecular data on different levels such as SNPs, transcriptome, proteome, FACS, microscopic, mass spectrometric, etc.) will be integrated. More complete PHI models will allow the integration of omics-based and image-based systems biology of infection and will pioneer more complex multi-scale models with different scales in space (from molecules/cells/tissues to organism/population) and time (from seconds to months). These more complex models will improve the PHI-based solutions to infectious diseases.

AUTHOR CONTRIBUTIONS

SD conceived the content and drafted the manuscript; TC and RG conceived the content and revised the manuscript.

ACKNOWLEDGMENTS

TC was supported by the Turkish Academy of Sciences - Outstanding Young Scientists Award Program (TÜBA-GEBİP). RG was supported by the Deutsche Forschungsgemeinschaft (DFG) in the Collaborative Research Centre/Transregio 124 FungiNet (subprojects B3 and INF).

REFERENCES

Calderone, A., Licata, L., and Cesareni, G. (2015). VirusMentha: a new resource for virus-host protein interactions. Nucleic Acids Res. 43, D588–D592. doi: 10.1093/nar/gku830

Durmuş Tekir, S., Çakir, T., Ardiç, E., Sayilirbaş, A. S., Konuk, G., Konuk, M., et al. (2013). PHISTO: pathogen–host interaction search tool. Bioinformatics 29, 1357–1358. doi: 10.1093/bioinformatics/btt137

Guirimand, T., Delmotte, S., and Navratil, V. (2015). VirHostNet 2.0: surfing on the web of virus/host molecular interactions data. Nucleic Acids Res. 43, D583–D587. doi: 10.1093/nar/gku1121

Guthke, R., Möller, U., Hoffmann, M., Thies, F., and Töpfer, S. (2005). Dynamic network reconstruction from gene expression data applied to immune response during bacterial infection. Bioinformatics 21, 1626–1634. doi: 10.1093/bioinformatics/bti226

Kumar, R., and Nanduri, B. (2010). HPIDB–a unified resource for host-pathogen interactions. BMC Bioinform. 11(Suppl. 6):S16. doi: 10.1186/1471-2105-11-S6-S16

Rogers, L. D., Brown, N. F., Fang, Y., Pelech, S., and Foster, L. J. (2011). Phosphoproteomic analysis of Salmonella-infected cells identifies key kinase regulators and SopB-dependent host phosphorylation events. Sci. Signal. 4, rs9. doi: 10.1126/scisignal.2001668

Tierney, L., Linde, J., Müller, S., Brunke, S., Molina, J. C., Hube, B., et al. (2012). An interspecies regulatory network inferred from simultaneous RNA-seq of Candida albicans invading innate immune cells. Front. Microbiol. 3:85. doi: 10.3389/fmicb.2012.00085

Wattam, A. R., Abraham, D., Dalay, O., Disz, T. L., Driscoll, T., Gabbard, J. L., et al. (2014). PATRIC, the bacterial bioinformatics database and analysis resource. Nucleic Acids Res. 42, D581–D591. doi: 10.1093/nar/gkt1099

Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Durmuş, Çakır and Guthke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
REVIEW
published: 09 April 2015
doi: 10.3389/fmicb.2015.00235
Frontiers in Microbiology | April 2015 | Volume 6 | Article 235

Edited by: Anna Norrby-Teglund, Karolinska Institutet, Sweden
Reviewed by: Marcio Luis Acencio, São Paulo State University, Brazil; Peter Schaap, Wageningen University, Netherlands
*Correspondence: Saliha Durmuş, Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, 41400 Gebze-Kocaeli, Turkey, salihadurmus@gtu.edu.tr
Specialty section: This article was submitted to Infectious Diseases, a section of the journal Frontiers in Microbiology
Received: 01 December 2014
Accepted: 10 March 2015
Published: 09 April 2015
Citation: Durmuş S, Çakır T, Özgür A and Guthke R (2015) A review on computational systems biology of pathogen–host interactions. Front. Microbiol. 6:235. doi: 10.3389/fmicb.2015.00235

A review on computational systems biology of pathogen–host interactions

Saliha Durmuş 1*, Tunahan Çakır 1, Arzucan Özgür 2 and Reinhard Guthke 3

1 Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
2 Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
3 Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knoell-Institute, Jena, Germany

Pathogens manipulate the cellular mechanisms of host organisms via pathogen–host interactions (PHIs) in order to take advantage of the capabilities of host cells, leading to infections. The crucial role of these interspecies molecular interactions in initiating and sustaining infections necessitates a thorough understanding of the corresponding mechanisms. Unlike the traditional approach of considering the host or pathogen separately, a systems-level approach, considering the PHI system as a whole, is indispensable to elucidate the mechanisms of infection.
Following the technological advances in the post-genomic era, PHI data have been produced on a large scale within the last decade. Systems biology-based methods for the inference and analysis of PHI regulatory, metabolic, and protein–protein networks, which shed light on infection mechanisms, are in increasing demand thanks to the availability of omics data. The knowledge derived from the PHIs may largely contribute to the identification of new and more efficient therapeutics to prevent or cure infections. There are recent efforts toward the detailed documentation of these experimentally verified PHI data through Web-based databases. Despite these advances in data archiving, there are still large amounts of PHI data in the biomedical literature yet to be discovered, and novel text mining methods are in development to unearth such hidden data. Here, we review a collection of recent studies on computational systems biology of PHIs with a special focus on the methods for the inference and analysis of PHI networks, covering also the Web-based databases and text-mining efforts to unravel the data hidden in the literature.

Keywords: pathogen–host interaction, computational systems biology, bioinformatics, omics data, protein–protein interaction, metabolic interaction, gene regulatory network, drug target

Introduction

Infectious diseases are one of the primary causes of death worldwide each year. Emerging and reemerging diseases and drug-resistant pathogens have made the problem more serious for human beings. Therefore, novel therapeutic strategies, called theranostics, are increasingly investigated to fight these biological threats. These strategic solutions require a systems biological approach with a thorough understanding of the underlying mechanisms of infections by focusing on molecular interactions between pathogenic and host organisms (Morens et al., 2004; Murali et al., 2011; Guthke et al., 2012; Durmuş Tekir and Ülgen, 2013).

Systems biology is an interdisciplinary research field in the life sciences focusing on the study of non-linear interactions among biological entities through the integration and combination of biomolecular and medical sciences with mathematical, computational, and engineering disciplines (Kitano, 2002). By modeling biological phenomena, systems biology uses a more holistic approach based on omics data instead of the traditional reductionism that focuses on only a few molecules and interactions.

The pathogen–host interactions (PHIs) may be between proteins, nucleotide sequences, metabolites, and small ligands. The protein–protein interactions (PPIs) have been identified as the most important type in the functioning of PHI systems and are therefore the most studied type (Stebbins, 2005; Korkin et al., 2011; Zoraghi and Reiner, 2013). However, non-coding RNAs (ncRNAs) and metabolites have also been reported to have critical functional roles in virus–host and bacteria–host interactions, respectively (Gottwein and Cullen, 2008; Skalsky and Cullen, 2010; Eisenreich et al., 2013; Saayman et al., 2014).

Different levels of omics data collected from pathogens and/or infected cells are crucial components that drive bioinformatic analyses facilitating the construction and analysis of infection-specific gene-regulatory, metabolic, and protein–protein networks (Westermann et al., 2012; Schulze et al., 2015). Such network-based computational systems biology analyses of PHI-based omics data enable the elucidation of infection mechanisms and their dynamics, the identification of potential drug targets for next-generation antimicrobial therapeutics, and the development of novel and personalized strategies for the prevention and treatment of infections.
With an increasing amount of experimental PHI data, Web-based databases were developed to derive and provide pathogen–host interactome data, usually focusing on specific pathogens or hosts (Wattam et al., 2014; Ako-Adjei et al., 2015; Calderone et al., 2015; Guirimand et al., 2015). Although the available databases are promising for data archiving, a huge amount of PHI data is not stored in any of them, since these data are buried in the literature. Therefore, there is an urgent need for novel text mining methods specific to PHI data retrieval.

In this paper, the efforts on the collection of PHI-based omics data are reviewed first. Next, a review of the computational systems biology analyses of three major types of PHI networks is provided. Then, the available PHI databases and the current snapshot of the literature on text mining for PHI data are presented.

Omics Data Reflecting PHI Networks

Systems biology approaches that use genome-wide molecular profiling with high-throughput techniques to generate omics data are changing the face of infection biology, together with computational methods for heterogeneous data management and integrative analysis via mathematical modeling (Guthke et al., 2012; Law et al., 2013). New insights into microbial and viral pathogenesis, in particular into the host's immune response to contact with pathogens, offer opportunities for better diagnostics, therapeutics, and vaccines. Thus, systems biology of infection can yield novel therapeutic targets (Sarker et al., 2013) and help establish individualized or personalized medicine. The integrative personal omics profile (iPOP) combines genomics, transcriptomics, proteomics, metabolomics, and autoantibody profiles from a single individual over a 14-month period (Chen et al., 2012; Li-Pook-Than and Snyder, 2013).

There are various platforms for the handling of measured data from samples, data storage and exchange, data pre-processing and data analysis.
Powerful platforms for data management in systems biology have recently become available and are being standardized step by step by the Functional Genomics Data Society¹ (FGED, founded in 1999 as MGED; Brazma et al., 2006). Several systems biology projects in Europe, including those dedicated to PHI research, use the SysMO-DB/SEEK system for sharing data, knowledge (including Standard Operating Procedures, SOPs), and mathematical models² (Wolstencroft et al., 2011). For the management of genomics, transcriptomics, and (2D-gel) proteomics data in infection research, the data warehouse ‘OmniFung’ was established to support research on fungi–host interactions³ (Albrecht et al., 2007, 2011).

The free, open-source, and open-development software project Bioconductor, which is primarily based on the statistical programming language R, provides 934 software packages as well as 894 annotation and 224 experimental data sets for the bioinformatic analysis and comprehension of high-throughput genomic data⁴ (Version 3.0). These packages, as well as other R packages not included in the Bioconductor project, are useful for the advanced, in particular integrative, analysis of omics data and the modeling of PHIs. Many data mining tools are available to identify genes, proteins, or metabolites of interest for biomarker discovery or drug target prediction by supervised machine learning methods. For instance, WEKA⁵ or RapidMiner⁶ is used to characterize the response of the host immune system by decision tree analysis of flow cytometric data (Simon et al., 2012). In addition, there are platforms and software tools for the integrative and explorative analysis and visualization of data from the different omics levels of PHIs (Horn et al., 2014).

PHI-Based Genome and Transcriptome Data

The genomic information from the host and the pathogen represents the basis for all further molecular analyses and bioinformatic investigations of PHI systems. Thus, genome sequencing is fundamental.
It helps to improve diagnosis, pathogen typing, the detection of virulence and antibiotic resistance, and the development of new vaccines and culture media. Single nucleotide polymorphism (SNP) typing is important both for identifying and characterizing pathogen variants (strains, clinical isolates) and for studying human susceptibility to certain infections. In the last decade there was, and in the future there will be, an explosion of genome sequence data.

¹ http://fged.org
² www.sysmo-db.org
³ www.omnifung.hki-jena.de
⁴ http://bioconductor.org
⁵ http://www.cs.waikato.ac.nz/ml/weka
⁶ www.rapidminer.de

Frontiers in Microbiology | www.frontiersin.org | April 2015 | Volume 6 | Article 235

The new sequencing technologies enable small research units to create huge genome datasets at low cost and in a short time. As a result, handling, comparing, and extracting useful information from millions of sequences becomes more and more challenging, i.e., increased efforts in computational biology are urgently needed. In particular, sequencing is used for the genomic and transcriptomic characterization of newly emerging pathogens. Phylogenetic studies based on whole-genome sequencing have implications for understanding the evolution of PHIs as well as for tracking and possibly preventing infectious diseases, as was done for enterotoxigenic Escherichia coli (ETEC), a major cause of infectious diarrhea (von Mentzer et al., 2014). Metagenomic and metatranscriptomic studies of pathogens revealed how pathogenic microorganisms adapt to hosts, e.g., plants (Guttman et al., 2014).

The first step of genome sequence analysis, assembling the sequence data into a single genomic contig, can be difficult, in particular because repeated sequences are hard to assemble when no reference genome is available. Additional information may then be required to resolve the remaining DNA regions.
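To illustrate why repeats complicate assembly without a reference genome, the following sketch builds a small de Bruijn graph from reads and reports the nodes with more than one successor, i.e., the points where the assembly path is ambiguous. The de Bruijn formulation is an assumption chosen for illustration (the text does not name an assembly method), and the toy sequences, read length, and function names are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in any read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor (ambiguous continuation)."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


def sliding_reads(genome, read_length):
    """Simulate error-free reads by sliding a window over the genome."""
    return [genome[i:i + read_length] for i in range(len(genome) - read_length + 1)]


# Toy genome in which the 3-mer "ATG" occurs twice with different continuations.
repeat_genome = "CCATGAGGATGTCC"
# Toy genome without any repeated 3-mer.
unique_genome = "CCATGAGGTC"

repeat_branches = branching_nodes(de_bruijn_graph(sliding_reads(repeat_genome, 6), 4))
unique_branches = branching_nodes(de_bruijn_graph(sliding_reads(unique_genome, 6), 4))

print("branching nodes with repeat:", repeat_branches)
print("branching nodes without repeat:", unique_branches)
```

In the toy genome containing the repeat, the graph alone cannot tell which continuation follows which copy of the repeat; this is the situation in which additional information, such as longer reads or a reference genome, is needed.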
The next step, the functional annotation of the pathogen genome with a focus on virulence-relevant and host-interaction genes, is often difficult because the genes of interest for PHIs are frequently species-specific, so studies of gene homology may not be helpful. The situation would be improved by databases of protein families involved in host interactions that incorporate the currently used gene names, sequence motifs, gene functions, and experimental results (see section “Web-Based Databases for PHI Systems”). On the other hand, comparative genomics can provide insights into molecular pathogenesis, host specificity, and the evolution of pathogens. Next-generation sequencing (NGS) has revolutionized the molecular investigation of pathogen diversity at the genomic and transcriptomic levels. It enables an efficient analysis of complex human microfloras, both commensal and pathological, through metagenomic methods. Genomic sequences and their annotations are provided through several portals, such as the Genomes Online Database⁷.

In contrast to the static information from the genome, the transcriptome reflects the dynamics of PHI systems, which result in temporal gene expression profiles that change on the scale of minutes and hours. Increasingly, various non-coding small RNAs are investigated in addition to the protein-coding mRNAs. For instance, in Staphylococcus aureus, a leading pathogen of animals and humans, about 250 regulatory RNAs were found (Guillet et al., 2013). Repositories for transcriptome data, such as Gene Expression Omnibus⁸ (GEO) and ArrayExpress⁹, freely distribute microarray and NGS (RNA-Seq) data as well as other forms of high-throughput functional genomics data. In GEO, data from more than 1600 organisms, both pathogens and hosts, are accessible. For instance, for the pathogens Mycobacterium tuberculosis, S.
aureus, Candida albicans, and Helicobacter pylori, transcriptome data from 1,855, 1,777, 1,627, and 1,284 samples are available, respectively. Other data sets monitor the transcriptome of the host’s response, e.g., Homo sapiens and Mus musculus (GSE56091, GSE56093). Some monitor data from host and pathogen simultaneously, e.g., S. aureus and the zebrafish Danio rerio (GSE32119).

⁷ https://gold.jgi-psf.org
⁸ http://www.ncbi.nlm.nih.gov/geo
⁹ https://www.ebi.ac.uk/arrayexpress

NGS has opened the door for simultaneous transcriptome analysis by the so-called dual RNA-Seq (Tierney et al., 2012a,b; Westermann et al., 2012; Camilios-Neto