A comprehensive understanding of a disease requires a systems view of the whole human body, spanning the molecular and cellular levels up to the organ and system levels. Recent technological developments enable direct assays of patients' genomic, proteomic and metabolomic profiles from multiple tissues. These datasets, together with the physiological and clinical information of patients and the accumulated knowledge about diseases, provide an opportunity to develop multi-scale models of complex diseases. In this project, we propose a systematic approach to building a multi-scale disease model, along with methods and computational tools for follow-up analysis. For a chosen disease, we first build a disease-specific knowledge base by collecting medical entities and phenomena relevant to the disease, applying natural language processing (NLP) tools to the biomedical literature. The collected multi-level knowledge forms a complex interaction network among all biomedical entities of the disease. We then overlay experimental data from molecular and clinical measurements of patients on this interaction network. Once a disease model is verified and finalized, it can serve as the basis for application tools for prediction and diagnosis.
Most systems biology studies to date focus on lower organisms and model specific pathways or local systems. This project attempts to integrate computational tools and methods with abundant multi-level data to model human diseases as a whole. A systems view of human diseases will help us understand the genetic, genomic and cellular pathways and biological processes associated with disease, an understanding that is critical to developing improved strategies for diagnosis, prevention and therapeutic intervention.
Step 1. Building a disease-specific knowledge base. We use literature-mining tools to automatically extract biomedical entities and relations from the literature. To improve information access, including smarter queries, we adopt Semantic Web technologies for better data integration, storage and collaboration. Information is stored as triples, each of which describes a statement or relation composed of a subject, a predicate and an object. Together, the triples form a network that supports not only visualization but also inference of relations between entities.
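The triple representation in Step 1 can be illustrated with a minimal sketch. It is not the project's implementation: the example triples, the predicate names, and the helper functions `build_graph` and `find_path` are all hypothetical, and a plain in-memory dictionary stands in for a real Semantic Web store. The sketch shows how subject-predicate-object triples form a directed network, and how an indirect relation between two entities might be inferred by following a chain of triples.

```python
from collections import defaultdict, deque

# Hypothetical example triples (subject, predicate, object).
# They are illustrative only and are not taken from the project's knowledge base.
TRIPLES = [
    ("TNF", "activates", "NF-kB pathway"),
    ("NF-kB pathway", "regulates", "IL6"),
    ("IL6", "associated_with", "inflammation"),
    ("inflammation", "affects", "liver"),
]


def build_graph(triples):
    """Index triples by subject so the set of triples can be walked as a network."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a chain of triples linking start to goal.

    Returns the list of triples along the shortest chain, an empty list if
    start equals goal, or None if no chain exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, obj in graph.get(node, []):
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, path + [(node, predicate, obj)]))
    return None


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for triple in find_path(graph, "TNF", "inflammation") or []:
        print(" ".join(triple))
```

Here an indirect link between a molecule and a physiological phenomenon is recovered by chaining three stored statements. A production system would more likely express such queries in a dedicated triple store; the dictionary is used only to keep the sketch self-contained.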
Step 2. Developing a multi-scale model. Starting from the knowledge base built in Step 1, we examine the interconnections among all biomedical entities and model them using different modeling techniques. Multi-level experimental data are incorporated into the model by adding interconnections derived from statistics of the data.
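One way to picture the data overlay in Step 2 is sketched below. The text does not say which statistics are used, so the sketch assumes Pearson correlation with a fixed threshold purely for illustration. The measurement names, the values, the `correlates_with` predicate and the functions `pearson` and `data_driven_edges` are all hypothetical. Each entity has one measurement per patient, and an edge is added between two entities when their measurements are strongly correlated across patients.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def data_driven_edges(measurements, threshold=0.8):
    """Propose an edge for every pair of entities with |r| >= threshold.

    The threshold is an assumed, illustrative cutoff, not a value from the project.
    """
    edges = []
    for a, b in combinations(sorted(measurements), 2):
        r = pearson(measurements[a], measurements[b])
        if abs(r) >= threshold:
            edges.append((a, "correlates_with", b, round(r, 3)))
    return edges


# Hypothetical per-patient measurements (one value per patient, same patient order).
MEASUREMENTS = {
    "IL6_expression": [1.0, 2.0, 3.0, 4.0, 5.0],
    "CRP_level": [2.1, 3.9, 6.2, 8.0, 9.9],
    "albumin_level": [5.0, 4.1, 3.2, 1.9, 1.0],
    "heart_rate": [70, 85, 72, 88, 71],
}

if __name__ == "__main__":
    for edge in data_driven_edges(MEASUREMENTS):
        print(edge)
```

Edges produced this way have the same subject-predicate-object shape as the literature-derived statements, so they could be merged into the same network, with the correlation value kept as an edge weight. With real data, a significance test and multiple-testing correction would normally replace the fixed threshold.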
Improve the quality of disease-specific knowledge bases by including more literature sources and combining them with existing databases.
Further verify the disease-specific models by collecting more multi-level experimental and clinical data.
Apply the model to a disease of interest, for example inflammatory diseases, to infer important pathways, predict disease responses, and find new therapeutic targets.
This project is conducted in collaboration with, and supported by, the Glue Grant.
Jaclyn Chen1,2, Junhee Seok1, Ron Davis1, Wenzhong Xiao1
1. Genome Technology Center, 2. Department of Electrical Engineering, Stanford University