Characterizing a patient’s physical and biochemical traits, together with any responses to interventions, is essential for diagnosis and for tailoring treatment. Any observable trait, such as a behaviour, a developmental delay or a physical anomaly, is called a phenotype. Phenotypes are a manifestation of genetic variation, and the phenotype data collected forms the basis for personalized medicine.
SIRIUS researchers have been working on the BigMed personalized medicine project (2016-2020) to create tools for identifying and specifying clinical phenotypes. Dag Hovland and Laura Slaughter worked on an ontology-driven, form-based tool that paediatricians can use to select phenotypes observed in newborns with a suspected rare genetic disorder. The tool uses the Human Phenotype Ontology (HPO) and provides some guidance on selecting the phenotypes to be used in the diagnostic process.
SIRIUS researchers met with BigMed clinicians and lab technicians to outline the functionality of the phenotype tool, which will be completed in 2019. Planned functionality includes using the ontology-driven system to track phenotype iterations throughout the diagnostic process, adding triggers that indicate a need for reanalysis, and adding input methods other than the form check-boxes, including free text.
Ernesto Jimenez-Ruiz has contributed to the work on clinical phenotypes by finding alignments between disease and phenotype ontologies. As part of the Pistoia Alliance Ontologies Mapping Project, which aims to find or create better tools for mapping ontologies, he has examined pairwise alignments involving the Human Phenotype Ontology (HPO), the Mammalian Phenotype Ontology (MP), the Human Disease Ontology (DOID) and the Orphanet Rare Disease Ontology (ORDO). This work can be applied to BigMed: in newborn rare disease diagnosis, the workflow involves going from observed phenotypes to possible diagnoses. Because the source ontologies are dynamic and change often, automating the mappings is essential for maintaining the tools and services developed as part of the project.
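To make the idea of an ontology alignment concrete, the following minimal sketch pairs terms from two ontologies whose labels are identical after normalization. It is an illustration only and not the method used in the project: real alignment systems combine lexical, structural and logical evidence. The identifiers and labels below are placeholders, not real HPO or MP entries, and the function names are hypothetical.

```python
import re


def normalize(label: str) -> str:
    """Lowercase a label and drop punctuation so near-identical labels compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", label.lower()))


def align_by_label(source: dict, target: dict) -> list:
    """Return sorted (source_id, target_id) pairs whose normalized labels match exactly."""
    # Index the target ontology by normalized label for constant-time lookup.
    index = {}
    for term_id, label in target.items():
        index.setdefault(normalize(label), []).append(term_id)

    pairs = []
    for term_id, label in source.items():
        for match in index.get(normalize(label), []):
            pairs.append((term_id, match))
    return sorted(pairs)


# Placeholder identifiers and labels (assumptions for illustration only).
phenotype_terms = {"EX-HP:1": "Short stature", "EX-HP:2": "Hearing impairment"}
mammalian_terms = {"EX-MP:1": "short-stature", "EX-MP:2": "Abnormal gait"}

mappings = align_by_label(phenotype_terms, mammalian_terms)
print(mappings)
```

Because the mapping is computed rather than curated by hand, it can be regenerated whenever a new release of either ontology appears, which is the maintenance benefit described above.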
Laura Slaughter attended a workshop on automatic identification of clinical phenotypes from EHR texts in Tromsø in June 2018. Activities include scoping work using an open source dataset of electronic health records (the MIMIC-III dataset) to explore automatic extraction of phenotypes, including from free text, as specified by the clinicians in the BigMed project.
In 2018, Violet Ka I Pun started investigating the workflow of a prenatal visit in a hospital in Bergen. An initial executable model of the workflow, as well as a preliminary visualisation of the model, has been developed. Pun has also initiated collaboration with the Bergen Regional Health Authority, which agreed to provide a use case for the project. In addition, she is establishing a collaboration on scheduling and planning the use of medical equipment for surgical theatres in Bergen. This collaboration also involves the University of Lübeck in Germany, which can provide knowledge of the interconnection of medical devices.
The BigMed personalized medicine lighthouse project started in 2017. SIRIUS researchers are building an IT framework that supports clinical decision-making in rare monogenic diseases and heart disease. The tasks completed so far towards the development of these IT tools are summarized in the following paragraphs.
In 2017 we worked to understand the needs of doctors and lab technicians. This was done through structured meetings to find out what information is available for each clinical or laboratory problem. These meetings provided feedback and insight into information needs and decision-making processes, which we need in order to develop IT support for clinicians.
One of the main goals in BigMed is to improve the capture of clinical phenotypes and improve communication between healthcare personnel and laboratory technicians. A phenotype is a description of a patient’s observable characteristics or traits. We examined the use of PhenoTips, a tool that uses the Human Phenotype Ontology (HPO) to collect and manage phenotype data. We then built up knowledge in the use of specific medical ontologies: HPO for phenotyping, OMIM and Gene Ontology for genetics, and the Disease Ontology for diseases. We examined how these could be aligned, and Ernesto Jimenez-Ruiz compared different algorithms for aligning them. We also started work on a simple ontology for the information, including phenotype, that is used in the genetic requisition cycle workflow.
The DIPS Electronic Health Records (EHR) system, used at Oslo University Hospital, makes use of openEHR archetypes. A successful personalized health system will require archetypes and ontologies to work together. Laura Slaughter reviewed the large amount of existing work in this area as a foundation for a possible IT solution in BigMed.
Work was also done on tools and infrastructure. RDF Surveyor, described in more detail below, is a generic tool for browsing triple stores. It was demonstrated at the BigMed plenary meeting in June 2017. A specific tool has also been developed for entering HPO terms. This is simpler and easier to use than the PhenoTips system. A public server has also been set up to allow testing of semantic web and linked data applications.
The BigMed project issued a position paper entitled “Big data management for the precise treatment of three patient groups” in January 2018. This summarizes the findings of the first year of the project and presents a summary of the obstacles to personalized medicine in Norway. Barriers are financial, legal, technological and organizational. Success will depend on (a) hospitals being rewarded for innovation, (b) patient consent and privacy being managed in a way that allows access to big data analysis, (c) hospitals and health care organisations sharing data and (d) the implementation of a flexible, open and modern ICT infrastructure.
SIRIUS’ research programs provide skills and tools that support needs in clinical care and precision medicine. Here we review areas where we can contribute, program by program.
The Ontology Engineering program offers ways for clinicians and clinical researchers to find and use data more effectively. Semantic Web technologies are designed to allow open access to databases and datasets. To be accessible, this data must be made available as linked open data. Fortunately, there is much useful linked open data available in healthcare, such as data on drugs, chemical structures related to drugs and genetic disease datasets (OMIM). A catalogue of these datasets has been compiled by the Bio2RDF project. Linked data is represented in the Resource Description Framework (RDF), which is a standard defined by the W3C.
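As a rough illustration of what RDF data looks like, the sketch below models a few statements as subject-predicate-object triples, serializes them in the N-Triples syntax and filters them by pattern. It uses only the Python standard library. The `http://example.org/` resources, the `Triple` type and the helper names are assumptions made for this example and are not taken from Bio2RDF or any SIRIUS tool; only the `rdf:type` and `rdfs:label` IRIs are standard.

```python
from typing import NamedTuple, Optional

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # placeholder namespace, not a real dataset


class Triple(NamedTuple):
    subject: str      # IRI of the resource being described
    predicate: str    # IRI of the property
    obj: str          # either an IRI or a plain literal value
    obj_is_iri: bool  # True when obj is an IRI, False for a literal


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_iri:
        obj = f"<{t.obj}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


def match(triples, s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> list:
    """Return the triples that agree with every position that is not None."""
    return [t for t in triples
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.obj == o)]


triples = [
    Triple(EX + "drug/1", RDF_TYPE, EX + "Drug", True),
    Triple(EX + "drug/1", RDFS_LABEL, "Example drug", False),
    Triple(EX + "drug/1", EX + "treats", EX + "disease/7", True),
]

for t in triples:
    print(to_ntriples(t))

# Everything declared to be an instance of the (hypothetical) Drug class.
drugs = match(triples, p=RDF_TYPE, o=EX + "Drug")
```

Because every resource and property is named by an IRI, triples published by different providers can be combined without a prior agreement on a shared database schema.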
SIRIUS researchers have developed a visual interface to linked data to make it easy for non-technical users to search and browse datasets. This tool allows a medical user to get an overview of the data, analyse the contents of the database and navigate between classes in the dataset. Such a tool should be generic and not require any installation on an end-user’s computer. It should also work even with large datasets, containing billions of data values. To the best of our knowledge, no existing tool meets all these requirements. For example, LodView is a semantic browser that presents RDF data as tables and allows clicking on links. However, it does not give any overview of the dataset, and navigation between classes is not supported.
SIRIUS researchers have created several visual query tools, like PepeSearch, OptiqueVQS and SemFacet, to support faceted search over RDF datasets. These tools can be used to explore this data in order to get a clear picture of its size and content. PepeSearch and OptiqueVQS present a flat list of the classes in the repository as a starting point, while SemFacet asks for a set of keywords as input in order to build a faceted interface to the data.
Previously developed query tools required installation of software on the user’s computer and were not able to cope with large datasets, such as DBpedia. To address these problems, as part of the work related to the needs of healthcare practitioners, we designed a web-based tool called RDF Surveyor. This takes a complex, large RDF data source and generates a web-based, navigable overview of each entry in the data source, with all its attributes and links. A demo of this tool is found at http://tools.sirius-labs.no/rdfsurveyor/.
Practitioners then need a tool for linking these general tools to real systems with patient data. This can be done effectively using Ontology-based Data Access (OBDA), as implemented in the Optique EU project. Using OBDA means that it is easy to implement reasoning using the data at the time of data access. It is therefore easier for an average non-technical user to make precise queries using her own terminology, without waiting for a database technician to formulate a complex SQL query.
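The principle behind OBDA can be sketched in a few lines: the user asks for a class from the ontology, the class is expanded to its subclasses (a very simple form of reasoning at query time), and each subclass is translated to SQL through a mapping. The example below is a toy under stated assumptions and does not reproduce the Optique implementation. The table `obs`, its codes, the class names and the functions `expand`, `rewrite` and `patients_with` are all hypothetical.

```python
import sqlite3

# Hypothetical legacy table with coded observations per patient.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (patient_id TEXT, code TEXT)")
conn.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [("p1", "SZ_FEB"), ("p2", "SZ_GEN"), ("p3", "HEAR")],
)

# Toy ontology: each class lists its direct subclasses.
SUBCLASSES = {"Seizure": ["FebrileSeizure", "GeneralizedSeizure"]}

# Mappings from ontology classes to SQL over the legacy schema.
MAPPINGS = {
    "FebrileSeizure": "SELECT patient_id FROM obs WHERE code = 'SZ_FEB'",
    "GeneralizedSeizure": "SELECT patient_id FROM obs WHERE code = 'SZ_GEN'",
    "HearingImpairment": "SELECT patient_id FROM obs WHERE code = 'HEAR'",
}


def expand(cls: str) -> list:
    """Return the class itself followed by all of its descendants."""
    result = [cls]
    for child in SUBCLASSES.get(cls, []):
        result.extend(expand(child))
    return result


def rewrite(cls: str) -> str:
    """Rewrite a class-level query into a UNION of the mapped SQL queries."""
    parts = [MAPPINGS[c] for c in expand(cls) if c in MAPPINGS]
    if not parts:
        raise KeyError(f"no mapping covers class {cls!r}")
    return " UNION ".join(parts)


def patients_with(cls: str) -> list:
    """Run the rewritten query and return the matching patient ids, sorted."""
    return sorted(row[0] for row in conn.execute(rewrite(cls)))


print(patients_with("Seizure"))
```

The user only names `Seizure`; the union over both seizure codes is produced by the rewriting step, so no one has to write the SQL by hand or know how the codes are stored.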
SIRIUS’ skills and tools are useful in healthcare where data is spread over different legacy systems and various work processes need access to this data. We can demonstrate how complex queries across these datasets can support work processes in healthcare. We can also provide workflow-relevant user interfaces for non-technical healthcare personnel. In addition, researchers from the Industrial Digital Transformation program, social scientists from NTNU, are available to help with the design and assessment of how workflows and processes can be made more efficient through the use of new information technologies.
The Data Science program in SIRIUS has proven skills in processing domain-specific language and Norwegian language documents. The goal of the strand is to improve natural language processing for specific languages and technical sub-languages, like the ones found in medical records. Healthcare documentation of patient care contains narrative texts with many abbreviations, Latin terms and jargon, and conventional grammar is often not observed. Much work has already been done in English, but existing tools need to be adapted to Norwegian. Further research in this area focuses on machine understanding of the texts and integration with structured data. This allows clinicians to extract information from texts and then interact with the structured data and knowledge available. The natural language group in Oslo is one of the leading research centres for Norwegian language processing.
The Data Science, Semantic Integration and Scalable Computing programs can provide precision medicine projects with skills and infrastructure for:
Finally, the Analysis of Complex Systems program works with the simulation and analysis of complex plans and systems. We can determine the optimal deployment of computers in a cloud implementation of a health system so that applications function as expected when put online. We can also determine optimal deployment of health workers and equipment in a hospital. This is accomplished by modelling and analysing common hospital operations, ensuring that operating theatres and staff are available, and at the same time, allowing for replanning in the event of unforeseen changes. SIRIUS can contribute to projects run by healthcare management organizations, infrastructure providers and IT providers who want to understand and improve their operational performance. On the local institutional level, healthcare facilities can receive help on how to execute delivery plans.