SIRIUS OBDA Subsurface Pilot
The SIRIUS OBDA subsurface pilot project is addressing the shortcomings of Ontology-based data access and aims to significantly broaden the applicability of the approach, with a focus on subsurface applications.
Subsurface digital transformation is about overcoming the data access bottleneck and improving the quality of interpretations through better use of data. The bottleneck is substantial: up to 70% of subsurface experts’ time is spent finding, accessing, integrating, and cleaning data before analysis can even start (Jim Crompton, Putting the FOCUS on Data, W3C Workshop on Semantic Web in Oil & Gas Industry).
From the geoscientist's perspective, it is hard to get an overview of all available data related to an area of interest, because this data is spread over different applications and many internal and external data sources. As a rule, no unified view is available up front, though Project Data Managers (PDMs) assist. Extracting data from databases is difficult; when complex queries have to be written, Central Data Managers (CDMs) typically assist. Extracting data and information based on geological and petrophysical attributes (see the user scenario example below) is challenging because these types of queries cannot be executed simultaneously on multiple data sources. Integrating datasets before analysis can start is also challenging: it is often tedious manual work that the geoscientists must do themselves. Extracting data and knowledge from text documents is very difficult, as few tools can deal with the contents of unstructured documents and reports. Geoscientists are well aware of the limitations of this workflow. As a result, valuable analyses are too often not performed, and opportunities in the data too often go undetected.
There is an urgent need to build competence and tools for exploration data wrangling. An exploration data wrangler has competence in both geoscience and digital technologies. This competence is crucial for integrating the workflows of geoscientists, PDMs and CDMs, and it plays an important role in enabling digital transformation in exploration work practices. Along with supporting exploration teams with routine data access tasks, data wranglers will efficiently exploit opportunities brought by new IT technologies. This includes efficient handling of critical tasks such as identifying relevant data sources, developing complex ad-hoc queries over federated databases, and retrieving information from reports stored as text documents. In these ways, the data wrangler can bring data much closer to the project teams and make it far easier for geoscientists to extract data and information with the exact specification (in terms of complex geological and petrophysical attributes) they need for their subsurface evaluation.
For the data wrangler to be less dependent on the CDMs and PDMs than the geoscientists are today, we need to capture the special knowledge of the CDMs and PDMs and build this into data wrangling tools. A successful attempt in this direction was Optique, a 14M Euro EU project that finished in 2016. Optique showed that geoscience knowledge could be reliably captured in a knowledge graph (or an ontology) and that reusable mappings from CDMs could efficiently connect this knowledge graph to data in databases. Optique then demonstrated that complex queries over several federated data sources (including EPDS, NPD FactPages, Open Works installations, GeoChemDB, CoreDB and DDR) could be easily written and efficiently executed. Since the process was fully automated, tasks that normally would take several days could, with the Optique platform, be performed in minutes.
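The idea of connecting ontology terms to databases through mappings can be illustrated with a minimal Python sketch. This is not Optique's implementation: the table names, the ontology terms (inField, hasCorePorosity), the two in-memory sources, and all data values are hypothetical. Real OBDA systems typically rewrite a query over the ontology into SQL that runs at the sources; for brevity, this sketch evaluates each mapping and joins the results in memory.

```python
import sqlite3


def make_sources():
    """Create two independent (hypothetical) databases standing in for federated sources."""
    wells = sqlite3.connect(":memory:")
    wells.execute("CREATE TABLE wellbore (id TEXT, field TEXT)")
    wells.executemany(
        "INSERT INTO wellbore VALUES (?, ?)",
        [("W-1", "FieldA"), ("W-2", "FieldB")],
    )

    cores = sqlite3.connect(":memory:")
    cores.execute("CREATE TABLE core_sample (well_id TEXT, porosity REAL)")
    cores.executemany(
        "INSERT INTO core_sample VALUES (?, ?)",
        [("W-1", 0.21), ("W-1", 0.08), ("W-2", 0.25)],
    )
    return {"wells": wells, "cores": cores}


# Mappings: ontology term -> (source name, SQL returning (subject, value) rows).
# In practice a data manager would write these once so they can be reused.
MAPPINGS = {
    "inField": ("wells", "SELECT id, field FROM wellbore"),
    "hasCorePorosity": ("cores", "SELECT well_id, porosity FROM core_sample"),
}


def facts(sources, term):
    """Return all (subject, value) pairs for an ontology term via its mapping."""
    source_name, sql = MAPPINGS[term]
    return sources[source_name].execute(sql).fetchall()


def wellbores_with_porosity_above(sources, field, threshold):
    """Ontology-level question: which wellbores in `field` have a core
    sample with porosity above `threshold`? The caller never sees table names."""
    in_field = {well for well, f in facts(sources, "inField") if f == field}
    hits = {
        well
        for well, porosity in facts(sources, "hasCorePorosity")
        if well in in_field and porosity > threshold
    }
    return sorted(hits)
```

With the sample data above, wellbores_with_porosity_above(make_sources(), "FieldA", 0.2) returns ['W-1']. The point of the design is that the question is phrased in domain vocabulary, while the knowledge of where each fact is stored lives only in the mappings.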
Optique showed the potential to transform the way data is gathered and analyzed by streamlining the workflow and making it more user-friendly. However, Optique has also revealed shortcomings that impede the realization of its full potential: (i) limitation to relational databases, (ii) lack of built-in support for quantitative analytics, (iii) lack of access to unstructured data, and (iv) limited tool support for constructing and maintaining the necessary ontology and mappings.
The SIRIUS OBDA subsurface pilot project is addressing these shortcomings and aims to significantly broaden the applicability of the approach for use in subsurface projects.
Demo: http://158.39.75.9:8443 (only accessible from Norwegian university IP addresses)
Using OBDA to formulate and execute a complex query on the Volve dataset.
Adnan Latif [Contact Person], Martin Georg Skjævland, Dag Hovdland
Roman Kontchakov
This work was supported by the SIRIUS Centre for Scalable Data Access (Research Council of Norway, project 237889).
G&G dataset from Volve Field, Publicly released by Equinor under CC BY 4.0. Equinor and the Volve license partners | https://data.equinor.com