Subsurface Data Access and Analytics

SIRIUS is building on the Optique platform for ontology-based data access to demonstrate how repositories like DISKOS can be developed into digital platforms for exploration, research and innovation. Once this data is opened up, it needs to be analysed. For this reason, we are also working with image analysis, data science and natural language applications in subsurface data management.

Subsurface Data Access

SIRIUS is working on a vision of providing a platform for innovation in the subsurface. A recent book by Andrew McAfee and Erik Brynjolfsson, Machine, Platform, Crowd, traced the role of platforms as enablers for innovation by crowds of workers. We believe that there is a need to open up subsurface data so that researchers and innovators can try out their ideas on real data. We also believe that national data repositories, like DISKOS, have the potential to provide such a platform. However, for this to happen, we need to improve access to the data and allow it to be linked with data in other databases. We also need to improve access to unstructured text information in these databases. SIRIUS has several active projects that address these subsurface data access challenges.

SIRIUS OBDA Pilot

Digital transformation in exploration is about overcoming the data access bottleneck and increasing the quality of interpretations through better use of data. The bottleneck is substantial: up to 70% of exploration experts’ time is spent finding, accessing, integrating, and cleaning data before analysis can even start.

One possible approach to this challenge is to extend the theory and tools of ontology-based data access (OBDA) to address the data access challenges of subsurface data. OBDA was extended in the Optique project to meet the needs of the oil & gas industry, but the solution has not been adopted because of its technological limitations.

The SIRIUS OBDA subsurface pilot project is addressing these shortcomings and aims to significantly broaden the applicability of the approach for use in subsurface projects. Click here to read more about this project.
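As a rough illustration of the general OBDA idea (not the Optique or SIRIUS implementation), the sketch below lets a user ask for data using domain terms while a mapping layer translates those terms into SQL over the underlying tables. The table, column and term names (wb_tbl, Wellbore, totalDepthMetres and so on) are hypothetical and chosen only for illustration. Real OBDA systems typically use formal ontologies, declarative mappings and query rewriting that go well beyond this.

```python
import sqlite3

# Hypothetical source database with cryptic, legacy-style table and column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wb_tbl (wb_id TEXT, fld_cd TEXT, td_m REAL);
    INSERT INTO wb_tbl VALUES ('W-1', 'FIELD_A', 3120.0);
    INSERT INTO wb_tbl VALUES ('W-2', 'FIELD_A', 2875.5);
    INSERT INTO wb_tbl VALUES ('W-3', 'FIELD_B', 4010.0);
""")

# Hypothetical mappings: each domain term is defined by a SQL query over the source.
MAPPINGS = {
    "Wellbore": "SELECT wb_id AS subject FROM wb_tbl",
    "inField": "SELECT wb_id AS subject, fld_cd AS value FROM wb_tbl",
    "totalDepthMetres": "SELECT wb_id AS subject, td_m AS value FROM wb_tbl",
}


def property_values(conn, prop, class_name):
    """Return (subject, value) pairs for instances of class_name that have prop.

    The query is phrased in domain terms and unfolded into SQL via MAPPINGS.
    """
    sql = (
        f"SELECT c.subject, p.value FROM ({MAPPINGS[class_name]}) AS c "
        f"JOIN ({MAPPINGS[prop]}) AS p ON p.subject = c.subject "
        "ORDER BY c.subject"
    )
    return conn.execute(sql).fetchall()


# The user asks for the total depth of every wellbore without knowing the schema.
depths = property_values(conn, "totalDepthMetres", "Wellbore")
```

The point of the design is that only the mappings need to know the source schema. Queries written against the domain vocabulary stay valid if the underlying tables change, as long as the mappings are updated.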

GeoDataPrep

Oil and gas companies are transitioning towards more data-driven decision-making within the subsurface domain. By visualizing large volumes of complex data through dashboards and other business analytics techniques, decision-makers are able to make decisions faster and with greater confidence. However, such data-driven decision-making is moot if the time spent preparing subsurface data for analysis and visualization far exceeds the time saved by decision-makers.

The GeoDataPrep project targets the data preparation workflows necessary for dashboarding and business analytics in the subsurface domain. Click here [Coming Soon] to read more about this project.

SIRIUS Subsurface Lab

Equinor has made a complete set of data from a North Sea oil field (the Volve field) available for research, study and development. This dataset consists of a variety of structured and unstructured subsurface data, comprising approximately 40,000 files from the Volve field, which was in production from 2008 to 2016. The data has been released to give students and scientists a realistic case to study and to support learning, innovation and new solutions for the energy future.

The volume of information available in this dataset is huge (approximately 5TB, 40,000 files), in proprietary and non-proprietary formats and with limited or missing metadata. This makes it challenging to use the dataset for experimentation. A substantial amount of time is required to make it usable: finding the required information among 40,000 files, reading and transforming formats, and compensating for missing metadata. The SIRIUS Subsurface Lab project is focused on pre-processing the Volve dataset and creating a sandbox environment for experimentation. Click here to read more about this project.
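A first step in making a large, poorly documented file collection usable is often a simple inventory of what it contains. The sketch below is an illustration only, not the Subsurface Lab's actual tooling: it walks a directory tree and counts files and bytes per file extension. The function name and the "(none)" label for files without an extension are assumptions made for this example.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root):
    """Count files and total bytes per (lower-cased) file extension under root."""
    counts, sizes = Counter(), Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under a placeholder label.
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return counts, sizes


# Example usage with a hypothetical local copy of the dataset:
#   counts, sizes = inventory_by_extension("path/to/dataset")
#   for ext, n in counts.most_common(10):
#       print(ext, n, sizes[ext])
```

An inventory like this shows which formats dominate the collection and so where format conversion and metadata recovery effort is likely to be needed first.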

Subsurface Data Analytics

Faster access to relevant data is of interest only if the data can be used to create insight and drive decisions. The Exploration scoping workshop identified several challenges related to information extraction and the use of structured and unstructured data. Some of the key areas we are working on are described below.

Structured Data 

Geoscientists use reservoir analogues to estimate missing or uncertain reservoir parameters. Finding and selecting appropriate analogues, and drawing inferences from them, depend on the team's experience and are limited by human capacity.

To address this problem, a PhD project was started in 2017 and completed in 2021. In this project, techniques were developed to identify and quantify formal domain knowledge, and thus predict more accurate parameters for exploration modelling. The main objective was to extend a machine learning model so that it can incorporate oil & gas domain information and recommend analogues to a reliable extent. Click here to read more about this project.

Unstructured Data – Images 

Finding a geological image based on its technical content in a large image database is difficult. Geoscientists use keyword search on the textual content of source documents to find relevant images.

To address this problem, an innovation project was initiated in 2019, and a prototype has been developed. This tool supports executing complex queries to find geological images based on the geological content embedded in the images, and it significantly reduces the time and effort required to find the most relevant images and corresponding documents. Click here to read more about this project.

Unstructured Data – Documents

Domain experts spend a massive amount of time annotating corpora to train supervised statistical learning models for unstructured data. 

To address this problem, a PhD project on low-resource adaptation of neural NLP models aims to reduce the amount of annotated data needed to train such models. Click here to read more about this project.

Projects in the Subsurface data access and analytics beacon

(Click on the project name to read more about the project.)

SIRIUS OBDA Pilot
SIRIUS Subsurface Lab
Hierarchy-based Similarity Measures and Embeddings [PhD Project]
Low-Resource Adaptation of Neural NLP Models [PhD Project]
GeoDataPrep
SIRIUS Geo-Annotator

SIRIUS Researchers

Adnan Latif, Martin Giese, Dag Hovland, Irina Pene, Michael Heeremans, Farhad Nooralahzadeh, Summaya Mumtaz, Laura Slaughter, Martin Georg Skjæveland, Yuanwei Qu, Thomas Østerlie, Oliver Stahl, Ernesto Jimenez-Ruiz

External Collaboration

Birkbeck – University of London, Federal University of Rio Grande do Sul