Page 45 - Sirius_Annual_Report_2021
are well acquainted with and have developed robust and reliable methods to handle. However, these methods and practices do not scale to the high volumes of data used for data analytics. The initial stages of this research project showed that dashboarding projects could spend several man-years just harmonizing existing naming variability within large sets of well logs to the extent that the algorithms used can run and give useful results.
See more details on this project on page 43.
Project 3: SIRIUS Subsurface Lab
Equinor has released a large dataset from the Volve field, consisting of a variety of structured and unstructured subsurface data. This dataset can be used to establish a subsurface laboratory for prototyping various projects and running experiments that require multiphysics data.
In 2021, we cleaned this dataset and transformed it into a relational database. This database is now being used in several SIRIUS and DigiWell projects (as a G&G database). In 2022, we aim to deploy an API to access this database and create a sandbox environment where researchers can connect their prototypes directly to this database and run experiments. Even after SIRIUS ends, this work will provide a long-term asset for data science, computer science, and geoscience researchers.
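As a rough illustration of how a prototype might query such a relational database, the following minimal sketch uses Python's built-in sqlite3 module with an in-memory database. The schema (`well`, `log_curve`), the well names, and the helper `wells_with_curve` are hypothetical and chosen only for this example; they do not describe the actual database or the planned API.

```python
import sqlite3

# Hypothetical schema: table names, column names, and rows are illustrative
# only and do not reflect the layout of the actual SIRIUS database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE well (
        well_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE log_curve (
        curve_id INTEGER PRIMARY KEY,
        well_id  INTEGER NOT NULL REFERENCES well(well_id),
        mnemonic TEXT NOT NULL,
        unit     TEXT
    );
""")
conn.executemany("INSERT INTO well VALUES (?, ?)", [(1, "WELL-A"), (2, "WELL-B")])
conn.executemany(
    "INSERT INTO log_curve (well_id, mnemonic, unit) VALUES (?, ?, ?)",
    [(1, "GR", "gAPI"), (1, "RHOB", "g/cm3"), (2, "GR", "gAPI")],
)


def wells_with_curve(connection, mnemonic):
    """Return the names of wells that have a log curve with the given mnemonic."""
    rows = connection.execute(
        "SELECT DISTINCT w.name FROM well AS w "
        "JOIN log_curve AS c ON c.well_id = w.well_id "
        "WHERE c.mnemonic = ? ORDER BY w.name",
        (mnemonic,),
    ).fetchall()
    return [name for (name,) in rows]


gr_wells = wells_with_curve(conn, "GR")
rhob_wells = wells_with_curve(conn, "RHOB")
print(gr_wells, rhob_wells)
```

A parameterized query is used here so that a prototype can pass user-supplied curve names without building SQL strings by hand.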
Subsurface Data Analytics
Faster access to relevant data is of interest only if the data can be used to create insight and drive decisions. The Exploration scoping workshop identified several challenges related to information extraction and the use of structured and unstructured data. Some of them are given below.
1. Unstructured Data – Documents: Domain experts spend a massive amount of time annotating corpora to train supervised statistical learning models for unstructured data.
2. Unstructured Data – Images: Finding a geological image in a large image database based on its technical content is difficult. Geoscientists rely on keyword search over the textual content of source documents to find relevant images.
3. Structured Data: Geoscientists use reservoir analogues to estimate missing or uncertain reservoir parameters. Finding and selecting appropriate analogues and drawing inferences from them depend on the team's experience and limited human capacity.
To address problem number 1, a PhD project was recently completed on domain-adapted knowledge extraction from Oil and Gas documents. The methodology developed in this project uses domain adaptation techniques and supports a significant reduction in the time and effort required to create training sets.
For problem number 2, an innovation project was initiated in 2019, and a prototype has been developed. This tool supports executing complex queries to find geological images based on the geological content embedded in the images, and it significantly reduces the time and effort required to find the most relevant images and corresponding documents.
For problem number 3, a PhD project was started in 2017 and completed in 2021. In this project, techniques were developed to identify and quantify formal domain knowledge, thus predicting more accurate parameters for exploration modelling. The main objective is to extend a machine learning model so that it can incorporate Oil & Gas domain information and recommend analogues to a reliable extent.
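To make the idea of analogue recommendation concrete, the sketch below shows a plain nearest-neighbour baseline: it ranks candidate reservoirs by distance to a target on the known parameters and averages the missing parameter over the closest ones. This is not the method developed in the PhD project, which incorporates domain knowledge; the reservoir names, parameters, values, and function names are all invented for illustration.

```python
import math

# Hypothetical reservoir records: names, parameters, and values are invented.
RESERVOIRS = [
    {"name": "A", "depth_m": 2500.0, "net_to_gross": 0.80, "porosity": 0.22},
    {"name": "B", "depth_m": 2600.0, "net_to_gross": 0.75, "porosity": 0.20},
    {"name": "C", "depth_m": 4000.0, "net_to_gross": 0.40, "porosity": 0.10},
]


def recommend_analogues(target, candidates, features, k=2):
    """Return the k candidates closest to the target on the given features."""
    # Scale each feature by its range across the candidates so that
    # parameters with large units (e.g. depth in metres) do not dominate.
    spans = {}
    for f in features:
        values = [c[f] for c in candidates]
        spans[f] = (max(values) - min(values)) or 1.0

    def distance(candidate):
        return math.sqrt(
            sum(((candidate[f] - target[f]) / spans[f]) ** 2 for f in features)
        )

    return sorted(candidates, key=distance)[:k]


def estimate_missing(target, candidates, features, missing, k=2):
    """Estimate a missing parameter as the mean over the k nearest analogues."""
    analogues = recommend_analogues(target, candidates, features, k)
    return sum(a[missing] for a in analogues) / len(analogues)


known = ["depth_m", "net_to_gross"]
target = {"depth_m": 2550.0, "net_to_gross": 0.78}  # porosity is unknown
analogues = recommend_analogues(target, RESERVOIRS, known)
porosity_estimate = estimate_missing(target, RESERVOIRS, known, "porosity")
print([a["name"] for a in analogues], round(porosity_estimate, 3))
```

A baseline like this treats all parameters as equally important; a model that incorporates domain information could instead weight or constrain the comparison according to geological knowledge.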