Page 56 - Sirius_Annual_Report_2021
Defended PhDs
Summaya Mumtaz
Main research findings: The real-world application of artificial intelligence and machine learning techniques is challenging. Most standard machine learning approaches depend heavily on large amounts of historical data. However, in complex real-world use cases, the
data vary across several dimensions, which makes
it challenging to find a sufficient amount of quality data. In low-resource domains in particular, not enough training data is available, which degrades the machine learning model’s performance. In many disciplines, a significant amount of prior knowledge about the domain is available, often in the form of a taxonomy or a hierarchy. For instance, a disease hierarchy in the medical domain classifies diseases into groups based on similar symptoms. We experimented with adding domain knowledge in three domains: recommending hydrocarbon reserves in the oil and gas industry, grouping similar words in natural language, and predicting patient mortality in the health care domain. Our research has shown that adding domain knowledge (a taxonomy) in these scenarios, where little training data is available, can improve the performance of the prediction task.
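
As a rough illustration of how a taxonomy can carry domain knowledge into a model, the sketch below scores how closely two categories are related in a small disease hierarchy. It is not the method used in the thesis: the toy taxonomy, the names PARENT, path_to_root and wu_palmer, and the choice of the Wu-Palmer similarity measure are all assumptions made for this example.

```python
# Hypothetical toy taxonomy (child -> parent); invented for illustration only.
PARENT = {
    "disease": None,
    "respiratory": "disease",
    "cardiac": "disease",
    "asthma": "respiratory",
    "pneumonia": "respiratory",
    "arrhythmia": "cardiac",
}


def path_to_root(node):
    """Return the list [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lca) / (depth(a) + depth(b)).

    Depth is counted from the root, with the root at depth 1.
    """
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # The first node on a's path that is also an ancestor of b
    # is the lowest common ancestor.
    lca = next(n for n in path_a if n in ancestors_b)
    return 2 * len(path_to_root(lca)) / (len(path_a) + len(path_b))


if __name__ == "__main__":
    print(wu_palmer("asthma", "pneumonia"))   # share the parent "respiratory"
    print(wu_palmer("asthma", "arrhythmia"))  # share only the root "disease"
```

A score of this kind could, for example, be used to share information between related categories when few training examples exist for each one. Whether that helps in a given task would have to be checked empirically.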