We have explored a range of different restrictions to DatalogZ, including linearity conditions, stratification, semi-positive programs and stability conditions.
We have shown that reasoning in limit DatalogZ is decidable if a linearity condition restricting the use of multiplication is satisfied. In particular, limit-linear DatalogZ is complete for ΔEXP2 and captures ΔP2 over ordered datasets in the sense of descriptive complexity. We have also provided a comprehensive study of several fragments of limit-linear DatalogZ. We have shown that semi-positive limit-linear programs (i.e., programs where negation is allowed only in front of extensional atoms) capture coNP over ordered datasets; furthermore, reasoning becomes coNEXP-complete in combined and coNP-complete in data complexity, where the lower bounds hold already for negation-free programs. In order to satisfy the requirements of data-intensive applications, we also proposed an additional stability requirement, which causes the complexity of reasoning to drop to EXP in combined and to P in data complexity, thus obtaining the same bounds as for usual Datalog. Finally, we have compared our formalisms with the languages underpinning existing Datalog-based approaches for data analysis and have shown that core fragments of these languages can be encoded as limit programs; this allows us to transfer decidability and complexity upper bounds from limit programs to other formalisms. Therefore, we have provided a unified logical framework for declarative data analysis which can be used as a basis for understanding the impact on expressive power and computational complexity of the key constructs available in existing languages.
Mark Kaminski, Egor Kostylev, Boris Motik, Ian Horrocks
This work was partially supported by the SIRIUS Centre for Scalable Data Access (Research Council of Norway, project 237889).