KR Tutorial on Geometric Ontology Embeddings

By Mena Leemhuis, Postdoctoral Researcher at the Johannes Kepler University Linz, Austria

Recent advances in subsymbolic AI approaches, such as large language models, have increased awareness of the need for interpretability and trustworthiness. Neurosymbolic AI aims to tackle these issues by combining subsymbolic and symbolic approaches. Symbolic information, for example in the form of ontologies, can be used to incorporate background knowledge into the learning process. This can be achieved using ontology embeddings (see [1] for an overview). One way of combining subsymbolic and ontological information is to represent the ontology directly geometrically in the subsymbolic embedding space. This establishes a close link between subsymbolic similarity information and symbolic ontological information, enabling the use of classical ontological reasoning techniques in combination with a “grounding” of the ontological concepts used.

Given a dataset and a corresponding ontology, a neural network is trained to produce an embedding in a low-dimensional vector space in which instances are represented as points. At the same time, concepts are learned as convex geometric objects, such that membership in a concept is represented by the corresponding point lying in the region representing that concept. This makes it possible to model logical operations by geometric ones and thus adds to the symbolic information of the ontology another layer that can be regarded as a “grounding” or “interpretation” of it. The learning process benefits because ontologically sound results are enforced while regularities in the data can still be exploited. This general principle of representing concepts as convex geometric objects gives rise to a wide research area with different types of representations: some approaches work in Euclidean space (e.g., [2]), others in hyperbolic space (e.g., [3]); some are based on boxes [2], some on spheres [4], and some on cones [5].

Application areas include query embedding (see, e.g., [6]) and knowledge base embedding [7]. These embedding techniques establish a close relationship between traditional knowledge representation and reasoning strategies and subsymbolic approaches, particularly between ontologies and embeddings. They can be used in several domains, ranging from the biomedical field [8] to creativity [9]. This tutorial provides an overview of such embedding strategies and presents different variants for representing ontological information of varying expressivity. Particular focus is given to discussing how such a technique can increase the interpretability and trustworthiness of a learning approach, as well as the measures that need to be taken to achieve this in practice.
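
The correspondence between concept membership and geometric containment described above can be illustrated with a minimal sketch. It assumes concepts are balls in a 2D Euclidean space (one of several possible region shapes), and all names and coordinates (in_ball, ball_subsumed, animal, dog, rex) are invented for illustration; it is not the construction of any of the cited approaches.

```python
import math

def in_ball(point, center, radius):
    """Membership: the instance point lies inside the concept's ball."""
    return math.dist(point, center) <= radius

def ball_subsumed(center_c, radius_c, center_d, radius_d):
    """C is subsumed by D geometrically if ball C lies entirely inside ball D."""
    return math.dist(center_c, center_d) + radius_c <= radius_d

# Toy 2D embedding; every coordinate is an assumption made for this example.
animal = ((0.0, 0.0), 3.0)   # (center, radius)
dog = ((1.0, 0.0), 1.0)
rex = (1.5, 0.5)             # an instance, embedded as a point

print(in_ball(rex, *dog))            # rex is a Dog
print(ball_subsumed(*dog, *animal))  # Dog is subsumed by Animal
print(in_ball(rex, *animal))         # hence rex is also an Animal
```

Because the Dog region lies inside the Animal region, every point classified as a Dog is automatically classified as an Animal, which is the geometric counterpart of the subsumption axiom.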

Outline

The tutorial is divided into two parts. The first part focuses on knowledge base embeddings to exemplify geometric ontology embeddings. The second part broadens the scope to include several approaches and their practical applications.

Introduction (45 min)

The first half of the tutorial discusses knowledge base embedding (KBE) as a special case of geometric ontology embedding, using it both as a motivating example and to demonstrate the advantages and challenges of such embedding approaches. Knowledge graphs are highly incomplete sets of (subject, predicate, object) triples. One way of predicting missing links, and thus of tackling this incompleteness, is KBE: an embedding is learned that represents subjects and objects as points in some vector space and predicates as geometric operations. Ontological information is incorporated by modeling concepts as geometric objects, e.g., as boxes [2] or spheres [4]. Geometric regularities can then be used to infer missing links and triples in the knowledge graph. The principles of embedding ontological information into a vector space are exemplified with a KBE approach that represents concepts as boxes, as done in, e.g., [2, 10]. In this context, the problem of modeling relations receives particular attention.
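
The link-prediction idea above can be sketched as follows. This is a simplified illustration, not the method of [2] or [10]: it assumes that the predicate acts as a plain translation vector and that the ontology supplies a range concept for it, modeled as a box. All entity names, coordinates, and helper functions are hypothetical.

```python
import math

# Toy 2D embedding; every coordinate below is invented for illustration.
entities = {
    "alice": (0.0, 0.0),
    "linz": (2.0, 1.2),
    "vienna": (2.4, 1.4),
    "bob": (2.0, 0.9),
}
lives_in = (2.0, 1.0)                  # predicate as a translation vector
city_box = ((1.8, 0.95), (3.0, 2.0))   # concept City as (lower, upper) corners

def in_box(point, box):
    """Concept membership: the point lies inside the axis-aligned box."""
    lower, upper = box
    return all(l <= x <= u for l, x, u in zip(lower, point, upper))

def score(subject, relation, obj):
    """Higher is better: negative distance between translated subject and object."""
    translated = tuple(s + r for s, r in zip(subject, relation))
    return -math.dist(translated, obj)

def predict_object(subject_name, relation, range_box=None):
    """Return the best-scoring object, optionally restricted to the range concept."""
    subject = entities[subject_name]
    candidates = [
        name for name, point in entities.items()
        if name != subject_name and (range_box is None or in_box(point, range_box))
    ]
    return max(candidates, key=lambda n: score(subject, relation, entities[n]))

print(predict_object("alice", lives_in))            # purely geometric ranking
print(predict_object("alice", lives_in, city_box))  # ranking constrained by the ontology
```

In this toy setting the unconstrained ranking prefers "bob", whose point happens to lie closest to the translated subject, whereas the constrained ranking only considers points inside the City box and returns "linz".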

Towards interpretability and trustworthiness (30 min)

Next, the trustworthiness and interpretability of these KBE approaches are examined. As discussed in [7, 11], KBE itself does not guarantee these properties. We explore the relationship between embedding strategies and the ontologies they can represent. In particular, we focus on the direct relationship between ontology interpretations and embeddings, and on whether a specific embedding can be interpreted as a model of an ontology [11]. The tradeoff between expressivity and complexity is also discussed in this context. Special attention is paid to so-called faithfulness [12, 7], i.e., the ability of the embedding approach to model an unbiased world view.
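
As a rough illustration of what it means for an embedding to satisfy an ontology, the sketch below checks a small set of axioms against box regions. It is a simplification under stated assumptions: concepts are closed axis-aligned boxes given as (lower, upper) corners, only three hypothetical axiom types are supported, and the formal notions studied in [7, 11] are considerably more refined.

```python
def intersect(a, b):
    """Intersection of two boxes, or None if it is empty."""
    lower = tuple(max(x, y) for x, y in zip(a[0], b[0]))
    upper = tuple(min(x, y) for x, y in zip(a[1], b[1]))
    return None if any(l > u for l, u in zip(lower, upper)) else (lower, upper)

def contained(a, b):
    """True if box a lies inside box b; the empty region (None) lies inside any box."""
    if a is None:
        return True
    return all(bl <= al and au <= bu
               for al, au, bl, bu in zip(a[0], a[1], b[0], b[1]))

def violated_axioms(boxes, axioms):
    """Return the axioms that the given box embedding does not satisfy."""
    bad = []
    for axiom in axioms:
        kind, *names = axiom
        regions = [boxes[n] for n in names]
        if kind == "sub":            # C is subsumed by D
            ok = contained(regions[0], regions[1])
        elif kind == "conj_sub":     # (C and D) is subsumed by E
            ok = contained(intersect(regions[0], regions[1]), regions[2])
        elif kind == "disjoint":     # C and D share no instance
            ok = intersect(regions[0], regions[1]) is None
        else:
            raise ValueError(f"unknown axiom type: {kind}")
        if not ok:
            bad.append(axiom)
    return bad

# Toy 2D embedding; the concept names and coordinates are made up.
boxes = {
    "Parent": ((0, 0), (2, 2)),
    "Female": ((1, 0), (3, 2)),
    "Mother": ((1, 0), (2, 2)),
    "Male": ((4, 0), (5, 2)),
}
axioms = [
    ("sub", "Mother", "Parent"),
    ("conj_sub", "Parent", "Female", "Mother"),
    ("disjoint", "Female", "Male"),
    ("sub", "Parent", "Mother"),   # deliberately not satisfied by the boxes above
]
print(violated_axioms(boxes, axioms))
```

An empty result would indicate that the regions satisfy every listed axiom; here only the last, deliberately wrong axiom is reported.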

Break (30 min)

Overview of other approaches (30 min)

The second half of the tutorial begins with an overview of other areas in which this geometric embedding of symbolic information is used, such as query embedding [6] and multi-label learning [13]. The main aim here is to highlight the similarities in the fundamentals and the broad applicability of these geometric embedding approaches. We will also point to other ways of modeling ontological information geometrically, for instance modeling relations rather than concepts as convex objects, as done in [14].

Practical applications (30 min)

These embeddings are not only interesting from a theoretical viewpoint but are also used in practice. Special focus is given to the biomedical domain, where many ontologies are available for this purpose. Examples of approaches using these techniques in this domain can be found in [8]. In this context, we also discuss how these embeddings can serve as one step in a processing pipeline, for instance by modeling only the hierarchies while other axioms are handled by different techniques, as done in [15]. To show the variety of use cases, applications in other domains, such as creativity [9], are showcased.

Future work (15 min)

The tutorial concludes with a discussion of future work and possible extensions to these approaches, focusing particularly on those with greater expressivity. Examples include modeling temporal information [16] and incorporating uncertainty in the form of probabilistic embeddings [17]. Additionally, further application areas of these embeddings are discussed, particularly focusing on the role of geometric ontology embeddings in the context of further developments in neurosymbolic AI.
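
To give an impression of how uncertainty can enter a region-based embedding, the following sketch treats boxes inside the unit square as events and reads probabilities off their volumes, in the spirit of the box-based probabilistic embeddings of [17]. It is only a sketch: the function names and coordinates are assumptions, and learned models use trainable box parameters rather than hand-picked corners.

```python
import math

def intersect(a, b):
    """Corner-wise intersection of two boxes (may have non-positive side lengths)."""
    lower = tuple(max(x, y) for x, y in zip(a[0], b[0]))
    upper = tuple(min(x, y) for x, y in zip(a[1], b[1]))
    return (lower, upper)

def volume(box):
    """Volume of a box; sides are clamped at zero so empty boxes get volume 0."""
    lower, upper = box
    return math.prod(max(u - l, 0.0) for l, u in zip(lower, upper))

def conditional(c, d):
    """P(C | D) estimated as vol(C intersected with D) / vol(D)."""
    return volume(intersect(c, d)) / volume(d)

# Toy boxes inside the unit square; coordinates are invented for illustration.
bird = ((0.0, 0.0), (0.5, 0.5))
flies = ((0.25, 0.0), (0.75, 0.5))

print(volume(bird))              # marginal probability of Bird
print(conditional(flies, bird))  # probability of Flies given Bird
```

Strict containment of one box in another yields a conditional probability of 1, so crisp subsumption appears as a limiting case of this graded reading.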

References

[1] Jiaoyan Chen et al. “Ontology Embedding: A Survey of Methods, Applications and Resources”. In: IEEE Transactions on Knowledge and Data Engineering (2025), pp. 1–20. issn: 2326-3865. doi: 10.1109/tkde.2025.3559023.
[2] Bo Xiong et al. “Faithful Embeddings for EL++ Knowledge Bases”. In: The Semantic Web – ISWC 2022. Springer International Publishing, 2022, pp. 22–38. doi: 10.1007/978-3-031-19433-7_2.
[3] Yushi Bai et al. “Modeling heterogeneous hierarchies with relation-specific hyperbolic cones”. In: Advances in Neural Information Processing Systems 34 (2021), pp. 12316–12327.
[4] Maxat Kulmanov et al. “EL Embeddings: Geometric Construction of Models for the Description Logic EL++”. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19. International Joint Conferences on Artificial Intelligence Organization, Aug. 2019, pp. 6103–6109. doi: 10.24963/ijcai.2019/845.
[5] Zhanqiu Zhang et al. “ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs”. In: 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Ed. by M. Ranzato et al. Curran Associates, Inc., 2021, pp. 19172–19183.
[6] Hongyu Ren, Weihua Hu, and Jure Leskovec. “Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings”. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. 2020.
[7] Camille Bourgaux et al. “Knowledge Base Embeddings: Semantics and Theoretical Properties”. In: Proceedings of the Twenty-First International Conference on Principles of Knowledge Representation and Reasoning. KR-2024. International Joint Conferences on Artificial Intelligence Organization, Nov. 2024, pp. 823–833. doi: 10.24963/kr.2024/77.
[8] Filip Kronström et al. “Ontology-based box embeddings and knowledge graphs for predicting phenotypic traits in Saccharomyces cerevisiae”. In: 19th Conference on Neurosymbolic Learning and Reasoning. 2025.
[9] Mena Leemhuis and Oliver Kutz. “Introducing Pathomalgametry: Conceptual Blending with Geometric Path-finding and Amalgamation”. In: International Conference on Computational Creativity (ICCC’25), Campinas, Brazil. 2025.
[10] Hui Yang, Jiaoyan Chen, and Uli Sattler. “TransBox: EL++-closed Ontology Embedding”. In: Proceedings of the ACM on Web Conference 2025. WWW ’25. ACM, Apr. 2025, pp. 22–34. doi: 10.1145/3696410.3714672.
[11] Mena Leemhuis and Oliver Kutz. “Understanding the Expressive Capabilities of Knowledge Base Embeddings under Box Semantics”. In: 19th Conference on Neurosymbolic Learning and Reasoning. 2025.
[12] Özgür L. Özcep, Mena Leemhuis, and Diedrich Wolter. “Cone Semantics for Logics with Negation”. In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20. International Joint Conferences on Artificial Intelligence Organization, July 2020, pp. 1820–1826. doi: 10.24963/ijcai.2020/252.
[13] Dhruvesh Patel et al. “Modeling Label Space Interactions in Multi-label Classification using Box Embeddings”. In: International Conference on Learning Representations. 2022.
[14] Ralph Abboud et al. “BoxE: A box embedding model for knowledge base completion”. In: NIPS’20: Proceedings of the 34th International Conference on Neural Information Processing Systems. Dec. 2020, pp. 9649–9661.
[15] Shib Sankar Dasgupta and Dhruvesh Patel. “Representing Joint Hierarchies with Box Embeddings”. In: (2020). doi: 10.24432/C5KS37.
[16] Mena Leemhuis. “Embedding Temporal Description Logic Ontologies by Cone-based Geometric Models”. In: 10th Workshop on Formal and Cognitive Reasoning (FCR-2024) at the 47th German Conference on Artificial Intelligence (KI-2024, September 23-27), Würzburg, Germany. 2024.
[17] Luke Vilnis et al. “Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures”. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2018. doi: 10.18653/v1/p18-1025.
[18] Mena Leemhuis, Özgür L. Özcep, and Diedrich Wolter. “Knowledge Graph Embeddings with Ontologies: Reification for Representing Arbitrary Relations”. In: German Conference on Artificial Intelligence (Künstliche Intelligenz). Ed. by R. Bergmann et al. Lecture Notes in Computer Science 13404. Springer. Springer International Publishing, 2022, pp. 146–159. doi: 10.1007/978-3-031-15791-2_13.
[19] Mena Leemhuis, Özgür L. Özcep, and Diedrich Wolter. “Multi-label Learning with a Cone-Based Geometric Model”. In: Proceedings of the 25th International Conference on Conceptual Structures (ICCS 2020). Springer International Publishing, 2020, pp. 177–185. doi: 10.1007/978-3-030-57855-8_13.
[20] Mena Leemhuis, Özgür L. Özcep, and Diedrich Wolter. “Learning with cone-based geometric models and orthologics”. In: Annals of Mathematics and Artificial Intelligence 90 (Oct. 2022), pp. 1–37. doi: 10.1007/s10472-022-09806-1.
[21] Özgür L. Özcep, Mena Leemhuis, and Diedrich Wolter. “Embedding Ontologies in the description logic ALC by Axis-Aligned Cones”. In: Journal of Artificial Intelligence Research 78 (Oct. 2023), pp. 217–267. doi: 10.1613/jair.1.13939.