Large Language Models’ Ontological Capabilities Assessed by New Benchmark Dataset

Research introduces OntoURL, a benchmark evaluating how large language models (LLMs) handle ontologies – formal systems that represent knowledge through concepts and relationships. Evaluation of 20 open-source LLMs, using 58,981 questions drawn from 40 ontologies spanning eight domains, reveals that models excel at understanding ontological knowledge but struggle with reasoning and learning. This demonstrates that current LLMs have clear limitations in manipulating symbolic knowledge, and establishes OntoURL as a tool for assessing progress in integrating LLMs with formal knowledge systems.

The capacity of large language models (LLMs) to process and apply structured knowledge remains a key area of investigation as these systems become increasingly integrated into complex applications. While adept at pattern recognition and natural language processing, their ability to manipulate formal, symbolic representations of knowledge – known as ontologies – has received less scrutiny. Researchers at the University of Groningen and Leiden University have addressed this gap with the development of a new benchmark, OntoURL, designed to rigorously evaluate LLMs’ ontological capabilities across understanding, reasoning, and learning. Xiao Zhang, Huiyuan Lai, and Johan Bos, from the CLCG at the University of Groningen, collaborated with Qianru Meng from LIACS at Leiden University to present their findings in a paper titled ‘OntoURL: A Benchmark for Evaluating Large Language Models on Symbolic Ontological Understanding, Reasoning and Learning’.

Evaluating Symbolic Reasoning in Large Language Models

Recent work introduces OntoURL, a benchmark designed to systematically assess the capacity of large language models (LLMs) to process and apply formal, symbolic knowledge encoded within ontologies. Ontologies are formal representations of knowledge as a set of concepts within a domain and the relationships between those concepts. This structured approach contrasts with the predominantly statistical methods used in LLM training.
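To make the contrast concrete, an ontology can be pictured as a set of subject–predicate–object statements over named concepts. The sketch below is purely illustrative – the concept names are invented and do not come from OntoURL or its source ontologies – but it shows the kind of explicit, symbolic structure involved, as opposed to the statistical associations an LLM learns.

```python
# Illustrative sketch only: a tiny ontology as subject-predicate-object
# triples. Concept names are hypothetical, not taken from OntoURL.
ontology = {
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("Dog", "hasPart", "Tail"),
}

def relations_of(concept, ontology):
    """Return every (predicate, object) pair asserted for a concept."""
    return {(p, o) for (s, p, o) in ontology if s == concept}

print(relations_of("Dog", ontology))
# {('subClassOf', 'Mammal'), ('hasPart', 'Tail')}
```

Identifying such directly stated facts is roughly what the benchmark's "understanding" dimension probes; the harder dimensions ask for conclusions that are not written down explicitly.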

The OntoURL benchmark consists of almost 60,000 questions generated from 40 distinct ontologies, offering a standardised method for evaluating LLM performance across three key dimensions: understanding, reasoning, and learning.

Results indicate that current open-source LLMs demonstrate a degree of competence in understanding ontological knowledge – successfully identifying concepts and relationships as defined within the ontologies. However, substantial limitations become apparent when evaluating reasoning and learning abilities. Models consistently underperform on tasks demanding inferential steps – drawing conclusions from stated facts – or the application of ontological knowledge to previously unseen scenarios. This discrepancy suggests a fundamental difference between an LLM’s capacity for memorisation and its ability to perform genuine symbolic manipulation.

Performance also varies significantly both between different LLMs and across different ontologies. This variability indicates a need for further investigation into the factors influencing performance, such as the complexity of the ontology or the specific training data used for the LLM. The study underscores the necessity for novel approaches to improve LLM reasoning capabilities and facilitate the effective utilisation of structured knowledge.

👉 More information
🗞 OntoURL: A Benchmark for Evaluating Large Language Models on Symbolic Ontological Understanding, Reasoning and Learning
🧠 DOI: https://doi.org/10.48550/arXiv.2505.11031
