A study by researchers at the University of Edinburgh reveals that advanced AI systems struggle to accurately interpret analogue clocks and calendars, despite excelling at complex tasks such as essay writing and art generation. The team tested multimodal large language models (MLLMs) on time-related questions using images of clocks with varying designs, including Roman numerals, stylized hands, and different colored dials, as well as calendar-based queries.
Results showed that the AI systems correctly identified clock-hand positions less than 25% of the time and made errors in date calculations about one-fifth of the time. The findings, which highlight fundamental gaps in AI’s ability to perform basic human tasks, will be presented at the Reasoning and Planning for Large Language Models workshop at the Thirteenth International Conference on Learning Representations (ICLR) in Singapore on April 28, 2025.
AI Struggles with Analogue Clocks and Calendars
A study by researchers at the University of Edinburgh revealed that advanced AI systems struggle to accurately interpret analogue clocks and calendars. Despite their ability to perform complex tasks like writing essays and generating art, these models often fail to correctly identify clock-hand positions or answer questions about dates on calendars. The research highlights a gap in AI’s capacity to combine spatial awareness, context, and basic mathematical reasoning—skills humans acquire early in life.
The team tested state-of-the-art multimodal large language models (MLLMs) by asking them to interpret images of clocks with various designs, including Roman numerals, different colored dials, and the presence or absence of a second hand. The results showed that the AI systems correctly identified clock-hand positions less than 25% of the time, with error rates notably higher when clocks incorporated stylized elements such as Roman numerals or decorative hands. Removing the second hand did not significantly improve performance, underscoring persistent difficulty in detecting hand positions and interpreting angles.
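To make that evaluation setup concrete, the sketch below shows one way such a clock-reading probe could be scored. The query_model interface, the prompt wording, and the HH:MM answer format are illustrative assumptions; the article does not describe the study’s actual test harness.

    # A minimal Python sketch of scoring a clock-reading probe.
    # query_model is a placeholder for whatever multimodal model API is under
    # test; it is an assumption for illustration, not the study's own code.

    def clock_reading_accuracy(samples, query_model):
        """samples: list of (image_path, true_time) pairs, true_time as 'HH:MM'."""
        correct = 0
        for image_path, true_time in samples:
            answer = query_model(
                image=image_path,
                prompt="What time does this clock show? Answer only as HH:MM.",
            )
            if answer.strip() == true_time:
                correct += 1
        return correct / len(samples)  # the study reports accuracy below 0.25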
The study also examined the models’ proficiency in calendar-based tasks, such as identifying holidays or calculating past and future dates. Even the most accurate systems made errors in date calculations approximately 20% of the time. These findings underscore fundamental limitations in AI systems’ ability to manage seemingly straightforward yet contextually rich real-world tasks, despite their advanced capabilities in other domains.
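For the calendar tasks, the questions reduce to ordinary date arithmetic that software can verify exactly, which is what makes a roughly 20% error rate striking. The snippet below is a hypothetical example of such a question, not one drawn from the study’s dataset, with Python’s standard datetime module serving as the ground-truth oracle.

    # Ground truth for a calendar-style question, e.g. "What day of the week
    # falls 100 days after 1 January 2025?" The question is an invented example.
    from datetime import date, timedelta

    def day_of_week_after(start: date, days: int) -> str:
        return (start + timedelta(days=days)).strftime("%A")

    print(day_of_week_after(date(2025, 1, 1), 100))  # prints 'Friday'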
The researchers stress that addressing these challenges is essential for advancing AI integration into time-sensitive applications such as scheduling assistants, autonomous robots, and tools designed for individuals with visual impairments. The study will be presented at the Reasoning and Planning for Large Language Models workshop during the Thirteenth International Conference on Learning Representations (ICLR) in Singapore, highlighting the importance of ongoing research to enhance AI capabilities in handling time-related tasks effectively.
Implications for Real-World Applications of AI Systems
The ability of AI systems to interpret time accurately is critical for their integration into real-world applications. The study exposes significant weaknesses in this area: models showed low accuracy both when reading clock faces and when answering date-related questions, pointing to the need for stronger reasoning and contextual understanding in any system that must handle precise temporal information.
These weaknesses matter most in time-sensitive domains such as scheduling assistants, autonomous robots, and assistive tools for people with visual impairments. That current models stumble on such seemingly simple yet contextually rich tasks indicates that further work is needed on how they process and interpret temporal data.
Until these gaps are closed, AI’s full potential in practical applications will remain unrealized. The study’s presentation at the ICLR workshop underscores the importance of continued research into making AI reliable on time-related tasks.
