Translation and Image Generation Framework Enhances Accessibility of Indian Poetry, Supporting UN Sustainable Development Goals 4 and 10

Indian poetry, renowned for its intricate language and profound cultural significance, frequently presents substantial challenges to translation and understanding, particularly for those unfamiliar with its nuances. Sofia Jamil, Kotla Sai Charan, and Sriparna Saha, from the Indian Institute of Technology Patna, alongside Koustava Goswami from Adobe Research and Joseph K J, introduce a novel framework to address this issue, enhancing access to this rich literary tradition. Their work centres on a Translation and Image Generation system that accurately translates complex Indian poetry into English and then creates corresponding visual representations, effectively bridging linguistic and cultural gaps. This achievement not only advances the field of computational linguistics but also supports the United Nations’ goals of quality education and reduced inequalities by making culturally significant poetry accessible to a global audience, and is bolstered by the introduction of a new dataset comprising over 1,500 poems in multiple Indian languages.

Poetry to Image Generation with Diffusion Models

This research explores generating images from poetry using diffusion models and large language models. A central challenge lies in capturing both the semantic meaning and aesthetic qualities of poetry within a visual representation. Researchers investigate techniques to improve the quality, coherence, and artistic merit of these generated images, employing both automated evaluation and human preference judgments. The ultimate goal is to create images that are not only visually appealing but also faithfully represent the poem’s content and emotional tone. The core of this work relies on diffusion models, powerful engines for image generation, and large language models which refine poem text to create effective prompts.

Techniques include carefully manipulating prompts, grounding the diffusion model with language model-generated information, and enhancing semantic understanding. Researchers also utilize preference learning and reward models to align image generation with human aesthetic preferences, exploring prompt tuning strategies and multimodal approaches. This research aims to generate higher-quality, more aesthetically pleasing images from poetry, while ensuring the generated images accurately reflect the poem’s meaning and themes.

The research utilizes Stable Diffusion alongside large language models like Mistral 7B, Gemma, Qwen2.5, and Sana for prompt engineering and semantic understanding. Long-CLIP handles long-form text, while techniques like Dreambooth and LoRA enable subject-driven image generation and efficient fine-tuning. ImageReward learns human preferences, RealignDiff improves semantic alignment, and Playground v2.5 enhances aesthetic quality. PoemTale Diffusion minimizes information loss, ORPO optimizes preferences, and BLIP facilitates language-image pre-training. The research employs automatic metrics and human evaluation to assess image quality, semantic accuracy, and aesthetic appeal. User studies and preference judgments are used to train reward models.

In summary, this research bridges the gap between natural language and visual art, leveraging the power of large language models and diffusion models to create compelling and meaningful images that capture the essence of poetic expression. The emphasis on human preference learning and cultural sensitivity demonstrates a commitment to creating AI-generated art that is both aesthetically pleasing and culturally relevant.

Indian Poetry Translation and Image Generation

This work pioneers the Translation and Image Generation (TAI) framework, designed to enhance access to culturally rich Indian poetry for a global audience. Researchers constructed the MorphoVerse dataset, comprising 1,570 poems across 21 diverse Indian languages, to overcome the scarcity of resources for low-resource poetry translation. Data collection involved a team of undergraduate students who verified authenticity from online sources, and rigorous data cleaning ensured consistency. The core of the TAI framework involves a three-stage process: translation, semantic graph construction, and image prompt creation.

The translation module leverages large language models to convert Indian poems into English, prioritizing the preservation of poetic essence and morphological features. Following translation, a semantic graph captures key tokens, dependencies, and metaphorical relationships within the poem’s text, providing a structured representation of its meaning. To generate visually compelling images, the team developed a method for creating appropriate image prompts, incorporating both linguistic information from the translated text and semantic knowledge extracted from the constructed graph. This combined approach aims to produce images that accurately reflect the poem’s meaning, cultural themes, and visual elements, formalizing poem-to-image generation as a text-to-image synthesis task.
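
As a rough illustration of how a semantic graph might feed an image prompt, the Python sketch below stores relationships as (head, relation, tail) triples and appends them to the translated text. The `SemanticGraph` class, the `build_prompt` function, the relation labels, and the example line are all hypothetical; the paper's actual graph construction and prompt template are not described here.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticGraph:
    """Hypothetical container for key tokens and their relationships."""
    nodes: set[str] = field(default_factory=set)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_edge(self, head: str, relation: str, tail: str) -> None:
        # Register both endpoints as nodes and record the labelled edge.
        self.nodes.update((head, tail))
        self.edges.append((head, relation, tail))


def build_prompt(translation: str, graph: SemanticGraph) -> str:
    """Combine the translated text with graph facts into one prompt string."""
    facts = "; ".join(
        f"{head} {relation.replace('_', ' ')} {tail}"
        for head, relation, tail in graph.edges
    )
    return f"An illustration of the poem: {translation.strip()} Key elements: {facts}."


# Invented example line and hand-written edges, for illustration only.
graph = SemanticGraph()
graph.add_edge("river", "carries", "grief")          # dependency-style relation
graph.add_edge("grief", "is_like", "monsoon rain")   # metaphorical relation

prompt = build_prompt("The river carries my grief to the sea.", graph)
print(prompt)
```

In a real system the edges would come from a parser or a language model rather than being written by hand; the point of the sketch is only that the prompt draws on both the translated text and the structured relationships.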

Indian Poetry Translation and Visualisation Framework

This work presents a groundbreaking framework for translating and visually representing Indian poetry, addressing a significant gap in accessibility for culturally rich, yet often linguistically complex, verse. Central to this achievement is the creation of the MorphoVerse dataset, a curated collection of 1,570 poems spanning 21 diverse Indian languages, providing a crucial resource previously lacking in the field. The team tackled the challenges of translating morphologically rich poetry by implementing an Odds Ratio Preference Optimization (ORPO) algorithm.

The ORPO-refined translation process steers the large language model toward poetically meaningful outputs rather than literal translations. Experiments demonstrate that ORPO effectively distinguishes between preferred and unpreferred translation styles, resulting in more accurate and evocative renderings of the poems. To further enhance visual comprehension, the researchers incorporated a semantic graph generation module, which analyzes translated poems to extract key tokens, dependencies, and metaphorical relationships, constructing a network that captures the underlying meaning and conceptual structure. The resulting graph then informs the creation of image prompts, enabling the generation of visually meaningful representations of the poems. The framework captures abstract meanings and subtle details, producing images that reflect the poetic intent. Quantitative analysis shows the TAI Diffusion model outperforming existing baseline methods in poem image generation tasks.
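
The ORPO objective mentioned above can be sketched numerically. In its published formulation, ORPO adds to the usual negative log-likelihood of the preferred output a penalty based on the log odds ratio between the preferred and the rejected output. The sketch below assumes length-normalised log-probabilities are already available for both translations; the function names, the example values, and the weight `lam=0.1` are illustrative assumptions, not the settings used by the researchers.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), with avg_logp < 0."""
    if avg_logp >= 0:
        raise ValueError("average log-probability must be negative")
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(logp_chosen: float, logp_rejected: float, lam: float = 0.1) -> float:
    """Supervised loss on the preferred output plus a weighted odds-ratio penalty."""
    nll = -logp_chosen  # standard negative log-likelihood term
    log_odds_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    penalty = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * penalty


# Invented log-probabilities: the model favours the poetic translation in the
# first call and the literal one in the second.
good = orpo_loss(logp_chosen=-0.5, logp_rejected=-2.0)
bad = orpo_loss(logp_chosen=-2.0, logp_rejected=-0.5)
print(good < bad)
```

The loss is lower when the model assigns higher likelihood to the preferred translation than to the rejected one, which is the behaviour the preference optimization is meant to encourage.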

Poetry Translation and Image Generation Framework

This research presents a novel framework, Translation and Image Generation (TAI), designed to enhance accessibility to the rich heritage of Indian poetry. Recognizing the challenges posed by the linguistic complexity and cultural nuances of these poems, the team developed a system that first translates morphologically rich poetry into English and then generates visually meaningful images representing the poem’s content. The framework integrates semantic graph knowledge to construct precise prompts for image generation, improving both translation quality and image accuracy. The team reports that the approach outperforms existing methods in generating images for poetic texts in both human and quantitative evaluations. A key contribution is the introduction of MorphoVerse, a new dataset comprising 1,570 poems across 21 diverse Indian languages, created to facilitate further research in this area.

👉 More information
🗞 Crossing Borders: A Multimodal Challenge for Indian Poetry Translation and Image Generation
🧠 ArXiv: https://arxiv.org/abs/2511.13689

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.
