The escalating challenge of monitoring progress towards the United Nations’ Sustainable Development Goals (SDGs) necessitates innovative approaches to data analysis, particularly within the realm of textual information. Researchers are increasingly turning to large language models (LLMs), a type of artificial intelligence capable of understanding and generating human language, to automate the classification of text relating to these global objectives. A collaborative study, detailed in a forthcoming publication, investigates the performance of several LLMs – both proprietary and open-source – when applied to the task of identifying text relevant to the 17 SDGs. Andrea Cadeddu, Alessandro Chessa, and Vincenzo De Leo from Linkalab s.r.l., alongside Gianni Fenu, Diego Reforgiato Recupero, and Luca Secchi from the University of Cagliari, partnered with Enrico Motta and Francesco Osborne from the Knowledge Media Institute at The Open University to conduct a comparative analysis of task adaptation techniques, including zero-shot learning, few-shot learning, and fine-tuning. Their work, entitled “A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals”, demonstrates that strategically optimised smaller models can achieve results comparable to those of larger, more computationally intensive systems such as OpenAI’s GPT.
Given the sheer volume and complexity of the information involved, the team frames SDG monitoring as a text classification problem: categorising textual data according to its relevance to specific goals. The task is single-label and multi-class, meaning each text segment is assigned to exactly one of the 17 SDGs.
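To make the task shape concrete, here is a minimal Python sketch; the example texts and their labels are invented for illustration and are not drawn from the study’s dataset.

```python
# Single-label, multi-class: every text segment receives exactly one
# of the 17 SDG labels. Texts and labels below are invented examples.

SDG_LABELS = [f"SDG{i}" for i in range(1, 18)]  # the 17 goals

examples = [
    ("Expanding access to clean drinking water in rural areas.", "SDG6"),
    ("Cutting carbon emissions from heavy industry.", "SDG13"),
    ("Raising primary-school enrolment among girls.", "SDG4"),
]

# A prediction is valid only if it is exactly one of the 17 labels,
# never a set of goals and never a label outside the list.
assert all(label in SDG_LABELS for _, label in examples)
```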
The research team first explored zero-shot learning, in which an LLM performs the task without any task-specific training examples, relying solely on the knowledge acquired during its pre-training on vast text corpora. They then investigated few-shot learning, in which a small number of labelled examples are supplied to guide the model’s classification. Both regimes contrast with traditional machine-learning approaches, which require substantial labelled datasets.
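The contrast between the two regimes can be sketched as prompt construction; the wording and the demonstration below are assumptions for illustration, not the prompts used in the paper.

```python
# Hedged illustration of zero-shot vs. few-shot prompting; the phrasing
# and demonstrations are invented, not the study's actual prompts.

def zero_shot_prompt(text: str) -> str:
    # Zero-shot: only an instruction and the text to classify;
    # the model relies entirely on its pre-trained knowledge.
    return (
        "Classify the following text into exactly one of the 17 UN "
        "Sustainable Development Goals (SDG1-SDG17). Reply with the label only.\n\n"
        f"Text: {text}\nLabel:"
    )

def few_shot_prompt(text: str, demos: list[tuple[str, str]]) -> str:
    # Few-shot: a handful of labelled examples precede the query,
    # steering the model without updating any parameters.
    shots = "\n\n".join(f"Text: {t}\nLabel: {lab}" for t, lab in demos)
    return (
        "Classify each text into exactly one of the 17 UN SDGs "
        "(SDG1-SDG17). Reply with the label only.\n\n"
        f"{shots}\n\nText: {text}\nLabel:"
    )

demos = [("Ending hunger through resilient agriculture.", "SDG2")]
print(few_shot_prompt("Affordable housing for informal settlements.", demos))
```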
Researchers found that appropriately optimised smaller LLMs can match the performance of much larger models on SDG text classification, offering a more accessible and resource-efficient option for SDG monitoring. Prompt engineering, the careful crafting of the input text to guide the model’s reasoning, plays a crucial role in achieving the best results: effective prompts supply context, clear instructions, and output constraints, all of which shape the model’s answer.
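One way this might look in practice, assuming abbreviated goal descriptions and a strict output format (neither taken from the study):

```python
# Hypothetical prompt-engineering sketch: enrich the instruction with
# short goal descriptions and pin down the output format. The hints are
# abbreviated paraphrases, not official UN wording.

SDG_HINTS = {
    "SDG3": "good health and well-being",
    "SDG7": "affordable and clean energy",
    "SDG13": "climate action",
    # ...the remaining goals would be listed in the same way
}

def engineered_prompt(text: str) -> str:
    context = "\n".join(f"{k}: {v}" for k, v in SDG_HINTS.items())
    return (
        "You are labelling texts with UN Sustainable Development Goals.\n"
        f"Goal reference:\n{context}\n\n"
        "Rules: choose exactly one goal; output only its label, e.g. 'SDG13'.\n\n"
        f"Text: {text}\nLabel:"
    )

print(engineered_prompt("Solar micro-grids for off-grid villages."))
```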
While zero-shot and few-shot learning allow faster and more cost-effective deployment, fine-tuning consistently yielded the highest performance. Fine-tuning involves further training an existing LLM on a specific labelled dataset, adjusting its internal parameters to maximise accuracy on the target task; to this end, the researchers curated a high-quality dataset of labelled SDG-related text.
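A minimal fine-tuning sketch using the Hugging Face Transformers library is shown below; the base model, file names, and hyperparameters are placeholders, not the configuration reported in the paper.

```python
# Illustrative fine-tuning sketch with Hugging Face Transformers; the
# model choice, file names, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

BASE = "bert-base-uncased"  # stand-in for whichever smaller model is tuned

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForSequenceClassification.from_pretrained(
    BASE, num_labels=17)  # one output class per SDG

# Assumes CSVs with a 'text' column and a 'label' column encoded 0-16.
data = load_dataset("csv", data_files={"train": "sdg_train.csv",
                                       "validation": "sdg_val.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

data = data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sdg-classifier",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

# Training updates the model's internal parameters on the labelled data.
Trainer(model=model, args=args,
        train_dataset=data["train"],
        eval_dataset=data["validation"]).train()
```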
The balance between development speed and final performance remains a key consideration. The researchers acknowledge that the best choice depends on the requirements of the application: the availability of labelled data, the level of accuracy needed, and the constraints on time and computational resources.
Future research should explore techniques for improving the robustness and generalisability of LLMs for SDG monitoring, addressing challenges such as data bias, domain adaptation, and adversarial attacks. LLMs are susceptible to biases present in the training data, potentially leading to skewed or unfair classifications. Researchers emphasise the importance of developing techniques for mitigating these biases and ensuring equitable outcomes. Domain adaptation, the ability to apply a model trained on one type of data to a different but related domain, is also crucial for ensuring the model’s effectiveness across diverse SDG contexts.
The study’s findings have significant implications for the broader field of sustainable development, demonstrating the potential of LLMs to accelerate progress towards the SDGs. Researchers envision a future where LLMs are used to automate and enhance a wide range of tasks related to SDG monitoring, including data collection, analysis, and reporting.
In conclusion, this study demonstrates that appropriately optimised smaller LLMs can offer a more accessible and resource-efficient route to SDG monitoring. The findings underscore the importance of prompt engineering and fine-tuning, and of addressing challenges such as data bias and domain adaptation, highlighting the potential of LLMs to accelerate progress towards a more sustainable and equitable future. Continued investment in research and development will be essential to unlock that potential for the world’s most pressing challenges.
👉 More information
🗞 A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals
🧠 DOI: https://doi.org/10.48550/arXiv.2506.15208
