Researchers are developing artificial intelligence to rapidly assess structural damage following disasters. Yuqing Gao, Guanren Zhou, and Khalid M. Mosalam, all of the University of California, Berkeley, present a new framework, LLM-DRS, which leverages large language models to summarise data collected during disaster reconnaissance. This work addresses a critical gap in current vision-based Structural Health Monitoring (SHM) systems, which typically provide only raw data requiring significant manual interpretation by engineers. By integrating image analysis with text-based metadata and employing carefully designed prompts, LLM-DRS generates comprehensive summary reports for individual structures or entire affected regions, offering a potentially transformative tool for improving the post-disaster resilience of the built environment.
The work is motivated by the rapid development of Artificial Intelligence (AI) technologies, which have brought automation, effectiveness, efficiency, and economy to many fields. Recent increases in disasters caused by natural hazards such as earthquakes, tsunamis, and hurricanes necessitate rapid assessment of the built environment to assist post-disaster rescue and reduce consequent losses. Many efforts in rapid assessment and SHM have applied Computer Vision (CV) and Deep Learning (DL) algorithms to images and videos of structures: vision-based SHM uses cameras and CV algorithms to detect subtle changes in appearance that may indicate damage or deterioration, and DL improves the accuracy and efficiency of the process because DL models can learn to classify and localise the visual patterns associated with different types of structural damage. However, previous works generated discrete outputs, such as damage class labels and damage region coordinates, requiring engineers to reorganise and analyse these results for further evaluation and decision-making. Moreover, a large amount of metadata, including geolocation, earthquake magnitude, building type, and overall ratings, is collected during reconnaissance but has not been fully exploited in previous AI-aided SHM studies.

This study proposes a novel Large Language Model (LLM)-based Disaster Reconnaissance Summarization (LLM-DRS) framework to address these limitations. The framework first introduces a standard reconnaissance plan, under which both vision data, such as images, and the corresponding metadata of structures are collected following a well-designed on-site investigation plan. Text-based metadata and image-based vision data are then processed and matched into one file, where well-trained Deep Convolutional Neural Networks from the PEER Hub ImageNet (φ-Net) extract key attributes, such as damage state, material type, and damage level, from the images. Finally, feeding all data into a GPT model with carefully crafted prompts, LLM-DRS generates a summary report for an individual structure, or for the affected regions based on the attributes and metadata from all investigated structures. The integration of LLMs in vision-based SHM, especially for rapid post-disaster reconnaissance, shows promising results, indicating the potential to achieve resilient built environments through effective reconnaissance.

The development of Natural Language Processing (NLP) has progressed through rule-based, statistics-based, machine learning-based, and deep learning-based methods. Since its introduction in 2017, the Transformer, distinguished by its attention mechanism, parallel processing capability, and capacity to capture long-range dependencies within natural language, has significantly advanced the field of NLP. The Transformer's architecture, characterised by its flexibility, scalability, and vast number of learnable parameters, enabled the effective use of the pre-training and fine-tuning paradigm in NLP tasks. Building on the Transformer's success, several variants, including Bidirectional Encoder Representations from Transformers (BERT) and the Generative Pre-trained Transformer (GPT) (Radford et al. 2018, 2019; Brown et al. 2020), have proven powerful.
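For readers unfamiliar with it, the attention mechanism mentioned above is, in its standard scaled dot-product form (a textbook formulation, not specific to this paper):

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
```

where Q, K, and V are the query, key, and value matrices and d_k is the key dimension. Because every token can attend to every other token in a single matrix operation, this is the source of both the long-range dependency modelling and the parallelism credited to the Transformer.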
Both BERT and GPT are pre-trained on vast corpora using unsupervised learning methods. BERT, derived from the Transformer's encoder, leverages bidirectional context and performs well on downstream tasks requiring contextual understanding, such as text classification. GPT, inspired by the Transformer's decoder, excels in generative tasks as a Language Model (LM), with ChatGPT being a well-known fine-tuned incarnation.

Recent research on AI-aided SHM and post-disaster reconnaissance has focused on adopting CV methods, especially Convolutional Neural Networks (CNNs), to automate tasks that previously required manual labour and to extract key information about a structure's health condition from photographs. Gao and Mosalam proposed the concept of Structural ImageNet and built the PEER Hub ImageNet (φ-Net), one of the largest open-source image datasets in the SHM area, using it as a benchmark for vision-based SHM studies. Gao and Mosalam (2020; 2023) further explored the power of the Transformer and developed the Multi-Attribute Multi-Task Transformer (MAMT2) framework to simultaneously perform structural image classification, localisation, and segmentation tasks.

The LLM-DRS framework successfully integrates vision data and metadata to generate comprehensive structural health assessment reports. Deep Convolutional Neural Networks, specifically models from the PEER Hub ImageNet (φ-Net), reliably extract key attributes from images, identifying damage state, material type, and damage level with high consistency. These extracted attributes, combined with the collected metadata, form the basis for detailed reconnaissance summaries. The system's performance relies on carefully designed prompts that guide the Large Language Model, enabling it to synthesise information and produce coherent reports for individual structures or entire affected regions. This multi-modal approach allows a more nuanced and informative analysis of structural health than previously possible, paving the way for more resilient built environments through effective reconnaissance efforts.

Prompt engineering proved crucial to the success of LLM-DRS, with carefully constructed prompts directing the AI model to generate assessment reports tailored to specific needs. The system's architecture, combining Convolutional Neural Networks, Natural Language Processing, and Large Language Models, demonstrates a novel approach to disaster reconnaissance. The persistent challenge of translating raw data into actionable insights after a disaster has long hampered effective response and recovery efforts. This framework, integrating LLMs with vision-based SHM, represents a shift towards automated reconnaissance, moving beyond simple damage detection to generate coherent, narrative summaries of structural conditions. What distinguishes this work is its emphasis on a standardised data collection process. By combining visual data with carefully recorded metadata, the system creates a richer, more contextualised understanding of damage, allowing the LLM to state where damage occurred, what type it is, and how severe it is likely to be, all in a human-readable report. The potential for streamlining post-disaster assessments, particularly in rapidly evolving situations, is considerable. However, the reliance on well-defined metadata introduces a potential bottleneck: the quality and consistency of on-site data collection will be critical.
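To make the data-matching and prompt-assembly steps concrete, here is a minimal Python sketch of how such a pipeline might be wired together. All names (StructureRecord, extract_attributes, build_prompt) and attribute values are illustrative assumptions, not taken from the paper, and the stub classifier stands in for the trained φ-Net models:

```python
# Hypothetical sketch of the data-matching and prompt-assembly steps
# described above; names and example values are illustrative only.
from dataclasses import dataclass, field

@dataclass
class StructureRecord:
    """One investigated structure: reconnaissance metadata plus the
    attributes extracted from each of its images."""
    structure_id: str
    metadata: dict
    image_attributes: list = field(default_factory=list)

def extract_attributes(image_path: str) -> dict:
    """Stand-in for the phi-Net-style CNN classifiers; a real system
    would run trained models on the image here."""
    return {"damage_state": "damaged",          # illustrative output
            "material": "reinforced concrete",
            "damage_level": "moderate"}

def build_prompt(record: StructureRecord) -> str:
    """Merge matched metadata and vision attributes into a single
    instruction for the language model (far simpler than the carefully
    crafted prompts the paper describes)."""
    lines = [
        f"Summarise the post-disaster condition of structure {record.structure_id}.",
        f"Metadata: {record.metadata}",
    ]
    for i, attrs in enumerate(record.image_attributes, start=1):
        lines.append(f"Image {i} attributes: {attrs}")
    lines.append("Write a concise reconnaissance report for an engineer.")
    return "\n".join(lines)

if __name__ == "__main__":
    record = StructureRecord(
        structure_id="B-014",  # fabricated example data
        metadata={"geolocation": (37.87, -122.26),
                  "earthquake_magnitude": 6.7,
                  "building_type": "mid-rise residential"},
    )
    record.image_attributes.append(extract_attributes("facade_01.jpg"))
    print(build_prompt(record))
```

The matching step, collecting every image's attributes under the structure they belong to alongside its metadata, is what gives the LLM enough context to write a report rather than a list of labels.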
Future work will likely focus on refining the prompting strategies for LLMs, improving the robustness of image analysis algorithms, and exploring the integration of this system with other data sources, such as drone imagery and sensor networks, to create a truly comprehensive picture of structural resilience.
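As a concrete illustration of the prompting step that such refinements would target, here is a minimal sketch of sending an assembled summary to a GPT model, assuming the OpenAI Python client and a generic chat model; the paper does not specify its exact tooling, model, or prompt wording:

```python
# Minimal sketch of the report-generation call, assuming the OpenAI
# Python client (pip install openai) with OPENAI_API_KEY set; the model
# name and prompt wording are assumptions, not taken from the paper.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a structural engineer writing post-disaster reconnaissance "
    "reports. Summarise the provided metadata and image attributes into "
    "a clear, factual assessment, and flag missing or inconsistent data."
)

def generate_report(structure_summary: str, model: str = "gpt-4o") -> str:
    """Send the matched metadata/attribute summary to the LLM and return
    the generated reconnaissance report."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": structure_summary},
        ],
        temperature=0.2,  # low temperature favours consistent reports
    )
    return response.choices[0].message.content

# e.g. report = generate_report(build_prompt(record))  # from the earlier sketch
```

The system prompt plays the role of the paper's carefully designed prompts: it fixes the reporting persona and output expectations so that per-structure summaries remain consistent across an affected region.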
👉 More information
🗞 A Large Language Model for Disaster Structural Reconnaissance Summarization
🧠 ArXiv: https://arxiv.org/abs/2602.11588
