Research demonstrates that a new framework using large language models (LLMs) can automate credit card fraud investigations and generate explanatory reports. In an evaluation on 500 cases, the system reliably completed investigations in an average of seven steps, which could reduce analyst workload and improve the efficiency of fraud detection processes.
The escalating volume of online commerce presents a persistent challenge for financial institutions, as increasingly sophisticated fraud schemes target stolen credit card information. Analysts tasked with investigating suspicious transactions face a relentless tide of alerts, often leading to diminished effectiveness and alert fatigue. Researchers at Ben-Gurion University of the Negev, including Shaun Shuster, Eyal Zaloof, Asaf Shabtai, and Rami Puzis, address this issue in their work, “FAA Framework: A Large Language Model-Based Approach for Credit Card Fraud Investigations”. They propose a novel framework utilising large language models (LLMs), a type of artificial intelligence trained on vast datasets of text and code, to automate aspects of fraud investigation and generate comprehensive reports. The framework leverages the reasoning and analytical capabilities of LLMs to streamline evidence collection and analysis, potentially reducing the burden on human analysts and improving the efficiency of fraud detection systems.
Analysts confront a rising tide of e-commerce fraud, and constant transaction monitoring contributes to alert fatigue. To address this, the researchers developed a fraud analyst assistant (FAA) framework that uses LLMs, which can understand, generate, and manipulate human language, to automate credit card fraud investigations and generate comprehensive reports. The FAA combines LLM capabilities, including reasoning, code execution, and visual analysis, to streamline the investigative process and help fraud teams operate more efficiently.
The FAA framework centres around a collaborative multi-agent architecture, comprising a Fraud Analyst agent, a Vision Agent, a Report Generation Agent (RGA), and an Evidence Quality Assessor, each performing a distinct role to facilitate structured communication and efficient information processing. The Fraud Analyst directs investigations, requesting data analysis and synthesising findings, while the Vision Agent interprets visualisations to identify anomalies and patterns relevant to each case. The RGA meticulously structures the investigation’s progression and evidence, and the Evidence Quality Assessor critically evaluates the reliability and relevance of presented information, providing feedback to refine future investigations.
The researchers designed the FAA framework to ease the burden on fraud analysts caused by escalating e-commerce fraud and by the alert fatigue that follows from high volumes of transaction monitoring alerts. The system uses LLMs with reasoning, code execution, and vision capabilities to plan investigations, collect evidence, and analyse findings, taking on work typically handled by human analysts.
The FAA framework assists fraud investigation through a carefully orchestrated multi-agent system, with the aim of improving efficiency and analytical rigour. The system’s prompts focus on clear role definition and structured output, so that each agent performs its designated task consistently. The Vision Agent prompt, for example, instructs the agent to act as a seasoned fraud analyst and to explain complex visualisations step by step, while the RGA prompt mandates a specific report format and details the required structure for documentation. The Evidence Quality Assessor rates evidence on a Likert scale, a psychometric scale commonly used in research to measure attitudes or opinions. The rating is meant to discourage lenient assessments and to keep the analysis of findings critical.
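A Likert-style rating of the kind described above could be represented as follows. This is a minimal sketch, not the authors' implementation: the five-point range, the labels, and the names `LIKERT_LABELS` and `EvidenceRating` are assumptions made for illustration, since the article does not state the scale length or wording. The two rated dimensions, reliability and relevance, are the ones the article attributes to the Evidence Quality Assessor.

```python
from dataclasses import dataclass

# Hypothetical five-point scale; the article does not give the real labels.
LIKERT_LABELS = {
    1: "very weak",
    2: "weak",
    3: "moderate",
    4: "strong",
    5: "very strong",
}


@dataclass(frozen=True)
class EvidenceRating:
    """One assessor verdict on a piece of evidence (illustrative structure)."""

    reliability: int
    relevance: int

    def __post_init__(self):
        # Reject anything that is not an integer point on the scale.
        for name in ("reliability", "relevance"):
            value = getattr(self, name)
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or value not in LIKERT_LABELS:
                raise ValueError(
                    f"{name} must be an integer from 1 to 5, got {value!r}"
                )

    def summary(self) -> str:
        """Return a human-readable version of the two scores."""
        return (
            f"reliability: {LIKERT_LABELS[self.reliability]}, "
            f"relevance: {LIKERT_LABELS[self.relevance]}"
        )
```

Restricting scores to fixed points on a scale makes ratings comparable across cases, and rejecting out-of-range values catches malformed model output early.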
Researchers evaluated the FAA framework on 500 credit card fraud investigations and found that it conducted them reliably and efficiently, averaging seven steps per investigation. The framework’s success hinges on the careful design of prompts for each agent, which supports clear communication and accurate task execution. Structured communication through JSON output, a lightweight data-interchange format, further eases integration with existing fraud detection systems. The Evidence Quality Assessor, which uses a Likert scale for standardised rating, helps maintain the system’s accuracy and guards against biased conclusions.
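To illustrate why JSON output helps with integration, the sketch below parses and validates one agent message before it is passed downstream. The schema is hypothetical: the field names `agent`, `step`, and `content`, the function `parse_agent_message`, and the sample message are invented for this example and do not come from the paper.

```python
import json

# Hypothetical message schema: field name -> required Python type.
REQUIRED_FIELDS = {"agent": str, "step": int, "content": str}


def parse_agent_message(raw: str) -> dict:
    """Parse one agent's JSON output and check it against the schema."""
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("agent output must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in message:
            raise ValueError(f"missing field: {field}")
        if not isinstance(message[field], expected_type):
            raise ValueError(f"field {field} must be {expected_type.__name__}")
    return message


# Made-up sample message, for illustration only.
example = json.dumps(
    {"agent": "vision", "step": 3, "content": "Cluster of late-night purchases."}
)
parsed = parse_agent_message(example)
```

Because every message has the same machine-checkable shape, a downstream fraud detection system can consume agent output without parsing free text.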
The FAA framework’s multi-agent architecture supports a collaborative approach to fraud investigation. In its iterative design, the Fraud Analyst agent requests analysis, the Vision Agent interprets the results, the RGA documents the process, and the Evidence Quality Assessor provides feedback, which allows for continuous learning and improvement.
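The iterative cycle can be sketched as a simple loop. This is an illustrative outline under stated assumptions, not the paper's code: the agents are plain Python callables rather than LLM calls, and the names `run_investigation`, `plan_next`, `interpret`, `assess`, and the `max_steps` budget are all hypothetical.

```python
from typing import Callable, List, Optional


def run_investigation(
    plan_next: Callable[[List[str]], Optional[str]],
    interpret: Callable[[str], str],
    assess: Callable[[str], int],
    max_steps: int = 10,
) -> List[dict]:
    """Run request -> interpret -> assess -> document until done or out of steps."""
    findings: List[str] = []
    log: List[dict] = []
    for step in range(1, max_steps + 1):
        # Fraud Analyst role: choose the next analysis, or None to stop.
        request = plan_next(findings)
        if request is None:
            break
        # Vision Agent role: interpret the requested analysis.
        finding = interpret(request)
        # Evidence Quality Assessor role: score the finding.
        score = assess(finding)
        findings.append(finding)
        # Report Generation role: record the step for the final report.
        log.append(
            {"step": step, "request": request, "finding": finding, "score": score}
        )
    return log


# Stub agents standing in for LLM calls (assumptions for the example).
def stub_plan(findings: List[str]) -> Optional[str]:
    queue = ["amount over time", "merchant categories"]
    return queue[len(findings)] if len(findings) < len(queue) else None


def stub_interpret(request: str) -> str:
    return f"no anomaly noted in {request}"


def stub_assess(finding: str) -> int:
    return 3


report_log = run_investigation(stub_plan, stub_interpret, stub_assess)
```

Passing the agents in as callables keeps the control flow testable without any model access, and the step budget bounds the length of an investigation.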
Future work will focus on enhancing the FAA framework’s capabilities and exploring new avenues for automation and improvement. Researchers plan to investigate the integration of additional data sources and the development of more sophisticated analytical techniques. They also aim to refine the prompts and algorithms used by each agent to further improve the accuracy and efficiency of the system. The ultimate goal is to create a fully automated fraud detection and prevention system that can effectively protect businesses and consumers from the growing threat of e-commerce fraud.
👉 More information
🗞 FAA Framework: A Large Language Model-Based Approach for Credit Card Fraud Investigations
🧠 DOI: https://doi.org/10.48550/arXiv.2506.11635
