MatchFixAgent Achieves 72.8% Functional Equivalence in Repository-Level Code Translation Validation and Repair, Exceeding 60.7% Accuracy

The increasing demand for code modernisation and cross-platform compatibility is driving significant research into automated code translation, yet ensuring the functional correctness of these translations remains a major challenge. Ali Reza Ibrahimzada, Brandon Paulsen, and Reyhaneh Jabbarvand, together with Joey Dodds and Daniel Kroening, present MatchFixAgent, a framework designed to validate and repair code translations across a wide range of programming languages. The work addresses the limitations of existing methods, which often struggle to generalise and rely on potentially flawed test suites, by employing a multi-agent system powered by large language models. The team demonstrates that MatchFixAgent achieves near-complete coverage in verifying translation pairs, correctly resolves a significant number of cases where its verdicts disagree with prior work, and, importantly, repairs a substantially greater proportion of faulty translations, marking a considerable advance in the reliability and adaptability of automated code translation tools.

Autonomous Repository-Level Code Translation Validation and Repair

MatchFixAgent is a new system that automatically verifies and corrects code translations between programming languages. This research introduces a method that works independently of the languages involved, ensuring the accuracy and reliability of translated code. The system generates a comprehensive set of tests from the history of the original code, then runs these tests on both the original and translated versions. Discrepancies in the test results highlight potential translation errors, which MatchFixAgent then attempts to diagnose and repair using techniques that pinpoint the source of failures and create targeted corrections, preserving the original intent of the code. Experiments on open-source projects demonstrate that MatchFixAgent accurately identifies and repairs translation errors, significantly reducing the manual effort needed to validate and maintain translated codebases.
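As a rough sketch of the differential-testing loop described above (this is not code from the paper; the helper names and test commands are hypothetical), the following Python snippet runs the same generated tests against the source project and its translation and flags any test whose outcome differs as a candidate semantic divergence for the repair stage to investigate.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class TestOutcome:
    test_id: str
    passed: bool
    output: str


def run_test(command: list[str]) -> TestOutcome:
    """Run one generated test as a subprocess and capture its outcome."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return TestOutcome(
        test_id=" ".join(command),
        passed=proc.returncode == 0,
        output=proc.stdout + proc.stderr,
    )


def differential_verdicts(source_cmds: list[list[str]],
                          translation_cmds: list[list[str]]) -> list[tuple[TestOutcome, TestOutcome]]:
    """Run the same generated tests on the source and the translation.

    Any test whose pass/fail status or observable output differs between
    the two versions is reported as a candidate translation bug that a
    repair step would then diagnose and fix.
    """
    divergences = []
    for src_cmd, dst_cmd in zip(source_cmds, translation_cmds):
        src, dst = run_test(src_cmd), run_test(dst_cmd)
        if src.passed != dst.passed or src.output != dst.output:
            divergences.append((src, dst))
    return divergences
```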

Language Models and Software Toolkits Evaluated

The research draws on several programming languages, large language models, and software toolkits. These include GPT-4o, Claude, Gemini Pro, and OpenAI Codex, alongside benchmarking platforms such as BigCodeBench and agentic development tools such as OpenHands and Moatless Tools. Supporting software and libraries span utilities for mathematical operations, checkdigit calculations, color conversion, and heap queue algorithms, as well as Python’s HTML parsing library. The work also builds on research published at software maintenance, testing, and engineering venues, including ICSME, ISSTA, FSE, and PLDI, and in journals such as Science China Information Sciences. Taken together, these resources reflect a strong focus on applying large language models to software engineering, with an emphasis on benchmarking and evaluating LLM capabilities across the lifecycle of code translation, improvement, testing, and maintenance.

MatchFixAgent Validates and Repairs Code Translations

This work presents MatchFixAgent, a technique that combines program analysis with large language model agents to automatically validate and repair code translations across multiple programming languages. The system systematically generates targeted tests that either demonstrate functional equivalence or expose semantic bugs in translated code, and it produces reports that help developers understand each verdict. MatchFixAgent delivers equivalence verdicts for the vast majority of translation pairs and correctly resolves a significant proportion of the cases where its verdicts disagree with existing techniques. Notably, it repairs a substantially higher percentage of inequivalent translations than prior work, demonstrating improved adaptability and precision. The system is also cost-effective and scalable, requiring minimal additional code to support new programming languages and validating each instance relatively quickly. To the best of the authors’ knowledge, MatchFixAgent is the first approach capable of validating and repairing translations at the repository level across multiple programming languages. An illustrative example of such a targeted test appears below.
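To make the idea of a targeted test concrete, here is a small, purely illustrative example (not taken from the paper, and the function names are hypothetical): a translation slip in which the source’s floor division becomes floating-point division, and a brute-force targeted test that surfaces inputs on which the two implementations disagree, providing evidence of semantic inequivalence for the repair stage.

```python
def source_midpoint(lo: int, hi: int) -> int:
    """Original behaviour: integer midpoint computed with floor division."""
    return (lo + hi) // 2


def translated_midpoint(lo: int, hi: int) -> float:
    """Faulty translation: '/' silently changed the result type and value."""
    return (lo + hi) / 2


def targeted_equivalence_test() -> list[tuple[int, int]]:
    """Return inputs on which the two implementations disagree."""
    counterexamples = []
    for lo in range(-5, 6):
        for hi in range(-5, 6):
            if source_midpoint(lo, hi) != translated_midpoint(lo, hi):
                counterexamples.append((lo, hi))
    return counterexamples


if __name__ == "__main__":
    # A non-empty list is evidence of semantic inequivalence that the
    # repair stage would then address (e.g. restoring floor division).
    print(targeted_equivalence_test()[:5])
```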

👉 More information
🗞 MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair
🧠 arXiv: https://arxiv.org/abs/2509.16187

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.
