Performance bottlenecks in code are a persistent challenge for developers, often requiring painstaking manual optimisation or complex, limited rule-based systems. Yue Wu and Minghao Han from Carnegie Mellon University, together with Ruiyin Li, Peng Liang from Wuhan University, Amjed Tahir from Massey University, and Zengyang Li, present a new approach to this problem: FasterPy, a framework that leverages large language models to automatically improve the execution speed of Python code. FasterPy distinguishes itself by combining a retrieval-augmented generation technique with a parameter-efficient adaptation method, allowing it to learn from examples of performance improvements without requiring extensive, costly training data. The team demonstrates that FasterPy significantly outperforms existing methods on standard benchmarks, offering a promising pathway towards more efficient and scalable code optimisation tools.
FasterPy, LLMs and Retrieval-Augmented Code Optimization
The team developed FasterPy, a framework that leverages Large Language Models (LLMs) to enhance the execution efficiency of Python code, addressing limitations of both traditional and machine-learning approaches to code optimisation. Recognising the shortcomings of manually designed rule-based systems and the heavy data dependency of prior machine-learning methods, the researchers built a low-cost, efficient solution centred on adapting pre-trained LLMs.
The work combines Retrieval-Augmented Generation (RAG) with Low-Rank Adaptation (LoRA) to achieve significant gains on code optimisation tasks. Central to FasterPy is a knowledge base of existing code pairs that demonstrate performance improvements, together with their corresponding performance measurements; this knowledge base is the foundation of the RAG component. During optimisation, RAG retrieves relevant code examples and performance data, providing contextual information that guides the LLM and augments its inherent code understanding.
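The retrieval step can be sketched as a similarity search over an embedded knowledge base of slow/fast code pairs. The schema, the hand-made embedding vectors, and the cosine-similarity ranking below are illustrative assumptions for the sketch, not the paper's actual retrieval model:

```python
import math

# Toy knowledge base: each entry pairs a slow snippet with its optimised
# version and a pre-computed embedding vector. In a real system the
# embeddings would come from a code embedding model; these are stand-ins.
KNOWLEDGE_BASE = [
    {"slow": "result = []\nfor x in xs:\n    result.append(x * x)",
     "fast": "result = [x * x for x in xs]",
     "embedding": [0.9, 0.1, 0.0]},
    {"slow": "s = ''\nfor w in words:\n    s += w",
     "fast": "s = ''.join(words)",
     "embedding": [0.1, 0.9, 0.1]},
]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_embedding, k=1):
    """Return the k knowledge-base entries most similar to the query."""
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda e: cosine(query_embedding, e["embedding"]),
                    reverse=True)
    return ranked[:k]

# A query whose embedding is closest to the list-building example.
best = retrieve([0.8, 0.2, 0.0], k=1)[0]
print(best["fast"])  # → result = [x * x for x in xs]
```

The retrieved pair (and, in FasterPy, its performance measurements) then becomes in-context guidance for the model rather than a direct rewrite rule.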
To refine the LLM’s optimisation capabilities without extensive retraining, the team implemented Low-Rank Adaptation (LoRA), a parameter-efficient technique that introduces a small number of trainable parameters, minimising computational cost and enabling fast adaptation to the specific task of code optimisation. Experiments employed the Performance Improving Code Edits (PIE) benchmark to evaluate FasterPy against existing state-of-the-art models, demonstrating superior results across multiple key metrics.
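The core idea behind LoRA can be shown in a few lines: a frozen pretrained weight matrix W is augmented with a trainable low-rank product B·A, scaled by alpha/r, so only the small A and B matrices are updated during fine-tuning. The dimensions and values below are toy stand-ins, not FasterPy's actual configuration:

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha=2.0, r=1):
    """Forward pass with a LoRA adapter: y = W x + (alpha / r) * B A x."""
    base = matvec(W, x)                  # frozen pretrained path
    low_rank = matvec(B, matvec(A, x))   # trainable low-rank path
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

# 2x2 frozen weights with a rank-1 adapter (A is 1x2, B is 2x1).
# In real models r is far smaller than the hidden size, so A and B hold
# only a tiny fraction of the parameters in W.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]
B = [[1.0], [0.0]]

print(lora_forward(W, A, B, [2.0, 4.0]))  # → [8.0, 4.0]
```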
The methodology involved submitting Python code snippets to FasterPy, letting the system propose optimisations via the RAG-enhanced, LoRA-adapted LLM, and then measuring the resulting improvements in execution time. This measurement-driven approach allowed a quantitative assessment of FasterPy’s ability to identify and apply performance-enhancing code transformations, showcasing its potential to automate and accelerate code optimisation. The FasterPy tool and complete experimental results are publicly available, facilitating further research and adoption within the software engineering community.
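The kind of runtime comparison described above can be sketched with the standard library's `timeit` module; the snippets timed here are illustrative examples, not programs from the PIE benchmark:

```python
import timeit

# Original snippet and a candidate optimisation, timed under identical
# conditions. Taking the minimum over several repeats reduces noise from
# other processes on the machine.
original = "result = []\nfor x in range(1000):\n    result.append(x * x)"
optimised = "result = [x * x for x in range(1000)]"

t_orig = min(timeit.repeat(original, number=1000, repeat=5))
t_opt = min(timeit.repeat(optimised, number=1000, repeat=5))

print(f"speedup: {t_orig / t_opt:.2f}x")
```

Wall-clock timing like this is sensitive to system load, which is one reason the authors later mention lower-level instrumentation as future work.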
FasterPy Optimizes Python Code with LLMs
Scientists have developed FasterPy, a new framework that leverages the power of Large Language Models (LLMs) to optimise the execution efficiency of Python code, delivering a low-cost and efficient solution to a longstanding problem in software engineering. The research addresses the limitations of traditional rule-based methods and machine learning approaches, which often require extensive manual effort or are constrained by specific program representations and training datasets.
FasterPy uniquely combines Retrieval-Augmented Generation (RAG) with Low-Rank Adaptation (LoRA) to enhance code optimisation performance, creating a system capable of automatically improving code runtime. The core of FasterPy is a knowledge base, meticulously constructed from existing code pairs demonstrating performance improvements and their associated measurements, which supports the RAG component. This knowledge base enables semantic similarity matching, allowing the system to identify relevant optimisation suggestions for incoming code.
FasterPy feeds this retrieved information into a LoRA-enhanced LLM to automatically generate optimised code versions, targeting function-level improvements without altering the code’s core functionality. Results on the Performance Improving Code Edits (PIE) benchmark demonstrate FasterPy’s effectiveness, consistently outperforming existing models across multiple key metrics. The work focuses on execution efficiency, specifically runtime performance, and the framework is particularly well suited to Python, a language widely used in artificial intelligence that benefits notably from optimisation because of its inherent runtime overhead.
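The augmentation step, assembling the retrieved examples and the target function into a single model input, might look roughly like the following; the prompt template and field names are assumptions for illustration, not FasterPy's actual prompt:

```python
def build_prompt(target_code, retrieved):
    """Assemble an optimisation prompt from a target function and
    retrieved slow/fast example pairs (illustrative template)."""
    parts = ["Optimise the Python function below for execution speed "
             "without changing its behaviour.\n"]
    for i, ex in enumerate(retrieved, 1):
        parts.append(f"Example {i} (slow):\n{ex['slow']}\n"
                     f"Example {i} (fast):\n{ex['fast']}\n")
    parts.append(f"Target function:\n{target_code}\n")
    return "\n".join(parts)

examples = [{"slow": "s = ''\nfor w in words:\n    s += w",
             "fast": "s = ''.join(words)"}]
target = ("def join_all(words):\n"
          "    s = ''\n"
          "    for w in words:\n"
          "        s += w\n"
          "    return s")

prompt = build_prompt(target, examples)
print(prompt)
```

The model then generates a rewritten function, which is validated for behaviour and timed, in line with the function-level, behaviour-preserving goal described above.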
This work opens a promising new direction for automated code optimisation, offering a scalable and adaptable solution for improving software performance. The FasterPy tool and complete experimental results are publicly available, facilitating further research and development in this area.
FasterPy Optimizes Python Code with AI
FasterPy represents a significant advance in automated code optimisation, addressing limitations found in traditional rule-based and machine learning approaches. Researchers developed a framework that leverages the capabilities of large language models to improve the execution efficiency of Python code, offering a low-cost and scalable alternative to existing methods.
The system employs a retrieval-augmented generation technique, combining information from a knowledge base of performance-improving code examples with low-rank adaptation to refine the language model’s performance. Evaluations using a standard benchmark demonstrate FasterPy’s ability to enhance both code execution speed and correctness, establishing its potential for practical application in code optimisation tasks.
The team constructed a substantial dataset of over 60,000 code examples, pairing inefficient code with optimised versions and detailed summaries of the changes made, which forms the foundation of the system’s knowledge base. While the current implementation relies on code semantic similarity for knowledge retrieval, the authors acknowledge that the retrieval module could benefit from models specifically designed for execution efficiency, and they plan to explore rule-based code slicing to pinpoint performance bottlenecks more effectively.
Future work also includes expanding evaluation to real-world code from platforms like GitHub and incorporating lower-level instrumentation techniques to improve the precision of performance measurements, acknowledging that current timing methods can be influenced by external system factors.
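A single entry in the knowledge base described above, an inefficient snippet paired with its optimised version, a summary of the change, and the associated measurements, might be structured roughly as follows. The field names and the timing values are hypothetical, chosen only to illustrate the shape of the data:

```python
# Hypothetical record structure for one of the ~60,000 knowledge-base
# entries. The schema and the example runtimes are illustrative
# assumptions, not values from the paper's dataset.
record = {
    "slow_code": "total = 0\nfor v in values:\n    total = total + v",
    "fast_code": "total = sum(values)",
    "change_summary": "Replace the manual accumulation loop with the "
                      "built-in sum().",
    "measurements": {"runtime_before_s": 1.84, "runtime_after_s": 0.31},
}

def speedup(rec):
    """Ratio of original to optimised runtime for one record."""
    m = rec["measurements"]
    return m["runtime_before_s"] / m["runtime_after_s"]

print(f"{speedup(record):.1f}x faster")  # → 5.9x faster
```

Storing the measurements alongside each pair is what lets the retrieval component surface not just similar code, but evidence of how much a given transformation helped.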
👉 More information
🗞 FasterPy: An LLM-based Code Execution Efficiency Optimization Framework
🧠 ArXiv: https://arxiv.org/abs/2512.22827
