Using Reinforcement Learning to Optimize Music Generation Based on Aesthetic Ratings

On April 23, 2025, researchers Nicolas Jonason, Luca Casini, and Bob L. T. Sturm published “SMART: Tuning a symbolic music generation system with an audio domain aesthetic reward.” The paper explores how reinforcement learning, guided by Meta’s Audiobox Aesthetics ratings, can refine a piano MIDI model so that it produces more appealing compositions, and what that refinement costs in output diversity.

The study investigates whether an aesthetic rating model can be used to fine-tune a symbolic music generation system via reinforcement learning. Using group relative policy optimization (GRPO), the researchers fine-tuned a piano MIDI model with Meta Audiobox Aesthetics ratings as the reward. The optimization changed low-level features of the generated output and increased average subjective ratings in a listening test. However, over-optimization significantly reduced the diversity of the model’s outputs.
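The core idea of group relative policy optimization can be illustrated with a minimal sketch. It assumes the commonly described formulation, in which several outputs are sampled for the same prompt and each reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` constant, and the example scores are hypothetical; the article does not describe the researchers' actual implementation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group (samples from one prompt).

    Outputs scoring above the group mean get a positive advantage and
    outputs scoring below it get a negative one. `eps` guards against
    division by zero when every reward in the group is identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical aesthetic scores for four pieces generated from one prompt.
scores = [5.0, 6.0, 7.0, 8.0]
advantages = group_relative_advantages(scores)
```

Because the advantages are relative to the group, no separate value network is needed: the policy is simply pushed toward the samples that the reward model rated above their siblings and away from those rated below.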

Recent advances in machine learning have significantly improved models’ ability to generate high-quality symbolic music, such as MIDI files or sheet music. A notable development is the application of large language model (LLM) training methods to music data, exemplified by NotaGen. This approach has been reported to outperform earlier methods in musicality and creativity.

The experiments used several soundfonts, including MuseScore, FluidR3, Grandeur, and Yamaha. A soundfont is the collection of instrument samples used to render MIDI as audio, so it shapes how the intended musical nuances are reproduced and how good the generated music is perceived to be. The choice of soundfont can significantly affect how realistic and expressive the output sounds.

NotaGen was trained on a diverse dataset featuring works by composers such as Chopin, Mozart, and Philip Glass, which supports varied and nuanced generation. Evaluations using a linear mixed-effects model found that NotaGen’s generated music received higher ratings than other systems, with statistically significant results (p < 0.001). This suggests that listeners perceive NotaGen’s output as more appealing or of higher quality.
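For readers unfamiliar with the analysis mentioned above, a linear mixed-effects model for listening-test ratings is commonly written in the following generic form. This is a textbook sketch, not the specification used in the study, which the article does not give; all symbols are assumptions introduced for illustration.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the $i$-th rating given by listener $j$, $x_{ij}$ indicates which system produced the rated excerpt, $\beta_1$ is the fixed effect of the system, $u_j$ is a random intercept that absorbs each listener's overall tendency to rate high or low, and $\varepsilon_{ij}$ is residual noise. A reported p-value such as p < 0.001 would then refer to a test of whether $\beta_1$ differs from zero.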

Looking ahead, integrating real-time feedback and multimodal approaches could enhance interactivity and creativity. Techniques that preserve distinct musical styles while still allowing for innovation are essential, so that the model doesn’t blend genres into an indistinct mix. Additionally, methods like CLaMP 3 may help maintain coherence across different aspects of music generation.

NotaGen represents a significant advance in symbolic music generation by applying LLM training methods to music. While the results are promising, further detail on the technical approach and on how creativity is measured would give deeper insight into the model’s capabilities. This work opens possibilities for future developments in AI-generated music, with potential for both artistic expression and practical applications.

👉 More information
🗞 SMART: Tuning a symbolic music generation system with an audio domain aesthetic reward
🧠 DOI: https://doi.org/10.48550/arXiv.2504.16839

Quantum News
