Adaptively Regularized Tensor Methods Achieve Local Convergence on Locally Convex Functions

Tensor methods, optimisation techniques that use derivatives of increasingly high order to find solutions, promise significant gains in computational efficiency as their order increases. Karl Welzel, Yang Liu, and Raphael A. Hauser, all from the Mathematical Institute at the University of Oxford, alongside Coralia Cartis, now demonstrate local convergence for adaptively regularised tensor methods, a crucial step towards realising their full potential. This work extends existing theory to functions with more general properties and, importantly, to methods that do not require prior knowledge of key parameters, establishing the first local convergence rates for adaptive regularisation. The team reveals surprising challenges arising from the use of non-convex local models and highlights how the choice of minimiser within the algorithm critically affects convergence speed, confirming superlinear convergence for certain degenerate problems and establishing limits on the achievable convergence rates.

Higher-Order Regularization Convergence Properties Explained

This research investigates the local convergence of higher-order regularization methods, techniques used to solve unconstrained optimization problems. Scientists aimed to understand when and how these methods converge, and what factors influence their performance. A central focus is the importance of selecting the right local model minimizer during the regularization process. The authors demonstrate that convergence results previously established for convex functions also hold for a broader class of functions exhibiting local uniform convexity. Furthermore, they strengthen existing results, showing that during successful iterations these methods achieve convergence of order p/(q−1), where p is the order of the method and q the degree of local uniform convexity.

Importantly, they prove this is the best possible rate under certain conditions. The team identifies two important characteristics of a suitable minimizer: a persistent minimizer, which remains stable as the regularization parameter increases, and an asymptotically successful minimizer, which leads to a successful iteration according to specific criteria. They also demonstrate that choosing the wrong minimizer inevitably leads to oscillations in the regularization parameter. The study defines key concepts such as regularization methods, local uniform convexity, and the order of convergence, providing a rigorous analysis of the local convergence behavior of higher-order regularization methods and emphasizing the critical role of selecting the right local model minimizer.
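
For readers unfamiliar with these terms, the two standard notions behind them can be sketched as follows (this is only an illustrative summary; the paper's formal definitions may differ in constants and neighbourhoods):

```latex
% Local uniform convexity of degree q near a minimizer x*: the function grows
% at least like the q-th power of the distance to x*.
\[
  f(x) - f(x^*) \;\ge\; c\,\|x - x^*\|^{q}
  \qquad \text{for all } x \text{ near } x^*, \text{ for some } c > 0 .
\]
% Convergence of order r: superlinear when r > 1, quadratic when r = 2.
\[
  \|x_{k+1} - x^*\| \;\le\; C\,\|x_k - x^*\|^{r}
  \qquad \text{for some constant } C > 0 .
\]
```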

Adaptive Tensor Methods Achieve Sharp Convergence Rates

Scientists have developed a rigorous analytical framework for adaptive tensor methods, a class of optimization techniques that leverage high-order derivatives to efficiently find minima of complex functions. Researchers extended existing local convergence results to functions exhibiting local uniform convexity, and crucially, to fully adaptive methods that do not require prior knowledge of a function’s Lipschitz constant. This advancement provides the first sharp local convergence rates for adaptive tensor methods, denoted as ARp. The core of the methodology involves constructing a local model at each iteration, approximating the objective function with its pth-order Taylor expansion plus a regularization term.

This model is then minimized to generate a potential new iterate, and the success of the iteration, determined by sufficient decrease in the function value, dictates whether the new iterate is accepted or rejected. A key innovation lies in the adaptive adjustment of the regularization parameter, which is decreased upon successful iterations and increased otherwise, allowing the method to dynamically refine its search. Scientists analyzed the behavior of the ARp method under conditions of local uniform convexity, a property ensuring the function grows at least like ∥x − x∗∥^q around any stationary point x∗. To establish convergence rates, the team rigorously analyzed the impact of using the global minimizer of the local model at each step. They demonstrated that for p greater than q − 1, the method achieves local convergence of order p/(q − 1), meaning both the function value and gradient norm converge superlinearly. This work builds upon previous analyses, extending their results to more general functions and adaptive regularization schemes.
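
The accept/reject mechanics and the regularization-parameter update described above can be illustrated with a minimal toy sketch. The code below is not the authors' implementation: it fixes p = 2 (a cubic-regularization model), uses the assumed test function f(x) = x^4 and arbitrary parameter values (η, the growth factor γ, and the bracketing interval for the subproblem), and solves the one-dimensional model subproblem with a generic bounded minimizer. Note that for this particular function q = 4, so p = 2 < q − 1 and the fast local rates discussed elsewhere in the article do not apply; the sketch only shows the adaptive loop itself.

```python
# Minimal sketch of ARp-style adaptively regularized iterations for p = 2
# (cubic regularization). Illustrative only: the objective f(x) = x**4 and all
# parameter values (eta, gamma, the search interval) are assumptions, not
# taken from the paper.
from scipy.optimize import minimize_scalar

f   = lambda x: x**4        # objective; locally uniformly convex of degree q = 4
df  = lambda x: 4 * x**3    # first derivative
d2f = lambda x: 12 * x**2   # second derivative

def ar2_step(x, sigma, eta=0.1, gamma=2.0):
    """Build the local model, minimize it, and test the step for success."""
    # Local model: 2nd-order Taylor expansion of f at x plus a cubic
    # regularization term weighted by the current parameter sigma.
    model = lambda s: f(x) + df(x) * s + 0.5 * d2f(x) * s**2 + (sigma / 3.0) * abs(s)**3
    s = minimize_scalar(model, bounds=(-1.0, 1.0), method="bounded").x
    predicted = f(x) - model(s)      # decrease promised by the model
    actual = f(x) - f(x + s)         # decrease actually achieved
    if predicted > 0 and actual >= eta * predicted:
        return x + s, sigma / gamma  # successful iteration: accept, shrink sigma
    return x, sigma * gamma          # unsuccessful: reject, grow sigma

x, sigma = 1.0, 1.0
for k in range(30):
    x, sigma = ar2_step(x, sigma)
    print(f"iter {k:2d}: x = {x: .3e}, |f'(x)| = {abs(df(x)):.3e}, sigma = {sigma:.2e}")
```

In the adaptive-regularization literature the model subproblem typically only needs to be solved approximately, subject to suitable optimality conditions; the generic exact solver here is used purely to keep the sketch short.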

Optimal Convergence of Higher-Order Tensor Methods

Scientists have achieved significant advances in higher-order optimization methods, specifically through the development of tensor methods that utilize derivatives up to a specified order. These methods, known as ARp, construct a local model at each iteration by combining a pth-order Taylor expansion with a regularization term, then minimize this model to find a potential new solution. The research team demonstrated that the ARp method possesses optimal global complexity, requiring at most O(ε^(−(p+1)/p)) iterations to find an approximate stationary point with a gradient norm bounded by ε, and this bound improves as the order, p, increases. The study extends previous local convergence results to encompass both locally uniformly convex functions and fully adaptive methods, eliminating the need for prior knowledge of a Lipschitz constant.
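
In symbols, the ARp construction described above is commonly written as follows (this is the standard form from the adaptive-regularization literature; the paper's exact normalization of the regularization term may differ):

```latex
% p-th order Taylor model of f at the current iterate x_k, plus a regularization
% term weighted by the adaptive parameter sigma_k:
\[
  m_k(s) \;=\; f(x_k) \;+\; \sum_{j=1}^{p} \frac{1}{j!}\,\nabla^j f(x_k)[s]^j
  \;+\; \frac{\sigma_k}{p+1}\,\|s\|^{p+1},
\]
% and the quoted worst-case complexity bound for reaching a small gradient norm:
\[
  \|\nabla f(x_k)\| \le \varepsilon
  \quad\text{within}\quad
  \mathcal{O}\!\left(\varepsilon^{-(p+1)/p}\right) \text{ iterations.}
\]
```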

Researchers obtained a crucial result demonstrating that during successful iterations, gradient norms converge with order p/(q−1) when p is greater than q−1, where q represents the degree of uniform convexity. Experiments confirm that the adaptive regularization parameter within the ARp method allows for superlinear convergence for certain degenerate problems, provided that the order, p, is sufficiently large. The team established sharp bounds on the order of convergence, revealing that the method can achieve a convergence rate of p/(q−1) under specific conditions. For comparison, Newton’s method achieves quadratic convergence for uniformly convex functions of order q = 2, while the ARp method, with access to higher derivatives, can achieve pth-order convergence, improving the coefficient of the leading-order term in the iteration count. This improvement is even more pronounced when the Hessian at the solution is singular, where Newton’s method converges only linearly, but ARp maintains superlinear convergence if p is greater than q − 1.
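
The degenerate case mentioned above can be made concrete with a small check (this example is mine, not the paper's): for f(x) = x^4 the Hessian vanishes at the minimizer x* = 0 and the function is uniformly convex of degree q = 4, so Newton's method only contracts linearly, by a constant factor of 2/3 per step, while the quoted theory gives an ARp method with p > q − 1 = 3 local convergence of order p/3.

```python
# Newton's method on f(x) = x**4, whose Hessian is singular at the minimizer
# x* = 0: each step is x - f'(x)/f''(x) = x - x/3 = (2/3) x, so the error
# shrinks only by a constant factor per iteration (linear convergence).
x = 1.0
for k in range(10):
    grad, hess = 4 * x**3, 12 * x**2   # f'(x), f''(x)
    x_next = x - grad / hess           # Newton step
    print(f"iter {k}: x = {x_next:.6e}, contraction factor = {x_next / x:.4f}")
    x = x_next
```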

Adaptive Tensor Methods Achieve Local Convergence

This research extends the understanding of tensor methods, a class of optimization algorithms that utilize higher-order derivatives to efficiently find solutions to complex problems. Scientists have demonstrated local convergence of these methods, even when applied to functions with less restrictive properties than previously understood, and for fully adaptive methods that do not require prior knowledge of a key parameter, the Lipschitz constant. This work establishes the first local convergence rates for adaptive tensor methods. The team investigated the behavior of these algorithms when dealing with non-convex problems and situations where the local model used for optimization does not have a unique solution.

Results indicate that the choice of local model minimizer is critical; using the global minimizer does not guarantee consistent improvement at every step. However, when the appropriate local model minimizer is selected, the higher-order convergence properties are maintained, and the methods can achieve superlinear convergence for certain challenging problems, provided the order of the method is sufficiently high. The researchers also provide bounds on the expected convergence rate in these scenarios. They highlight the need for continued research to explore the behavior of these methods further.

👉 More information
🗞 Local Convergence of Adaptively Regularized Tensor Methods
🧠 ArXiv: https://arxiv.org/abs/2510.25643

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.

Latest Posts by Rohail T.:

Drive-Jepa Achieves Multimodal Driving with Video Pretraining and Single Trajectories
February 1, 2026

Leviathan Achieves Superior Language Model Capacity with Sub-Billion Parameters
February 1, 2026

Geonorm Achieves Consistent Performance Gains over Existing Normalization Methods in Models
February 1, 2026