The Julia contributors have released Julia version 1.10, which includes a new parser written in Julia, package load time improvements, and improvements in stacktrace rendering. The update also includes parallel garbage collection, Tracy and Intel VTune ITTAPI profiling integration, an upgrade to LLVM 15, Linux AArch64 stability improvements, parallel native code generation for system images and package images, and parallel precompilation during loading time.
Julia 1.10: A Comprehensive Overview
Julia 1.10 has been released, following three betas and three release candidates. This article walks through the key features and improvements in this version: a new parser written in Julia, faster package loading, more readable stacktraces, parallel garbage collection, and integration with the Tracy and Intel VTune ITTAPI profilers. The release also includes an upgrade to LLVM 15, stability improvements for Linux AArch64, parallel native code generation for system images and package images, and measures to avoid races during parallel precompilation.
New Parser and Package Load Time Improvements
One of the significant changes in Julia 1.10 is a new default parser, JuliaSyntax.jl, which replaces the previous default parser written in Scheme. The new parser offers faster parsing, detailed syntax error messages, and more precise source code mapping. Error messages now pinpoint the exact location of a syntax problem in the source text, which makes mistakes easier to find than with the previous parser.
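As a rough illustration of the error reporting described above, the following sketch parses a deliberately malformed string and captures the message the parser produces. The helper name `syntax_error_message` is hypothetical, and the exact wording and layout of the message depend on the Julia version in use.

```julia
# Hypothetical helper: return the parser's error text for `src`,
# or `nothing` if the source parses cleanly.
function syntax_error_message(src::AbstractString)
    try
        Meta.parse(src)          # parse a single expression from the string
        return nothing
    catch err
        err isa Meta.ParseError || rethrow()
        # Render the error the same way the REPL would display it.
        return sprint(showerror, err)
    end
end

# "x = 1 2" has an extra token after a complete expression.
msg = syntax_error_message("x = 1 2")
println(msg)
```

On Julia 1.10 the printed message should point at the offending part of the input; on earlier versions the same call yields a shorter, less specific message.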
Julia 1.10 also significantly reduces package load times. This was achieved through a series of optimizations, including improvements to the type system, fewer invalidations that trigger unnecessary recompilation, the migration of packages from Requires.jl to package extensions, and numerous other performance upgrades. Together, these changes make large packages load more than twice as fast.
Stacktrace Rendering and Parallel Garbage Collection
Julia 1.10 makes stacktraces less verbose and easier to read. The changes include abbreviating type parameters with {…} when they would otherwise be excessively long, simplifying the display of keyword arguments in function calls, collapsing successive frames at the same location, and hiding internally generated methods.
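To see how a stacktrace is rendered on a given Julia version, one option is to capture the error display as a string. The sketch below is illustrative only: `inner`, `outer`, and `rendered_stacktrace` are made-up names, and which frames appear depends on the Julia version and on inlining.

```julia
# Two small functions so the trace has more than one frame;
# `outer` takes a keyword argument to exercise keyword-call display.
inner(x) = error("boom: $x")
outer(x; scale = 1) = inner(x * scale)

# Hypothetical helper: trigger the error and return the rendered
# message plus stacktrace as a String.
function rendered_stacktrace()
    try
        outer(2; scale = 3)
    catch err
        return sprint(showerror, err, catch_backtrace())
    end
end

print(rendered_stacktrace())
```

Running the same script on Julia 1.9 and 1.10 gives a direct comparison of the two renderings.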
The new version also introduces parallel garbage collection, which significantly reduces garbage collection time for multithreaded, allocation-heavy workloads. The number of garbage collection threads can be set with the --gcthreads command line option; by default it is half the number of compute threads.
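A minimal sketch of an allocation-heavy multithreaded workload, useful for observing garbage collection time under different thread settings. The function name and workload sizes are arbitrary assumptions; to compare settings, the script could be started as, for example, `julia --threads=4 --gcthreads=2 script.jl`.

```julia
# Hypothetical workload: each task allocates many short-lived arrays.
function allocate_in_parallel(ntasks::Integer, nalloc::Integer)
    tasks = map(1:ntasks) do _
        Threads.@spawn begin
            total = 0
            for _ in 1:nalloc
                # Allocate a temporary array that quickly becomes garbage.
                total += length(Vector{Int}(undef, 1024))
            end
            total
        end
    end
    return sum(fetch, tasks)
end

println("compute threads: ", Threads.nthreads())
# Guarded because this query is assumed to be unavailable before 1.10.
if isdefined(Threads, :ngcthreads)
    println("GC threads: ", Threads.ngcthreads())
end

# `@timed` reports, among other things, the seconds spent in GC.
stats = @timed allocate_in_parallel(4, 100_000)
println("GC time (s): ", stats.gctime)
```

The measured GC time varies by machine and run, so it is best compared across several runs with different `--gcthreads` values.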
Profiling Integration and LLVM Upgrade
Julia 1.10 adds integration with the Tracy profiler and Intel's VTune profiler. Both can now report notable runtime events such as compilation, major and minor garbage collections, invalidations, memory counters, and more. Profiling support is enabled when building Julia through make options (WITH_TRACY=1 for Tracy and WITH_ITTAPI=1 for VTune).
The Julia 1.10 release also upgrades to LLVM 15, which brings updated profiles for new processors and general modernizations. Noteworthy changes include the move to LLVM's new pass manager, which promises compilation time improvements, and better support for Float16 on x86.
Linux AArch64 Stability and Parallel Native Code Generation
With the upgrade to LLVM 15, Julia 1.10 is more stable on Linux AArch64. Julia now uses JITLink on AArch64 CPUs running Linux, which resolves the frequent segmentation faults that affected Julia on this platform.
The new version also introduces parallel native code generation for system images and package images. This speeds up the compilation of system images and large package images and therefore lowers their precompile times. The amount of parallelism can be controlled with the environment variable JULIA_IMAGE_THREADS=n.
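As a sketch of how the environment variable might be applied from Julia itself, the following builds a command for a child Julia process with JULIA_IMAGE_THREADS set. The helper name `precompile_cmd` and the choice to run `Pkg.precompile()` in the child are assumptions for illustration; the command is only constructed here, not executed.

```julia
# Hypothetical helper: build (but do not run) a command that precompiles
# the active environment with a given number of image-generation threads.
function precompile_cmd(nthreads::Integer)
    code = "using Pkg; Pkg.precompile()"
    cmd = `$(Base.julia_cmd()) -e $code`
    # `addenv` adds the variable on top of the inherited environment.
    return addenv(cmd, "JULIA_IMAGE_THREADS" => string(nthreads))
end

cmd = precompile_cmd(4)
println(cmd)
# run(cmd)  # uncomment to actually precompile in a child process
```

Setting the variable in the shell before starting Julia has the same effect and is usually simpler for one-off use.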
Avoiding Races During Parallel Precompilation
Julia 1.10 introduces a “pidfile” (process id file) locking mechanism that ensures only one Julia process at a time works to precompile a given cache file. This benefits both local users, who may be running multiple processes at once, and high-performance computing users, who may be running hundreds of workers against the same shared depot. The new version also precompiles in parallel when precompilation is triggered at load time, so packages that were not precompiled ahead of time are ready sooner.
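The general pidfile technique can be sketched with the `FileWatching.Pidfile` standard library module (assumed here to be available, as in Julia 1.9 and later). This is an illustration of the locking pattern only, not Julia's internal precompilation code; `build_once`, the file names, and the `stale_age` value are hypothetical.

```julia
using FileWatching.Pidfile: mkpidlock

# Hypothetical helper: create `cachefile` if it is missing, while holding
# a pidfile lock so concurrent processes do not build it at the same time.
function build_once(cachefile::AbstractString)
    mkpidlock(cachefile * ".pidfile"; stale_age = 10) do
        if !isfile(cachefile)
            # Stand-in for an expensive build step.
            write(cachefile, "compiled artifact")
            return :built
        end
        return :reused
    end
end

cachefile = joinpath(mktempdir(), "example.cache")
println(build_once(cachefile))
println(build_once(cachefile))
```

A second process calling `build_once` on the same path would wait for the lock and then find the file already present, which is the behavior the paragraph above describes for cache files.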
Key contributors include Claire Foster, Jameson Nash, Jeff Bezanson, Tim Holy, Kristoffer Carlsson, Diogo Correia Netto, Valentin Churavy, Cody Tapscott, Prem Chintalapudi, Gabriel Baraldi, Mose Giordano, and Ian Butterworth.
