AGNBoost, a machine learning framework built on XGBoostLSS, identifies active galactic nuclei (AGN) and estimates redshifts from NIRCam and MIRI photometry. For redshift estimation it reaches an outlier fraction of 7.4% on observed data from the MEGA survey and a normalised median absolute deviation of 0.011 on simulated galaxies.
Identifying active galactic nuclei (AGN) – supermassive black holes actively accreting material at the centres of galaxies – is crucial for understanding galaxy evolution and the broader cosmic landscape. Distinguishing AGN from other sources, however, presents a significant observational challenge. Researchers are now applying machine learning techniques to this problem, drawing on the capabilities of the James Webb Space Telescope (JWST). A collaborative team, comprising Kurt Hamblin, Allison Kirkpatrick, Bren Backhaus, Gregory Troiani, and colleagues from institutions including the University of Kansas, the Center for Astrophysics | Harvard & Smithsonian, and various departments of Physics, detail their new framework, ‘AGNBoost’, in a forthcoming publication. AGNBoost uses the XGBoostLSS algorithm to analyse photometric data from JWST’s Near-Infrared Camera (NIRCam) and Mid-Infrared Instrument (MIRI), simultaneously predicting the AGN contribution to mid-infrared emission and estimating redshift.
AGNBoost: A Machine Learning Framework for Active Galactic Nuclei Identification
AGNBoost is a machine learning framework designed to identify active galactic nuclei (AGN) and estimate their redshifts using near- and mid-infrared photometry from JWST. The system constructs 121 input features from seven NIRCam and four MIRI bands, incorporating magnitudes, colour indices, and squared colour terms, and uses them to predict two quantities: the fraction of mid-infrared emission originating from an AGN power law, and the photometric redshift. It is built on XGBoostLSS, an extension of gradient-boosted trees that predicts the parameters of a probability distribution rather than a single value, so each estimate comes with an uncertainty.
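The count of 121 features is consistent with one simple construction: 11 magnitudes, the 55 pairwise colours between those bands, and the 55 squares of those colours. The sketch below illustrates that arithmetic only. It is not taken from the AGNBoost code, the band labels are hypothetical placeholders, and the ordering of features is an assumption.

```python
from itertools import combinations

# Hypothetical placeholder labels: seven NIRCam bands followed by four MIRI bands.
# The real filter names are not given in this article.
BANDS = [f"nircam_{i}" for i in range(1, 8)] + [f"miri_{i}" for i in range(1, 5)]


def build_features(mags):
    """Turn one source's 11 magnitudes into a dict of 121 named features."""
    if len(mags) != len(BANDS):
        raise ValueError(f"expected {len(BANDS)} magnitudes, got {len(mags)}")

    # 11 magnitudes, keyed by band label.
    features = dict(zip(BANDS, mags))

    # Every unordered pair of bands gives one colour and one squared colour.
    for (name_a, mag_a), (name_b, mag_b) in combinations(zip(BANDS, mags), 2):
        colour = mag_a - mag_b
        features[f"{name_a}-{name_b}"] = colour           # 55 colours in total
        features[f"({name_a}-{name_b})^2"] = colour ** 2  # 55 squared colours
    return features


# Made-up magnitudes, purely to exercise the function.
example = build_features([25.0 - 0.1 * k for k in range(11)])
print(len(example))
```

With 11 bands there are 11 × 10 / 2 = 55 distinct band pairs, so the total is 11 + 55 + 55 = 121.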
The framework trains its models on simulated galaxies generated by CIGALE, a code that models the spectral energy distributions of galaxies. Because the AGN fraction and redshift of each simulated galaxy are known, they serve as ground truth for training and validation. Performance is evaluated on both withheld CIGALE simulations and 698 observations from the MIRI EGS Galaxy and AGN (MEGA) survey, which tests how well the models generalise from simulated to real data.
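Withholding part of a simulated catalogue for testing can be sketched as a reproducible shuffle followed by a split. This is a generic illustration, not the procedure used by the AGNBoost authors. The function name, the 20% test fraction, and the seed are all assumptions.

```python
import random


def withhold_split(rows, test_fraction=0.2, seed=0):
    """Shuffle row indices reproducibly and return (train_rows, test_rows)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must lie strictly between 0 and 1")

    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable

    n_test = max(1, round(len(rows) * test_fraction))
    test_idx = set(indices[:n_test])

    # Rows keep their original relative order within each subset.
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


# Stand-in catalogue: each row would hold features plus the known labels.
catalogue = list(range(10))
train_rows, test_rows = withhold_split(catalogue, test_fraction=0.2, seed=1)
```

The withheld rows are never shown to the model during training, so comparing predictions with their known labels gives an honest estimate of performance on unseen simulated galaxies.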
On the MEGA dataset, AGNBoost achieves outlier fractions of 0.11 for AGN identification and 0.074 for redshift estimation. On the mock galaxies, the researchers report a root mean square error of 0.11 for the AGN fraction and a normalised median absolute deviation of 0.011 for redshift.
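The three statistics quoted here have conventional definitions in the photometric-redshift literature, sketched below. The scatter statistic is the normalised median absolute deviation, usually written σ_NMAD. The outlier threshold of 0.15 in |Δz| / (1 + z) is a common convention and an assumption here, and the article does not state which threshold or exact formulae the authors used.

```python
import math
from statistics import median


def rmse(pred, true):
    """Root mean square error between predictions and ground truth."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def sigma_nmad(z_pred, z_true):
    """Normalised median absolute deviation of the redshift residuals."""
    dz = [p - t for p, t in zip(z_pred, z_true)]
    centre = median(dz)
    scaled = [abs(d - centre) / (1.0 + t) for d, t in zip(dz, z_true)]
    # 1.4826 makes the statistic match the standard deviation for Gaussian errors.
    return 1.4826 * median(scaled)


def outlier_fraction(z_pred, z_true, threshold=0.15):
    """Fraction of sources with |dz| / (1 + z_true) above the threshold (assumed 0.15)."""
    flags = [abs(p - t) / (1.0 + t) > threshold for p, t in zip(z_pred, z_true)]
    return sum(flags) / len(flags)


# Made-up redshifts, purely to exercise the functions.
z_true = [1.0, 2.0, 0.5, 3.0]
z_pred = [1.02, 1.98, 1.5, 3.0]
print(outlier_fraction(z_pred, z_true), sigma_nmad(z_pred, z_true))
```

Because σ_NMAD is built from medians, a handful of catastrophic failures barely moves it, which is why it is normally reported alongside the outlier fraction and not in place of it.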
The modular design of AGNBoost facilitates the easy integration of additional photometric bands and derived parameters, enabling adaptation for the prediction of other variables and expanding its scientific utility. Scientists can readily incorporate new data sources and refine the model’s performance, ensuring its continued relevance as astronomical observations evolve. Its computational efficiency positions it as a suitable tool for analysing the vast datasets generated by modern astronomical instruments, such as the James Webb Space Telescope and the Vera C. Rubin Observatory.
Researchers plan to integrate AGNBoost into existing astronomical data pipelines, making it readily accessible to the broader astronomical community. This will streamline the process of AGN identification and characterisation, allowing astronomers to focus on interpreting the results and addressing new scientific questions. The framework’s user-friendly interface and comprehensive documentation will further enhance its accessibility and usability.
Future work will focus on applying AGNBoost to larger and more diverse datasets, providing a more comprehensive assessment of its capabilities and limitations. Researchers are also exploring whether AGNBoost can be used to identify and characterise specific classes of AGN, such as quasars, blazars, and Seyfert galaxies.
The development of AGNBoost was supported by a grant from the National Science Foundation, highlighting the importance of investing in innovative research that advances our understanding of the universe. The project brought together a team of experts in machine learning, astronomy, and data science, demonstrating the power of interdisciplinary collaboration.
The framework’s success illustrates how machine learning can help astronomers tackle complex scientific problems, and AGNBoost offers a model for future projects that combine advanced algorithms with cutting-edge astronomical observations.
👉 More information
🗞 AGNBoost: A Machine Learning Approach to AGN Identification with JWST/NIRCam+MIRI Colors and Photometry
🧠 DOI: https://doi.org/10.48550/arXiv.2506.03130
