Determining the distances to galaxies is fundamental to understanding the universe’s evolution, and scientists increasingly rely on photometric redshift estimation to map the cosmos. Grant Merz, Ming-Yang Zhuang, and colleagues from the University of Illinois at Urbana-Champaign, alongside Qian Yang from the Center for Astrophysics | Harvard and Smithsonian, now present a novel approach to this challenge using a deep learning algorithm called DeepDISC. The team demonstrates that DeepDISC accurately estimates distances from images obtained by the James Webb Space Telescope, achieving performance comparable to, and in some cases exceeding, traditional template fitting methods. This advancement is significant because DeepDISC operates directly on image pixels, bypassing the need for pre-calculated photometric measurements and enabling rapid processing of vast astronomical datasets: the team generated a catalog of 94,000 distance estimates in just four minutes using a single graphics processing unit. This work, which includes contributions from Junyao Li, Yue Shen, and Xin Liu, paves the way for efficient analysis of the ever-growing volumes of data from JWST and future surveys.
Deep Learning for JADES Galaxy Redshifts
This research explores the use of deep learning to estimate the distances to galaxies from images, comparing two network architectures: ResNet50, a standard convolutional neural network, and MViTv2, a more recent vision transformer built on attention mechanisms. Researchers investigate whether pre-training the networks on related datasets improves their performance when analyzing data from the JADES survey. The study uses Mutual Information to quantify the relationship between a network’s learned features and the true redshift of galaxies, assessing how effectively each network extracts relevant information from the images.
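To make the Mutual Information comparison concrete, here is a minimal sketch of how one might score a backbone’s learned features against spectroscopic redshifts. It uses scikit-learn’s mutual_info_regression as a stand-in estimator; the variable names (penultimate-layer features, z_spec) and the averaging over feature dimensions are illustrative assumptions, not the paper’s actual pipeline.

```python
# Sketch (not the authors' code): quantify how much information a network's
# learned features carry about redshift, using scikit-learn's
# mutual_info_regression as a stand-in for the paper's Mutual Information metric.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def feature_redshift_mi(features: np.ndarray, redshifts: np.ndarray) -> float:
    """Average mutual information (in nats) between feature dimensions and redshift.

    features  : (N, D) array of penultimate-layer activations, one row per galaxy
    redshifts : (N,)   array of true (spectroscopic) redshifts
    """
    mi_per_dim = mutual_info_regression(features, redshifts, random_state=0)
    return float(mi_per_dim.mean())

# Hypothetical usage: compare a ResNet50 backbone against an MViTv2 backbone.
# resnet_feats and mvit_feats would come from a forward pass over the test images.
# print(feature_redshift_mi(resnet_feats, z_spec))
# print(feature_redshift_mi(mvit_feats, z_spec))
```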
The networks are initially pre-trained on datasets such as ImageNet and a galaxy morphology dataset before being fine-tuned on the JADES data. Researchers employ UMAP to visualize the complex information learned by the networks, qualitatively assessing whether the features correlate with redshift. Mutual Information serves as the primary quantitative metric, with higher values indicating a stronger relationship between the network’s features and a galaxy’s redshift. Results show that most models, except those pre-trained on ImageNet, exhibit structure in their feature space that correlates with redshift, suggesting they are learning relevant features.
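A rough sketch of the UMAP-based qualitative check described above is shown below, assuming features have already been extracted from a trained backbone. The UMAP parameters and plotting details are illustrative defaults, not the authors’ settings.

```python
# Sketch (assumed workflow): project learned features to 2D with UMAP and color
# points by redshift to check, qualitatively, whether the feature space is
# organized by redshift.
import matplotlib.pyplot as plt
import umap  # pip install umap-learn

def plot_feature_umap(features, redshifts, title="Feature space colored by redshift"):
    reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, random_state=42)
    embedding = reducer.fit_transform(features)  # (N, 2) low-dimensional embedding
    sc = plt.scatter(embedding[:, 0], embedding[:, 1], c=redshifts, s=3, cmap="viridis")
    plt.colorbar(sc, label="redshift z")
    plt.xlabel("UMAP 1")
    plt.ylabel("UMAP 2")
    plt.title(title)
    plt.show()
```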
Pre-training generally does not significantly improve MViTv2 models, while the ResNet50 model benefits from pre-training on the galaxy morphology dataset, gaining a noticeable increase in mutual information. MViTv2 models, while capable of generalizing to new datasets, do not extract as much useful information for redshift estimation as a well-pretrained ResNet50. The relatively small size of the JADES training set likely limits the performance of transformer models, which typically require larger datasets to learn effectively. The authors suggest that the weaker inductive biases of transformers, compared to the convolutional biases of CNNs, make them more reliant on large datasets.
While transformers are powerful, they may struggle to extract the most relevant information from small datasets. The success of pre-training ResNet50 on the galaxy morphology dataset highlights the importance of giving the network prior knowledge, especially when data are limited. The DeepDISC approach itself bypasses the traditional need to measure object brightness through filters, instead analyzing pixel-level data to determine redshifts, a crucial indicator of distance. The team trained and validated the DeepDISC model using both simulated data from the JAGUAR catalogs and real observations from the JADES program in the GOODS-S field, benefiting from a wealth of spectroscopically confirmed redshifts. Results demonstrate that DeepDISC achieves accuracy comparable to traditional methods when using the same filters, and even outperforms them in certain scenarios.
The team compiled a catalog of 94,000 redshift estimates in approximately 4 minutes on a single A40 GPU, showcasing the method’s computational efficiency. To facilitate training and testing, the team partitioned the imaging data into sub-images, each annotated with object locations, segmentation maps, bounding boxes, and redshifts (a schematic record format is sketched below). The research highlights the potential of this image-based approach for analyzing increasingly large volumes of data from JWST and future missions, offering a powerful new tool for understanding the early universe. Researchers have demonstrated that this holistic neural network framework accurately estimates redshifts up to approximately z=8 by directly analyzing images to detect, deblend, and classify sources. When provided with comparable filter sets, DeepDISC outperforms established template fitting methods such as EAZY, achieving lower scatter and fewer outliers in its redshift estimates.
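As a schematic of the annotated sub-images mentioned above, the sketch below bundles an image cutout with the listed pieces: bounding boxes, segmentation maps, and redshifts. The class and field names are hypothetical illustrations, not the paper’s data format.

```python
# Illustrative sketch (assumptions, not the paper's format): one training record
# per image cutout, carrying the annotations the text lists.
from dataclasses import dataclass, field
from typing import List, Tuple
import numpy as np

@dataclass
class SourceAnnotation:
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixel coordinates
    segmentation: np.ndarray                 # boolean mask for this source in the cutout
    redshift: float                          # spectroscopic redshift used as ground truth

@dataclass
class CutoutRecord:
    image: np.ndarray                        # (H, W, n_filters) NIRCam pixel data
    sources: List[SourceAnnotation] = field(default_factory=list)
```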
Deep Learning Estimates Redshift From Raw Pixels
This study pioneers a new method for estimating the distances to galaxies using a deep learning technique called DeepDISC (Detection, Instance Segmentation, and Classification with deep learning). Researchers directly analyze raw pixel data from NIRCam images obtained by the JWST Advanced Deep Extragalactic Survey (JADES) program, bypassing traditional methods that rely on pre-measured photometry. DeepDISC first detects and separates individual sources within the images, then classifies each source to estimate its redshift and associated probability, delivering a full redshift probability density function for each object and enabling rigorous uncertainty quantification. To train and validate DeepDISC, scientists compiled a catalog of spectroscopic redshifts, serving as ground truth for the machine learning algorithm.
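One common way to obtain a full redshift probability density function from a classification head, as described above, is to predict scores over discrete redshift bins and normalize them. The sketch below illustrates that idea in PyTorch; the bin grid and point-estimate choices are assumptions for illustration, not the paper’s exact implementation.

```python
# Sketch (assumed mechanics): turn a per-source classification head's logits
# over discrete redshift bins into a probability density function and point estimates.
import torch

def redshift_pdf(logits: torch.Tensor, z_grid: torch.Tensor):
    """logits : (n_bins,) raw scores from the redshift head for one source
       z_grid : (n_bins,) uniformly spaced bin centers spanning, e.g., 0 < z < 8"""
    probs = torch.softmax(logits, dim=0)    # discrete probability per bin
    dz = z_grid[1] - z_grid[0]              # bin width (uniform grid assumed)
    pdf = probs / dz                        # normalize so that sum(pdf) * dz = 1
    z_mean = torch.sum(probs * z_grid)      # mean of the PDF as a point estimate
    z_mode = z_grid[torch.argmax(probs)]    # mode as an alternative point estimate
    return pdf, z_mean, z_mode

# Hypothetical usage with 300 bins between z = 0 and z = 8:
z_grid = torch.linspace(0.0, 8.0, 300)
logits = torch.randn(300)                   # stand-in for real network output
pdf, z_mean, z_mode = redshift_pdf(logits, z_grid)
```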
The team then implemented DeepDISC on NIRCam-only images, demonstrating its ability to produce reliable redshift estimates and uncertainties comparable to those achieved with established template fitting methods using both HST and JWST filters. Notably, DeepDISC outperformed template fitting when tested with matched input filters, achieving lower scatter and fewer outlier redshift estimates. The system generated a catalog of 94,000 redshift estimates in just 4 minutes on a single NVIDIA A40 GPU, showcasing its efficiency. The study meticulously assessed the impact of training data quality on the accuracy of redshift estimates.
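The scatter and outlier comparisons above are conventionally quantified with the normalized median absolute deviation and an outlier fraction. The sketch below uses standard definitions from the photometric-redshift literature; the 0.15 outlier threshold is a common convention and may differ from the paper’s exact choice.

```python
# Sketch of common photo-z quality metrics (standard literature definitions;
# the paper's exact conventions may differ).
import numpy as np

def photoz_metrics(z_phot, z_spec, outlier_threshold=0.15):
    """Return (sigma_NMAD, outlier_fraction) for photometric vs. spectroscopic redshifts."""
    dz = (np.asarray(z_phot) - np.asarray(z_spec)) / (1.0 + np.asarray(z_spec))
    # Normalized median absolute deviation: a robust estimate of the scatter
    sigma_nmad = 1.48 * np.median(np.abs(dz - np.median(dz)))
    # Outlier fraction: sources whose scaled residual exceeds the threshold
    outlier_fraction = float(np.mean(np.abs(dz) > outlier_threshold))
    return sigma_nmad, outlier_fraction
```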
Researchers acknowledge the current limitations of spectroscopic training sets, which are relatively small and incomplete. Despite these limitations, the work demonstrates the potential of DeepDISC to handle increasingly large image volumes and to benefit from expanding spectroscopic samples from ongoing and future programs. The team also produced a comprehensive catalog of redshift estimates for all JADES DR2 photometric sources in the GOODS-S field, including quality flags that indicate potential caveats and uncertainties.
👉 More information
🗞 Photometric Redshifts in JWST Deep Fields: A Pixel-Based Alternative with DeepDISC
🧠 ArXiv: https://arxiv.org/abs/2510.27032
