Limited data remains a persistent obstacle to accurate classification of archaeological artefacts, particularly rare Chinese porcelain. Ziyao Ling, Silvia Mirri, and Paola Salomoni, from the Department of Computer Science and Engineering at the University of Bologna, alongside Giovanni Delnevo et al., demonstrate an approach that uses synthetic images generated with Stable Diffusion to bolster existing datasets. Their study tests whether these artificially created images can improve multi-task CNN performance in identifying porcelain dynasty, glaze, kiln, and type, and reports a 5.5% F1-macro increase in type classification with a 90:10 real-synthetic data mix. The work offers practical guidance for integrating generative models into archaeological research, balancing the need for data diversity against the requirement of maintaining archaeological authenticity.
Synthetic Data Improves Porcelain Classification Performance
The researchers show that synthetic data augmentation can meaningfully improve deep learning performance in Chinese porcelain analysis, though the gains vary by task. Type classification achieved the largest improvement, with a 5.5% increase in F1-macro score at a 90:10 real-to-synthetic data ratio, while dynasty and kiln identification saw more modest gains of 3–4%. These results indicate that the effectiveness of synthetic augmentation depends strongly on how well the generated images align with task-relevant visual features. This is particularly important for Chinese porcelain, a key component of cultural heritage whose authentication still relies heavily on expert connoisseurship, with scientific techniques playing only a supplementary role. Current practices face major challenges, including the subjectivity and time intensity of manual assessment, as well as the high cost and potential destructiveness of advanced analytical methods.
Deep learning, especially convolutional neural networks, has demonstrated strong potential for supporting porcelain authentication by learning visual cues such as decorative motifs, form, glaze, and historical period. However, progress is severely constrained by limited data availability. Existing datasets are small by deep learning standards, leading to overfitting and poor generalisation, especially given the high intra-class variability of porcelain artefacts. Traditional data augmentation methods, such as rotation or colour jittering, merely transform existing pixels and fail to introduce new, meaningful variations. This limitation is critical in archaeology, where a single porcelain type can exhibit substantial differences in glaze texture, crackle patterns, colour gradients, and surface finish that cannot be captured through simple geometric or photometric transformations.
To address these challenges, researchers are increasingly turning to generative models capable of synthesising new, photorealistic images. Recent advances in diffusion models have enabled the creation of high-quality synthetic images that preserve archaeological plausibility while introducing controlled diversity. In this study, diffusion-based image generation was applied to augment a multi-task CNN for Chinese porcelain classification. Using Stable Diffusion with LoRA and carefully engineered prompts grounded in archaeological documentation, models trained on mixed real–synthetic datasets (95:5 and 90:10 ratios) consistently outperformed those trained on real data alone. Task-level analysis revealed that type classification benefited the most from synthetic augmentation, with improvements of up to 5.5%, demonstrating the importance of task-specific alignment between generated imagery and visual discriminators.
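As a rough illustration of what a 95:5 or 90:10 mix means in practice, the sketch below computes how many synthetic images to add to a fixed set of real images and builds the combined training list. It assumes the ratio describes the final composition of the training set; the article does not spell this out. The function names, the dataset sizes, and the tuple-based sample format are hypothetical.

```python
import random


def synthetic_count(n_real, real_part, syn_part):
    """Synthetic images needed so that real:synthetic == real_part:syn_part."""
    if real_part <= 0 or syn_part < 0:
        raise ValueError("real_part must be positive and syn_part non-negative")
    return round(n_real * syn_part / real_part)


def mix_datasets(real, synthetic, real_part=90, syn_part=10, seed=0):
    """Keep every real sample and add a random subset of synthetic ones."""
    n_syn = synthetic_count(len(real), real_part, syn_part)
    if n_syn > len(synthetic):
        raise ValueError("not enough synthetic images for the requested ratio")
    rng = random.Random(seed)             # fixed seed for a reproducible mix
    chosen = rng.sample(synthetic, n_syn)  # sample without replacement
    mixed = list(real) + chosen
    rng.shuffle(mixed)
    return mixed


# Hypothetical sizes: 900 real images and a pool of 300 generated ones.
real = [("real", i) for i in range(900)]
synthetic = [("synthetic", i) for i in range(300)]
mixed = mix_datasets(real, synthetic, real_part=90, syn_part=10)
```

Keeping all real images and varying only the synthetic share makes the real-only baseline and the mixed runs directly comparable, which is one plausible way to set up the controlled comparison described here.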
The study further highlights the broader relevance of generative AI for cultural heritage. While earlier GAN-based approaches struggled with authenticity, diffusion models offer improved fidelity and controllability, making them well suited for heritage applications where realism and historical accuracy are essential. Nevertheless, archaeological porcelains pose unique challenges due to strict constraints on form, glaze, decoration, and historical context. The findings underscore the necessity of embedding domain knowledge into the generation process through structured prompts and evaluation frameworks. Overall, this work demonstrates that diffusion-based synthetic augmentation is a viable and effective strategy for overcoming data scarcity in porcelain research, contributing both to advances in computer vision and to the sustainable digital preservation of cultural heritage.
LoRA Augmentation Boosts Porcelain Classification Accuracy
The researchers achieved a 5.5% increase in F1-macro for type classification by augmenting limited real datasets with synthetic images generated using Stable Diffusion with Low-Rank Adaptation (LoRA). The study focused on multi-task CNN-based porcelain classification, addressing the scarcity of training data in archaeological applications. The team ran controlled experiments using MobileNetV3 with transfer learning, comparing models trained on real data alone with models trained on mixed real-synthetic datasets at ratios of 95:5 and 90:10. Performance was evaluated across four classification tasks: dynasty, glaze, kiln, and type identification.
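In a multi-task classifier, a shared backbone typically feeds one output head per task, and the per-task losses are combined into a single training objective. The pure-Python sketch below illustrates that idea as a weighted sum of cross-entropy losses over the four tasks. The equal default weights, the logit values, and the class counts are assumptions for illustration; the article does not describe how the study combines its task losses.

```python
import math

# The four tasks named in the article.
TASKS = ("dynasty", "glaze", "kiln", "type")


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target_index])


def multi_task_loss(logits_per_task, targets, weights=None):
    """Weighted sum of per-task cross-entropies (equal weights assumed by default)."""
    weights = weights or {task: 1.0 for task in TASKS}
    return sum(
        weights[task] * cross_entropy(logits_per_task[task], targets[task])
        for task in TASKS
    )


# Hypothetical head outputs for a single image; class counts are made up.
logits = {
    "dynasty": [2.0, 0.5, 0.1],
    "glaze": [0.2, 1.5],
    "kiln": [0.0, 0.0, 0.0, 0.0],
    "type": [1.0, 3.0],
}
targets = {"dynasty": 0, "glaze": 1, "kiln": 2, "type": 1}
loss = multi_task_loss(logits, targets)
```

Because every head contributes to one scalar loss, gradients from all four tasks update the shared features, which is why augmentation that helps one task can have a different effect on another.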
The results show task-specific improvements, with type classification benefiting most from synthetic augmentation. The 90:10 real-synthetic ratio yielded the 5.5% F1-macro increase, indicating a substantial improvement in the model’s ability to categorise porcelain types. Dynasty and kiln tasks showed more modest gains of 3–4%, suggesting that the effectiveness of synthetic augmentation depends on the alignment between generated features and task-relevant visual signatures. Performance was measured with the F1-macro metric, which computes the F1 score (the harmonic mean of precision and recall) separately for each class and then averages the per-class scores with equal weight, so rare classes count as much as common ones.
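To make the F1-macro metric concrete, here is a minimal from-scratch sketch: it computes F1 for each class from true positives, false positives, and false negatives, then takes the unweighted mean. The class labels and predictions are invented for illustration and are not taken from the study.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: four bowls, two vases, one vase misclassified.
y_true = ["bowl", "bowl", "bowl", "bowl", "vase", "vase"]
y_pred = ["bowl", "bowl", "bowl", "bowl", "bowl", "vase"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro = f1_macro(y_true, y_pred)
print(f"accuracy={accuracy:.3f}  f1_macro={macro:.3f}")
```

In this toy case the macro score is lower than plain accuracy because the single error falls on the rarer class, which is the kind of effect that matters when some porcelain categories have very few examples.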
According to the experiments, the synthetic data expanded the feature space, particularly for type classification, allowing the model to better discriminate between subtle visual differences. The 95:5 ratio also provided some improvement, though less than the 90:10 ratio, suggesting that a balance must be struck between real and synthetic data to avoid introducing noise or bias. The generated images, while not perfect replicas of real porcelain, provided enough visual diversity to improve model generalisation.
👉 More information
🗞 Synthetic Data Augmentation for Multi-Task Chinese Porcelain Classification: A Stable Diffusion Approach
🧠 ArXiv: https://arxiv.org/abs/2601.14791
