The increasing realism of AI-generated videos raises a crucial question: can people reliably detect these fakes, and what specifically gives them away? Xingyu Fu, Siyi Liu, and Yinuo Xu, alongside Pan Lu, Guangqiuse Hu, and Tianbo Yang, address this challenge by introducing a new benchmark called DeeptraceReward, which meticulously maps the visual cues humans use to identify AI-generated videos. This research moves beyond simply classifying videos as ‘fake’ or ‘real’ and instead focuses on pinpointing the specific spatiotemporal artifacts that betray their artificial origins, offering a detailed understanding of how people perceive deepfakes. By consolidating over four thousand detailed annotations across thousands of videos, the team trained a multimodal language model that significantly outperforms existing systems in identifying, localising, and explaining these telltale signs, paving the way for more socially aware and trustworthy video generation technologies.
Existing datasets often lack the detailed information needed to train robust detectors, particularly regarding where, when, and how a video is manipulated. DeeptraceReward addresses this limitation with comprehensive annotations: bounding boxes highlighting manipulated regions, precise start times indicating when a manipulation begins, and clear explanations describing the type of manipulation. Experiments demonstrate that models trained on DeeptraceReward significantly outperform those trained on existing datasets at detecting deepfakes. This advances the field by providing a more nuanced understanding of deepfake detection and a pathway towards more accurate and interpretable systems.
Human Perception of Deepfake Video Flaws
Researchers pioneered a new benchmark, DeeptraceReward, to rigorously assess how humans perceive the authenticity of AI-generated videos. They meticulously gathered over four thousand detailed annotations across more than three thousand high-quality generated videos, pinpointing specific spatiotemporal traces that reveal a video’s artificial origin to human observers. This involved annotators identifying areas of perceived fakeness, providing natural language explanations, and precisely marking the onset and offset of these visual cues. The methodology centers on capturing human-perceived flaws, consolidating annotations into nine major categories of deepfake traces that commonly lead viewers to identify generated content.
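To make this annotation schema concrete, the sketch below shows one plausible way to represent a single trace record in code. The field names and the example category label are assumptions for illustration only; the benchmark's released format may differ.

```python
from dataclasses import dataclass

# Hypothetical record structure for one human-annotated deepfake trace.
# Field names and category labels are illustrative, not the benchmark's actual schema.
@dataclass
class DeepfakeTraceAnnotation:
    video_id: str                        # which generated video the trace was found in
    category: str                        # one of the nine consolidated trace categories
    bbox: tuple[int, int, int, int]      # (x, y, width, height) of the perceived flaw
    onset_s: float                       # time (seconds) when the visual cue first appears
    offset_s: float                      # time (seconds) when the visual cue disappears
    explanation: str                     # free-form natural-language description by the annotator

example = DeepfakeTraceAnnotation(
    video_id="gen_000123",
    category="hand/finger distortion",   # illustrative category name
    bbox=(220, 410, 96, 80),
    onset_s=2.4,
    offset_s=3.1,
    explanation="The left hand briefly grows a sixth finger while waving.",
)
```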
Researchers then trained multimodal language models as reward models, aiming to mimic human judgments in both identifying these traces and accurately localising them within the video frames. This approach differs significantly from existing benchmarks, which provide holistic scores lacking the granularity to pinpoint specific sources of inauthenticity. A dedicated reward model trained on DeeptraceReward demonstrated a substantial performance improvement, outperforming GPT-5 by over 34% on average across fake clue identification, spatial grounding, and temporal labelling. The study revealed a clear difficulty gradient: binary real-versus-fake classification proved easier than fine-grained detection of deepfake traces, and performance decreased as the task moved from natural-language explanation to spatial grounding and precise temporal labelling. This methodology provides a rigorous testbed and training signal for developing socially aware and trustworthy video generation models, with human perception as a crucial evaluation criterion.
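As a rough illustration of how predictions could be scored against such human annotations, the sketch below uses bounding-box intersection-over-union for spatial grounding and an onset-time tolerance for temporal labelling. These metrics and thresholds are assumptions chosen for the example; the paper's exact evaluation protocol is not reproduced here.

```python
def bbox_iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def score_prediction(pred, gold, iou_thresh=0.5, onset_tol_s=0.5):
    """Score one predicted trace against a human annotation on three illustrative sub-tasks."""
    identified = pred["category"] == gold["category"]               # fake clue identification
    grounded = bbox_iou(pred["bbox"], gold["bbox"]) >= iou_thresh   # spatial grounding
    timed = abs(pred["onset_s"] - gold["onset_s"]) <= onset_tol_s   # temporal labelling
    return {"identification": identified, "grounding": grounded, "temporal": timed}
```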
Deepfake Trace Detection and Granular Analysis
The research team introduced DeeptraceReward, a new benchmark designed to capture how humans identify artificially generated videos and, crucially, where and why those videos appear fake. This work addresses a gap in deepfake detection, moving beyond simply classifying a video as real or fake to understanding the specific visual cues that reveal its artificial origin. The dataset comprises detailed annotations of over four thousand deepfake traces across more than three thousand high-quality generated videos, providing a granular level of analysis previously unavailable. Each annotation includes a bounding box highlighting the region of the video containing the artifact, precise start and end timestamps, and a natural-language explanation of the perceived flaw.
These annotations were consolidated into nine major categories of deepfake traces that humans commonly use to identify AI-generated content. The team found a clear difficulty gradient in detection: judging whether a video is fake at all is significantly easier than pinpointing the specific deepfake traces within it, and within trace detection, giving a natural-language explanation of a flaw proved easier than localising it spatially or marking its precise timing. Analysis of the dataset also reveals significant variation in video resolution and length across the different AI generation models, with an average video length of roughly six seconds and an average resolution of 739 × 1313 pixels. The newly developed reward model, trained on DeeptraceReward, outperforms GPT-5 by over 34% on average across all tasks. This substantial improvement in identifying and localising deepfake traces promises to advance the development of more socially aware and trustworthy video generation technologies.
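For readers who want to compute comparable summary statistics (average length, average resolution, variation across generators) on their own video collections, here is a minimal sketch assuming simple per-video metadata tuples; the field layout and model names are placeholders, not the benchmark's actual metadata format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-video metadata: (generator_model, duration_s, width, height).
videos = [
    ("model_a", 5.8, 720, 1280),
    ("model_a", 6.4, 720, 1280),
    ("model_b", 6.1, 768, 1344),
    # ... one entry per video in the collection
]

by_model = defaultdict(list)
for model, dur, w, h in videos:
    by_model[model].append((dur, w, h))

for model, rows in by_model.items():
    durs, widths, heights = zip(*rows)
    print(f"{model}: {mean(durs):.1f}s avg length, "
          f"{mean(widths):.0f} x {mean(heights):.0f} avg resolution")
```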
DeeptraceReward Reveals Deepfake Detection Weaknesses
Recent advances in video generation have produced increasingly realistic content, yet evaluating these models requires considering how humans perceive authenticity. Researchers have introduced DeeptraceReward, a new benchmark that identifies and annotates specific visual clues, or traces, that reveal a video as machine-generated. This dataset comprises detailed analyses of over three thousand videos, pinpointing the location and timing of these traces, and categorising them into nine major types that contribute to human detection of deepfakes. The work demonstrates that current multimodal language models struggle to identify these subtle traces, highlighting a gap between automated evaluation metrics and human perception. By training a dedicated reward model using DeeptraceReward, the team achieved significant improvements in detecting and localising deepfake traces, surpassing the performance of existing models. Future work could expand the dataset to include a wider range of traces and explore how these traces evolve as video generation technology improves, ultimately driving the development of more human-aligned and trustworthy video generation systems.
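As a hedged sketch of how such fine-grained annotations could serve as supervision for a multimodal LLM reward model, the snippet below formats one hypothetical annotation into a text target for fine-tuning. The tag syntax, field names, and prompt format are illustrative assumptions, not the authors' actual training recipe.

```python
def annotation_to_target(ann):
    """Format one human annotation as a text target for supervised fine-tuning
    of a multimodal LLM. The output format here is illustrative only."""
    x, y, w, h = ann["bbox"]
    return (
        f"Fake clue: {ann['category']}. "
        f"Region: <box>{x},{y},{w},{h}</box>. "
        f"Time: {ann['onset_s']:.1f}s-{ann['offset_s']:.1f}s. "
        f"Explanation: {ann['explanation']}"
    )

# Example usage with a hypothetical annotation dictionary:
target = annotation_to_target({
    "category": "unnatural motion",
    "bbox": (220, 410, 96, 80),
    "onset_s": 2.4,
    "offset_s": 3.1,
    "explanation": "The subject's arm bends at an impossible angle mid-gesture.",
})
print(target)
```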
👉 More information
🗞 Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs
🧠 ArXiv: https://arxiv.org/abs/2509.22646
