Real-Time Deepfake Detection in the Real-World

The Hebrew University of Jerusalem

Locally Aware Deepfake Detection Algorithm (LaDeDa)

By limiting its receptive field to q×q pixels, LaDeDa yields a deepfake score for each q×q patch. The image-level deepfake score is obtained by globally pooling the patch scores. Training uses a binary cross-entropy loss between the image label and the image-level deepfake score.
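The pooling and loss described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes that "global pooling" means averaging, that pooling is applied to raw patch logits before the sigmoid, and that the label convention is 0 = real, 1 = fake. The function names are hypothetical.

```python
import math

def image_logit(patch_logits):
    """Global average pooling over a 2D grid of patch-level logits.

    Assumption: the source only says "global pooling"; averaging is one choice.
    """
    flat = [v for row in patch_logits for v in row]
    return sum(flat) / len(flat)

def bce_with_logit(logit, label):
    """Binary cross-entropy between an image label and its pooled logit.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)).
    Assumed label convention: 0 = real, 1 = fake.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# Toy 2x2 grid of patch logits (one score per q x q patch).
patch_logits = [[2.0, 1.0],
                [0.0, 1.0]]

logit = image_logit(patch_logits)      # image-level score before the sigmoid
loss_if_fake = bce_with_logit(logit, 1)
loss_if_real = bce_with_logit(logit, 0)
```

With a positive pooled logit, the loss is lower when the image is labeled fake than when it is labeled real, which is the behavior the training objective rewards.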

Abstract

Recent improvements in generative AI have made synthesizing fake images easy; because such images can be used to cause harm, it is crucial to develop accurate techniques to identify them. This paper introduces the "Locally Aware Deepfake Detection Algorithm" (LaDeDa), which accepts a single 9x9 image patch and outputs its deepfake score. The image deepfake score is the pooled score of its patches. With merely patch-level information, LaDeDa significantly improves over the state-of-the-art, achieving around 99% mAP on current benchmarks. Owing to the patch-level structure of LaDeDa, we hypothesize that the generation artifacts can be detected by a simple model. We therefore distill LaDeDa into Tiny-LaDeDa, a highly efficient model consisting of only 4 convolutional layers. Remarkably, Tiny-LaDeDa has 375x fewer FLOPs and is 10,000x more parameter-efficient than LaDeDa, allowing it to run efficiently on edge devices with a minor decrease in accuracy. These almost-perfect scores raise the question: is the task of deepfake detection close to being solved? Perhaps surprisingly, our investigation reveals that current training protocols prevent methods from generalizing to real-world deepfakes extracted from social media. To address this issue, we introduce WildRF, a new deepfake detection dataset curated from several popular social networks. Our method achieves the top performance of 93.7% mAP on WildRF; however, the large gap from perfect accuracy shows that reliable real-world deepfake detection is still unsolved.

Tiny-LaDeDa

Since LaDeDa focuses on small patches, we hypothesized that a very simple model may be sufficient for detecting deepfake artifacts. We therefore designed Tiny-LaDeDa, a highly efficient model consisting of only 4 convolutional layers. Remarkably, Tiny-LaDeDa achieves superior compute efficiency compared to other SoTA methods, with only a minor accuracy trade-off.

Tiny-LaDeDa Distillation. To train Tiny-LaDeDa, we perform logit-based distillation using the patch-level deepfake scores predicted by LaDeDa (the teacher).
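As a rough illustration of logit-based distillation at the patch level, the sketch below compares a student's patch logits with a teacher's patch logits. The choice of mean squared error is an assumption made for this example; the text above does not specify the distance used, and the function name and toy values are hypothetical.

```python
def distillation_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher patch-level logits.

    Assumption: MSE is used here only as a simple example of a
    logit-matching objective; the original loss may differ.
    """
    if len(student_logits) != len(teacher_logits):
        raise ValueError("student and teacher must score the same patches")
    n = len(student_logits)
    return sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n

# Toy example: four patches scored by the teacher (LaDeDa) and the student.
teacher = [2.0, -1.0, 0.5, 3.0]
student = [1.0, -1.0, 0.5, 1.0]

loss = distillation_loss(student, teacher)
```

Because the targets are the teacher's per-patch scores rather than a single image label, each patch contributes its own training signal to the student.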

Performance vs. Efficiency trade-off. Comparison of SoTA methods: average precision (AP) on real-world data as a function of the number of floating-point operations (FLOPs) at inference time.
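To make the FLOPs axis concrete, the following sketch counts floating-point operations and parameters for a small stack of convolutional layers. The layer widths, the 32x32 input size, and the convention of counting one multiply-add as 2 FLOPs are all assumptions chosen for illustration; they are not the published Tiny-LaDeDa configuration.

```python
def conv2d_flops(c_in, c_out, kernel, h_out, w_out):
    """FLOPs of one conv layer, counting a multiply-add as 2 operations.

    Bias additions are ignored (a common simplification).
    """
    return 2 * c_in * c_out * kernel * kernel * h_out * w_out

def conv2d_params(c_in, c_out, kernel):
    """Weights plus one bias per output channel."""
    return c_in * c_out * kernel * kernel + c_out

# Hypothetical 4-layer stack of unpadded 3x3 convolutions on a 32x32 RGB input.
# Each tuple is (c_in, c_out, kernel, h_out, w_out).
layers = [
    (3, 8, 3, 30, 30),
    (8, 8, 3, 28, 28),
    (8, 8, 3, 26, 26),
    (8, 1, 3, 24, 24),
]

total_flops = sum(conv2d_flops(*layer) for layer in layers)
total_params = sum(conv2d_params(c_in, c_out, k) for c_in, c_out, k, _, _ in layers)
```

Note that FLOPs grow with the spatial size of the feature maps while the parameter count does not, which is why the two efficiency measures can differ by very different factors between models.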

Aligning Deepfake Evaluation with the Real-World

With LaDeDa and Tiny-LaDeDa achieving near-perfect scores on current deepfake detection benchmarks, one could ask whether the task is close to being solved. However, we found that standard training protocols prevent SoTA methods (including ours) from detecting real-world deepfakes taken from popular social platforms. To tackle this, we introduce WildRF, a new deepfake detection dataset curated from popular social networks: Reddit, X (Twitter) and Facebook. We validated WildRF's effectiveness by retraining SoTA methods on this real-world data, which significantly improved their performance compared to training with standard protocols.

WildRF

WildRF Overview. A realistic benchmark consisting of images sourced from popular social platforms: Reddit, X (Twitter) and Facebook. WildRF captures high variability in a range of attributes, including the image resolutions, formats, semantic content, generation techniques and edits encountered in the wild.

BibTeX

Coming soon