Multi-Frequency Fusion for Robust Video Face Forgery Detection
Authors: Meng Cao, Yang Lu, Simon Wang, Jiulong Shan, Jun Qin†**, Steve Baek‡, Bhiksha Raj‡
Current face video forgery detectors use wide or dual-stream backbones. We show that a single, lightweight fusion of two handcrafted cues can achieve higher accuracy with a much smaller model. Building on the Xception baseline model (21.9 million parameters), we construct two detectors: LFWS, which uses a 1x1 convolution to combine a low-frequency Wavelet-Denoised Feature (WDF) with the phase-only Spatial-Phase Shallow Learning (SPSL) map, and LFWL, which merges WDF with Local Binary Patterns (LBP) in the same way. The extra module adds only 292 parameters, so the total stays at 21.9 million, which is smaller than F3Net (22.5 million) and less than half the size of SRM (55.3 million). Even with this minimal overhead, the fused models increase the average area under the curve (AUC) from 74.8% to 78.6% on FaceForensics++ and from 70.5% to 74.9% on DFDC-Preview, gains of 3.8 and 4.4 percentage points over the Xception baseline. They also consistently outperform F3Net, SRM, and SPSL on eight public benchmarks, without extra data or test-time augmentation. These results show that carefully paired handcrafted features, combined through a lightweight fusion block, can provide state-of-the-art robustness at a significantly lower cost. Our findings suggest a need to reevaluate scale-driven design choices in face video forgery detection.
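To make the fusion step concrete, the following is a minimal sketch of what a 1x1 convolution does when it combines two handcrafted feature maps: at every pixel it takes a learned weighted sum across channels. The abstract does not give the channel counts, the weights, or how the maps are stacked, so the function names, the toy 2x2 maps, and the averaging weights below are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List

Map2D = List[List[float]]


def conv1x1_param_count(in_channels: int, out_channels: int, bias: bool = True) -> int:
    """Learnable parameters in a 1x1 convolution: one weight per
    (input channel, output channel) pair, plus an optional bias per output."""
    return in_channels * out_channels + (out_channels if bias else 0)


def fuse_1x1(maps: List[Map2D], weights: List[List[float]], bias: List[float]) -> List[Map2D]:
    """Apply a 1x1 convolution across channels.

    maps    : C_in feature maps, all with the same height and width
    weights : C_out rows, each holding C_in weights
    bias    : C_out bias terms
    """
    height, width = len(maps[0]), len(maps[0][0])
    fused = []
    for w_row, b in zip(weights, bias):
        # Each output pixel is a weighted sum of the same pixel in every input map.
        out = [
            [b + sum(w * m[y][x] for w, m in zip(w_row, maps)) for x in range(width)]
            for y in range(height)
        ]
        fused.append(out)
    return fused


# Hypothetical stand-ins for a WDF map and an SPSL map (toy values).
wdf = [[0.0, 1.0], [2.0, 3.0]]
spsl = [[1.0, 1.0], [0.0, 2.0]]

# Assumed weights: a plain average of the two cues, no bias.
fused = fuse_1x1([wdf, spsl], weights=[[0.5, 0.5]], bias=[0.0])

# Relative overhead of the 292 added parameters on a 21.9M-parameter backbone.
overhead_percent = 292 / 21_900_000 * 100
```

In a trained detector the weights would be learned rather than fixed. The point of the sketch is that the operation mixes channels without looking at neighboring pixels, which is why its parameter count stays tiny: on the figures quoted in the abstract, 292 added parameters amount to roughly 0.0013% of the 21.9 million parameter backbone.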
Neural Face Video Compression using Multiple Views
June 6, 2022 | Research area: Computer Vision | Workshop at CVPR
Recent advances in deep generative models have led to the development of neural face video compression codecs that use an order of magnitude less bandwidth than engineered codecs. These neural codecs reconstruct the current frame by warping a source frame and using a generative model to compensate for imperfections in the warped source frame. The warp is encoded and transmitted using a small number of keypoints rather than a dense flow field,…
An On-device Deep Neural Network for Face Detection
November 16, 2017 | Research areas: Computer Vision, Methods and Algorithms
Apple started using deep learning for face detection in iOS 10. With the release of the Vision framework, developers can now use this technology and many other computer vision algorithms in their apps. We faced significant challenges in developing the framework so that we could preserve user privacy and run efficiently on-device. This article discusses these challenges and describes the face detection algorithm.