Deepfake Detection : A Systematic Literature Review

daniayo · March 23, 2025

이캅스



I. INTRODUCTION

The notable advances in artificial neural network (ANN) based technologies play an essential role in tampering with multimedia content.

The term "Deepfake" is derived from "Deep Learning (DL)" and "Fake," and it describes specific photo-realistic video or image content created with DL's support.

Two neural networks are used to generate such counterfeit videos:

  • Generative Network
    • Creates fake images using an encoder and a decoder
  • Discriminative Network
    • Defines the authenticity of the newly generated images
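
|          | Generative Network         | Discriminative Network                      |
|----------|----------------------------|---------------------------------------------|
| Role     | Generates new data         | Classifies or discriminates data            |
| Input    | Noise or conditioning data | Real data                                   |
| Output   | Generated samples          | Class probabilities or a real/fake decision |
| Examples | GAN's Generator, VAE       | GAN's Discriminator, CNN                    |

  • In a GAN, the Generator and the Discriminator compete with each other and improve together → increasingly realistic data can be generated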
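
To make this competition concrete, here is a minimal sketch of the two loss terms in a standard GAN setup (non-saturating generator loss), written in plain Python. The function names and the example probabilities are assumptions made for illustration; they are not taken from the reviewed paper.

```python
import math

def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def discriminator_loss(d_real, d_fake):
    """D wants real samples scored as 1 and generated samples scored as 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """G wants D to score its generated samples as 1 (non-saturating form)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs: probability that a sample is real.
d_real = [0.9, 0.8]
weak_fakes = [0.1, 0.2]    # D spots these fakes easily
strong_fakes = [0.6, 0.7]  # D is fooled more often

# Better fakes lower the generator's loss and raise the discriminator's loss.
print(generator_loss(weak_fakes) > generator_loss(strong_fakes))  # True
print(discriminator_loss(d_real, weak_fakes)
      < discriminator_loss(d_real, strong_fakes))                 # True
```

The two losses pull in opposite directions on the same fake samples, which is the competition described above.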

Given the threats and the potential privacy risks, research on Deepfake has emerged very quickly 😭
Apart from Deepfake pornography, there are many other malicious or illegal uses of Deepfake, such as spreading misinformation, creating political instability, or various cybercrimes.

🧑‍💻 : We present a systematic literature review (SLR) on Deepfake detection

Contribution Summary

  • Report current tools, techniques, and datasets for Deepfake detection-related research
  • Introduce a taxonomy that classifies Deepfake detection techniques in four categories
  • Conduct an in-depth analysis of the primary studies' experimental evidence & Evaluate the performance of various Deepfake detection methods
  • Highlight a few observations and deliver some guidelines on Deepfake detection

II. PROCESS OF SLR

The review process is divided into 3 main stages:
identify → evaluate → understand

Planning the Review

  • Identify the need
  • Develop criteria and procedure
  • Evaluate the criteria and procedure

Conducting the Review

Includes six obligatory phases

  • A. Research Questions (RQs)
    • Determined the set of RQs in the context of the Deepfake domain
  • B. Search strategy (SS)
    • Tried to establish an unbiased search strategy to find as much of the relevant literature as possible
  • C. Study Selection Criteria (SSC)
    • Followed careful consideration to ensure fairness in selecting primary studies that provide significant evidence about research questions
  • D. Quality Assessment Criteria (QAC)
    • Developed a set of quality criteria for evaluating individual studies
  • E. Data extraction and monitoring (DEM)
    • Determined how the information required from the selected studies would be obtained and how their evidence would be recorded
  • F. Data Synthesis (DS)
    • Followed a set of procedures to synthesize information better

Reporting the Review

Report the outcomes in a suitable form to the distribution channel and target audience

  • A. Research Questions (RQs)
    • The right question leads to raising confidence in a domain
    • Defined 4 crucial questions (RQ 1-4) along with some supplementary questions (SRQs)
    • Identify → Investigate → Evaluate → Compare
  • B. Search Strategy (SS)
    • Tried to include all the combinations of related search phrases or keywords to avoid any bias
    • Used Boolean terminology → 'AND' or 'OR'
      • (Deepfake OR FaceSwap OR Video Manipulation OR Fake face/image/video) AND (detection OR detect) OR (Facial Manipulation OR Digital Media Forensics)
    • Selected 10 popular repositories
      • Include journals, conferences, and archives
      • January 2018 - December 2020
  • C. Study Selection Criteria (SSC)
    • Establish 3 inclusion criteria
      • Explicitly stated in the title, abstract, or keywords
      • Not explicitly stated, but clear from the context
      • Presented in written form
    • Establish 4 exclusion criteria
      • Not written in English
      • Duplicates
      • In audio or text form
      • Specific alteration techniques
  • D. Quality Assessment Criteria (QAC)
    • Assessing the quality of evidence contained within an SLR
      • Selecting appropriate criteria to help analyze the strength of evidence and the embedded biases within each paper is also essential
    • Validate, Review, Cross-Checking approach
    • Finalized 91 research articles and 21 additional reviews representing Deepfake detection
  • E. Data Extraction And Monitoring (DEM)
    • Designing systems for the actual extraction of data from the studies
    • Searched 9 popular libraries
      • What entities were or needed to be extracted
      • At least one entity was automatically extracted
        • Authors, publication sources and publication times
        • Analysis techniques
        • Empirical evidence
          • datasets
          • features
          • models or methodologies
          • measurement metrics
  • F. Data Synthesis (DS)
    • Specifically reviews the associated and comparative findings from the data extraction process
    • Accumulate → Analyze → Visualize
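
The Boolean search string from the Search Strategy step can be sketched as a small title filter. This is only an illustration: the notes do not say how each repository parses the string, so the sketch assumes AND binds tighter than OR, expands "Fake face/image/video" into three phrases, and uses simple case-insensitive substring matching. The helper names and sample titles are hypothetical.

```python
# Term groups taken from the search string in the notes.
SUBJECT_TERMS = ["deepfake", "faceswap", "video manipulation",
                 "fake face", "fake image", "fake video"]
TASK_TERMS = ["detection", "detect"]
ALTERNATIVE_TERMS = ["facial manipulation", "digital media forensics"]

def contains_any(text, terms):
    """True if any term occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(term in lowered for term in terms)

def matches_search_string(text):
    """(subject AND task) OR alternative, assuming AND binds tighter than OR."""
    subject_and_task = (contains_any(text, SUBJECT_TERMS)
                        and contains_any(text, TASK_TERMS))
    return subject_and_task or contains_any(text, ALTERNATIVE_TERMS)

# Hypothetical titles, used only to exercise the filter.
titles = [
    "Deepfake detection with convolutional networks",
    "A survey of digital media forensics",
    "Deepfake generation with autoencoders",
    "Image segmentation benchmarks",
]
for title in titles:
    print(matches_search_string(title), title)
```

A filter like this only covers the keyword step; the inclusion and exclusion criteria still have to be applied by reading each candidate study.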

III. OUTCOMES

A. Description of Studies

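We collected 112 studies published within the last three years.

1) Publication Period
Deepfake appeared in 2018, so publications from 2018 to 2020 were considered.
During this period, the number of Deepfake-related publications increased rapidly.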

2) Source of Publications
Eight different publication sources

The studies are categorized according to the techniques they apply and are described in the following sections, which explain which approach each paper uses for Deepfake detection.

1) Machine Learning Based Methods

Traditional machine learning (ML) algorithms are instrumental in comprehending the logic for any decision that could be expressed in human terms.

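Machine learning algorithms are well suited to the Deepfake domain for understanding the data and how a model behaves. Tuning hyper-parameters and changing the model design are also easier to handle.
Tree-based machine learning approaches present the decision process in the form of a tree, so they do not suffer from explainability problems.

GANs treat an unsupervised learning problem as a supervised one and are used to automatically train generative models that produce photo-realistic fake face images and videos. Some ML-based methods reveal specific inconsistencies.

There are several ways to synthesize human faces with Deepfake, and most techniques manipulate a specific part of the face, such as eye shadow or earrings.
With such single-feature manipulation, it is difficult to distinguish and locate the manipulated region.
To overcome this, a technique that bundles such features together at once was proposed.

There are distinctive features that can be used to verify the authenticity of images and videos generated by GANs.
Head movement is usually correlated with facial expression. An MLP-based technique uses visual cues with little computation to decide whether a video is a Deepfake.

This method reaches 98% Deepfake detection accuracy, but the result depends heavily on the dataset. The data were split into an 80% training set and a 20% test set.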
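
As a rough illustration of the 80/20 split mentioned above, here is a minimal sketch in plain Python. The helper `split_train_test`, the fixed seed, and the placeholder video IDs are assumptions made for illustration; the notes do not say how the original study performed its split.

```python
import random

def split_train_test(samples, test_ratio=0.2, seed=0):
    """Shuffle a copy of the samples and hold out the last test_ratio share."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_ratio)
    split = len(shuffled) - n_test
    return shuffled[:split], shuffled[split:]

# Hypothetical dataset: (video_id, label) pairs, label 1 = Deepfake, 0 = real.
videos = [(f"video_{i:03d}", i % 2) for i in range(100)]
train, test = split_train_test(videos)
print(len(train), len(test))  # 80 20
```

Because both parts come from the same dataset, a high test accuracy says little about other datasets, which is one way to read the dataset dependence noted above.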

2) Deep Learning Based Methods

In the case of Deepfake detection in images, there are plenty of works where deep learning-based methods are applied to detect specific artifacts generated by their generation pipeline.

The paper goes on to describe the individual deep learning papers in detail, but that is not my purpose here, so I will not summarize them.

We observed that many of the proposed approaches apply frame-by-frame analysis to videos or images, examining manipulated faces and tracking facial movement to obtain better performance.

However, most of these methods suffered from overfitting.

3) Statistical Measurements Based Methods

Determining different statistical measures such as average normalized cross-correlation scores between original and suspected data helps to understand the originality of the data.
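
A minimal sketch of an average normalized cross-correlation score, assuming the zero-mean definition and treating each patch as a flat list of pixel values. The function names and the toy patches are hypothetical; the reviewed studies may compute the score differently.

```python
import math

def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-length signals, in [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("signals must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant signal carries no correlation information
    return sum(x * y for x, y in zip(da, db)) / denom

def average_ncc(original_patches, suspected_patches):
    """Mean NCC over corresponding patches of the original and suspected data."""
    scores = [normalized_cross_correlation(o, s)
              for o, s in zip(original_patches, suspected_patches)]
    return sum(scores) / len(scores)

# Toy patches (flattened pixel values), for illustration only.
original = [[1, 2, 3, 4], [4, 3, 2, 1]]
untouched = [[1, 2, 3, 4], [4, 3, 2, 1]]
altered = [[1, 2, 3, 4], [1, 2, 3, 4]]  # second patch no longer matches

print(average_ncc(original, untouched))
print(average_ncc(original, altered))
```

A score near 1 means the suspected data closely follows the original; lower scores point to regions that were changed.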

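One of the methods used to model the most basic generative convolutional structure extracts local features with the Expectation-Maximization (EM) algorithm. After extraction, ad-hoc validation is applied to the architecture.

In general, the distance increased when the accuracy of the GAN was low. A highly accurate GAN necessarily produces high-quality images that are hard to detect.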

4) BlockChain Based Methods

Blockchain technology provides various features that can verify the legitimacy and provenance of digital content in a highly trusted, secured, and decentralized manner.

For Deepfake detection, a public blockchain is regarded as the most suitable technical solution for verifying, in a decentralized way, whether a video or image is authentic.

Multiple LSTM networks are being used as a deep encoder for creating discriminating features, which are then compressed and used to hash the transaction.
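
The provenance idea can be sketched with a tiny hash chain. This is a simplified, single-process stand-in, not a real blockchain: there is no network or consensus, and it hashes raw content bytes with SHA-256 instead of the compressed LSTM features mentioned above. All function names and sample byte strings are hypothetical.

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def block_digest(content_hash: str, prev_hash: str) -> str:
    """Hash of a block's body, serialized deterministically."""
    body = {"content_hash": content_hash, "prev_hash": prev_hash}
    return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))

def make_block(content: bytes, prev_hash: str) -> dict:
    """Record the content's hash and link the block to its predecessor."""
    content_hash = sha256_hex(content)
    return {"content_hash": content_hash, "prev_hash": prev_hash,
            "block_hash": block_digest(content_hash, prev_hash)}

def chain_is_valid(chain: list) -> bool:
    """Recompute every block hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        if block_digest(block["content_hash"], block["prev_hash"]) != block["block_hash"]:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["block_hash"]:
            return False
    return True

def is_registered(content: bytes, chain: list) -> bool:
    """True if this exact content was recorded somewhere in the chain."""
    digest = sha256_hex(content)
    return any(block["content_hash"] == digest for block in chain)

genesis = make_block(b"original video bytes", "0" * 64)
chain = [genesis, make_block(b"second video bytes", genesis["block_hash"])]

print(chain_is_valid(chain))                          # True
print(is_registered(b"original video bytes", chain))  # True
print(is_registered(b"edited video bytes", chain))    # False
```

Any change to a registered file changes its hash, so an edited copy no longer matches the recorded entry, and rewriting an old entry breaks the links that follow it.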


^ Categories of Deepfake detection strategies (quantity and percentage of related categories of studies)

What this shows is that the deep learning-based approach is the most widely used technique for detecting Deepfake.

C. RQ-2 : What Is The Way To Perform Empirical Tests To Detect Deepfake Using These Studies?

To provide an answer to RQ-2, we review the different experimental methods in depth and categorize the overall Deepfake detection process into six distinct stages, summarized below:

  • Data Collection, Face Detection, Feature Extraction, Feature Selection, Model Selection, Model Validation

1) SRQ-2.1 : What Datasets Are Typically Used In Deepfake Detection Experiments?

Deepfake datasets are used in numerous studies for training and testing purposes.

2) SRQ-2.2 : What Features Are Typically Utilized In Detecting Deepfake?

  • 21 : Special artifacts-based features generated by various editing processes
  • 20 : Texture and Spatio-temporal consistent features
  • 14 : Facial landmarks-based features
  • 13 : Artifacts-based elements

The study shows that special artifacts-based features, face landmarks, and Spatio-temporal features are used widely to detect Deepfakes

3) SRQ-2.3 : What Models Are Used To Detect Deepfake Manipulation?

This segment describes various models that are used for detecting Deepfake.

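The models are classified into three groups.

  • (i) Deep Learning Model
    • Widely used as feature extraction and selection mechanisms.
    • They can extract or learn features directly from the data.
  • (ii) Machine Learning Model
    • Builds feature vectors by defining accurate features with high-performing feature selection algorithms.
    • These vectors are then used as input to train a classifier that decides whether a video or image has been manipulated by Deepfake.
  • (iii) Statistical Model
    • Based on the use of information-theoretic studies for validation.
    • The shortest path is computed to distinguish the original from the Deepfake.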

DL-based studies hold the highest proportion in the SLR.

The deep neural network (DNN) models are successful in Deepfake detection, and CNN-based models demonstrate more efficiency than the other DNN models.

4) SRQ-2.4 : What Measurement Metrics Are Used For Computing The Performance Of Deepfake Detection Methods?

This section describes various measurement metrics applied for assessing the models' performance in detecting such Deepfakes.

  • TP : the number of Deepfakes that are correctly predicted as Deepfake
  • TN : the number of real images/videos correctly predicted as real
  • FP : the number of real images/videos incorrectly predicted as Deepfake
  • FN : the number of Deepfakes incorrectly predicted as real
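
These four counts are enough to compute the usual classification metrics. The sketch below uses the standard definitions of accuracy, precision, recall, and F1; the function name and the example counts are hypothetical and are not results from the review.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts (Deepfake = positive class)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted Deepfakes, how many were Deepfakes
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of actual Deepfakes, how many were caught
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for a detector evaluated on 200 videos.
metrics = detection_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can look good on an imbalanced test set, which is why precision and recall are worth reporting alongside it.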

D. RQ-3 : What Is The Classification Framework For Deepfake Detection Approaches?

This section classified overall approaches concerning different elements such as input data, features, method categories, and type of techniques.

E. RQ-4 : What Is The General Efficiency Of A Variety Of Deepfake Detection Strategies Based On Experimental Proof?

This segment attempts to decide the efficacy of Deepfake detection models.

→ Based on the overall results, we found deep learning-based techniques are efficient for detecting Deepfake.

F. RQ-5 : Is The Efficiency Of Deep Learning Models Better Than Non-Deep Learning Models In Deepfake Detection Based On Experimental Results?

We determined the mean accuracy, AUC, recall, and precision.
Next, we apply a comparative analysis of these two models' performance and obtain an average result.
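
The comparison described here amounts to averaging each metric per model group. A minimal sketch follows, assuming each study reports only some of the metrics. Every number in it is a placeholder invented for illustration and is NOT a result from the review; the field names are hypothetical as well.

```python
from statistics import mean

# Placeholder rows for illustration only; these are NOT values from the review.
studies = [
    {"group": "deep_learning", "accuracy": 0.90, "auc": 0.95},
    {"group": "deep_learning", "accuracy": 0.94},
    {"group": "non_deep_learning", "accuracy": 0.80, "auc": 0.85},
    {"group": "non_deep_learning", "accuracy": 0.86},
]

def mean_by_group(rows, metric):
    """Average one metric per group, skipping studies that do not report it."""
    grouped = {}
    for row in rows:
        if metric in row:
            grouped.setdefault(row["group"], []).append(row[metric])
    return {group: mean(values) for group, values in grouped.items()}

for metric in ("accuracy", "auc"):
    print(metric, mean_by_group(studies, metric))
```

Skipping missing values keeps the averages honest, but it also means each mean may rest on a different subset of studies.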

→ The overall results demonstrate the superiority of deep learning-based models over non-deep learning-based models.

IV. OBSERVATIONS

A. Combining Different Deep Learning Methods Is Critical For The Accurate Deepfake Detection

Based on the review, we see that multiple strategies are applied using numerous features.

Recent studies have applied deep learning-based approaches (CNN models in particular) to learn, automatically or directly, features that can be selected and used to detect Deepfake.

Examples

  • two-phase CNN method
    • First, extracts particular features from counterfeit and actual images by incorporating various dense units.
    • Second, uses these features to train the proposed CNN to classify the input images.

It is hard to apply the same technique to both images and videos for Deepfake detection,
because videos also carry temporal information!

  • recurrent convolutional model (RCN)
    • Proposed to use spatiotemporal features of videos for detecting Deepfakes
  • CNN + LSTM
    • CNN : handles extracting the frame-level features
    • LSTM : uses these features as input to generate a descriptor accountable for analyzing the temporal sequence
  • long-term recurrent convolutional network (LRCN)
    • Total eye blinking of an individual in Deepfake videos is always lower than in real videos
  • deep ensemble learning strategy (DeepfakeStack)
    • Trains a meta-learner on top of base-learners with pre-trained experience.
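
The eye-blinking cue from the LRCN item can be sketched as a simple post-processing step. The sketch assumes some model (not implemented here) already outputs a per-frame probability that the eyes are open; the function names, the 0.5 closed-eye threshold, and the 5 blinks-per-minute cutoff are all hypothetical values chosen for illustration.

```python
def count_blinks(eye_open_probs, closed_threshold=0.5):
    """Count open-to-closed transitions in a per-frame eye-open probability sequence."""
    blinks = 0
    was_closed = False
    for prob in eye_open_probs:
        is_closed = prob < closed_threshold
        if is_closed and not was_closed:
            blinks += 1  # a new blink starts when the eye first closes
        was_closed = is_closed
    return blinks

def blinks_per_minute(eye_open_probs, fps):
    """Blink rate for a clip sampled at fps frames per second."""
    duration_min = len(eye_open_probs) / fps / 60.0
    return count_blinks(eye_open_probs) / duration_min if duration_min else 0.0

def looks_suspicious(eye_open_probs, fps, min_blinks_per_minute=5.0):
    """Flag clips whose blink rate falls below a (hypothetical) cutoff."""
    return blinks_per_minute(eye_open_probs, fps) < min_blinks_per_minute

# Synthetic one-minute clips at 10 fps, for illustration only.
blinking_clip = [0.1 if i % 50 in (10, 11) else 0.9 for i in range(600)]
staring_clip = [0.9] * 600

print(count_blinks(blinking_clip))               # 12
print(looks_suspicious(blinking_clip, fps=10))   # False
print(looks_suspicious(staring_clip, fps=10))    # True
```

A fixed cutoff like this is fragile on its own, so it is better treated as one feature among several than as a complete detector.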

Combining several deep learning methods yields better results than using a single method.

Therefore, it may be appropriate to explore the compatibility of deep learning methods and integrate some of them for further progress in Deepfake detection.

As seen in SRQ-2.3, applying deep learning algorithms to detect Deepfake has become a hot topic.
Also, most studies follow a traditional CNN approach to classify Deepfake in the deep-learning environment.

Conclusion!! Deep learning-based methods are recommended for detecting Deepfake!

C. A Unique Framework Is Required For The Fair Evaluation Of Different Heterogeneous Deepfake Detection Methods

(1) The measurement metrics used in the studies in question are not standard

(2) It is also seen that the dataset's size is not consistent

(3) The initial videos in those experiments are rarely publicly available

Conclusion!! Creating a unique framework for the fair assessment of performance is essential

V. LIMITATIONS AND CHALLENGES

A. Construct Validity

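Although the authors searched thoroughly while writing the paper, some studies may not have been included. In addition, there may be some mistakes in how the studies were classified.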

B. Internal Validity

This concerns data extraction and analysis.
Errors may still be present in how we collected and processed the data.

C. External Validity

This concerns the summary of the results.
More Deepfake detection experiments may need to be collected to produce definitive and systematic outcomes.

VI. CONCLUSION

  • The deep learning-based methods are widely used in detecting Deepfake.
  • In the experiments, the FF++ dataset occupies the largest proportion.
  • The deep learning (mainly CNN) models hold a significant percentage of all the models.
  • The most widely used performance metric is detection accuracy.
  • The experimental results demonstrate that deep learning techniques are effective in detecting Deepfake.
