Paper List Summary

김민상 · September 8, 2025

Is It Overkill? Analyzing Feature-Space Concept Drift in Malware Detectors, 2023 IEEE Security and Privacy Workshops (SPW)

Concept drift is a major challenge faced by machine learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect concept drift, it is not yet well understood regarding the main causes behind the drift. In this paper, we design experiments to empirically analyze the impact of feature-space drift (new features introduced by new samples) and compare it with data-space drift (data distribution shift over existing features). Surprisingly, we find that data-space drift is the dominating contributor to the model degradation over time while feature-space drift has little to no impact. This is consistently observed over both Android and PE malware detectors, with different feature types and feature engineering methods, across different settings. We further validate this observation with recent online learning based malware detectors that incrementally update the feature space. Our result indicates the possibility of handling concept drift without frequent feature updating, and we further discuss the open questions for future research.
=>
The impact of concept drift comes mainly from data-space drift, not from feature-space drift.

MalCL: Leveraging GAN-Based Generative Replay to Combat Catastrophic Forgetting in Malware Classification, arXiv, AAAI

Continual Learning (CL) for malware classification tackles the rapidly evolving nature of malware threats and the frequent emergence of new types. Generative Replay (GR)-based CL systems utilize a generative model to produce synthetic versions of past data, which are then combined with new data to retrain the primary model. Traditional machine learning techniques in this domain often struggle with catastrophic forgetting, where a model's performance on old data degrades over time.
In this paper, we introduce a GR-based CL system that employs Generative Adversarial Networks (GANs) with feature matching loss to generate high-quality malware samples. Additionally, we implement innovative selection schemes for replay samples based on the model's hidden representations.
Our comprehensive evaluation across Windows and Android malware datasets in a class-incremental learning scenario -- where new classes are introduced continuously over multiple tasks -- demonstrates substantial performance improvements over previous methods. For example, our system achieves an average accuracy of 55% on Windows malware samples, significantly outperforming other GR-based models by 28%. This study provides practical insights for advancing GR-based malware classification systems. The implementation is available at this https URL (the code will be made public upon the presentation of the paper).
=>
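In CL, typically,
as new data is added, catastrophic forgetting degrades the model's performance on older data.
This is hard to avoid: resources are finite, and in an incremental setting data keeps accumulating but cannot be stored indefinitely.
So the usual practice is to delete data starting with the oldest.
GR-based CL is used to address this.
Here, GR-based CL uses a GAN to perform generative replay.
It regenerates a fixed number of samples that represent the earlier data and trains on them together with the new data.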
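The replay loop described above can be sketched as follows. This is only an illustration of the data flow, not MalCL's implementation: the GAN is swapped for a simple per-class Gaussian stand-in, and `GaussianGenerator`, `build_training_set`, the family names, and `replay_per_class` are all hypothetical.

```python
import random

class GaussianGenerator:
    """Stand-in for the GAN: remembers per-feature mean/std of one old class."""
    def __init__(self, samples):
        n, dim = len(samples), len(samples[0])
        self.mean = [sum(s[j] for s in samples) / n for j in range(dim)]
        self.std = [
            (sum((s[j] - self.mean[j]) ** 2 for s in samples) / n) ** 0.5
            for j in range(dim)
        ]

    def sample(self, rng, count):
        """Draw `count` synthetic feature vectors for this class."""
        return [[rng.gauss(m, s) for m, s in zip(self.mean, self.std)]
                for _ in range(count)]

def build_training_set(new_data, generators, rng, replay_per_class=50):
    """Mix real samples of the new task with synthetic replays of old classes."""
    mixed = list(new_data)
    for label, gen in generators.items():
        mixed += [(x, label) for x in gen.sample(rng, replay_per_class)]
    rng.shuffle(mixed)
    return mixed

rng = random.Random(0)
# Task 1: raw samples of an old family; only the generator is kept afterwards.
old = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
generators = {"family_A": GaussianGenerator(old)}
del old  # the raw old data is no longer stored

# Task 2: a new family arrives; train on new data plus replayed old data.
new = [([rng.gauss(5, 1), rng.gauss(5, 1)], "family_B") for _ in range(80)]
mixed = build_training_set(new, generators, rng)
```

The point of the design is that storage cost stays fixed per old class (the generator) instead of growing with the amount of old data.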

Fighting Fire with Fire: Continuous Attack for Adversarial Android Malware Detection, USENIX Security Symposium

The pervasive adoption of Android as the leading operating system, due to its open-source nature, has simultaneously rendered it a prime target for malicious software attacks. In response, various learning-based Android malware detectors (AMDs) have been developed, achieving notable success in malware identification. However, these detectors are increasingly compromised by adversarial examples (AEs), which are subtly modified inputs designed to evade detection while maintaining malicious functionality. Recently, advanced adversarial example generation tools have been introduced that can reduce the efficacy of popular detectors to 1%. In this background, to address the critical need for more resilient AMDs, we propose a novel defense mechanism, Harnessing Attack Generativity for Defense Enhancement, i.e., HagDe. HagDe involves applying iterative perturbations in the direction of gradient ascent to all samples, aiming to exploit the high sensitivity of AEs to perturbations. This method enables the detection of adversarial samples by observing the disproportionate increase in the loss function following minor perturbations, distinguishing them from regular samples. To evaluate HagDe, we conduct an extensive evaluation on 15,000 samples and 15 different attack combinations. The experimental results show that our tool can achieve a defense effectiveness of 88.5% on AdvDroidZero and 90.7% on BagAmmo, representing an increase of 32.45% and 11.28%, respectively, compared to the latest defense method KD_BU and LID.
=>
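AMDs are effective at detecting a wide range of malware,
but as AEs become more common, their performance can fall to as low as 1% in the worst case.
HagDe is proposed to address this.
HagDe uses a disproportionate increase in the loss function to decide whether a sample is an AE, and filters such samples out.
My reading of the disproportionate loss increase:
a regular sample sits safely away from the decision boundary, so a small perturbation barely changes its loss, whereas an AE sits right at the decision boundary, so even a small perturbation raises the loss considerably.
The method seems to make its decision based on this.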
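The intuition above can be checked on a toy linear detector. This is a minimal sketch under stated assumptions, not HagDe's actual procedure: the weights `W`, the step size, the number of steps, and the two example points are hypothetical, and the "label" used for the loss is simply the model's own initial prediction.

```python
import math

# Hypothetical toy detector: logistic regression with fixed weights.
W = [2.0, -1.0]
B = 0.0

def predict_proba(x):
    """Probability that x is malware under the toy linear model."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))

def loss(x, label):
    """Binary cross-entropy of the model output against `label`."""
    p = min(max(predict_proba(x), 1e-12), 1 - 1e-12)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def loss_increase(x, steps=5, step_size=0.1):
    """Apply gradient-ascent steps on the loss (w.r.t. the model's own
    initial prediction) and return how much the loss grew."""
    label = 1 if predict_proba(x) >= 0.5 else 0
    start = loss(x, label)
    x = list(x)
    for _ in range(steps):
        p = predict_proba(x)
        # For logistic regression, d(loss)/dx = (p - label) * W.
        grad = [(p - label) * w for w in W]
        x = [xi + step_size * g for xi, g in zip(x, grad)]
    return loss(x, label) - start

far_from_boundary = [-3.0, 2.0]   # confidently benign (logit = -8)
near_boundary = [-0.2, 0.1]       # barely benign (logit = -0.5)

print(loss_increase(far_from_boundary))  # stays close to zero
print(loss_increase(near_boundary))      # grows noticeably
```

A detector built on this idea would flag samples whose loss increase exceeds some threshold; how that threshold is chosen is not covered in these notes.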

Detecting and Mitigating Sampling Bias in Cybersecurity with Unlabeled Data, USENIX Security Symposium

Machine Learning (ML) based systems have demonstrated remarkable success in addressing various challenges within the ever-evolving cybersecurity landscape, particularly in the domain of malware detection/classification. However, a notable performance gap becomes evident when such classifiers are deployed in production. This discrepancy, often observed between accuracy scores reported in research papers and their real-world deployments, can be largely attributed to sampling bias. Intuitively, the data distribution in the production differs from that of training resulting in reduced performance of the classifier. How to deal with such sampling bias is an important problem in cybersecurity practice. In this paper, we propose principled approaches to detect and mitigate the adverse effects of sampling bias. First, we propose two simple and intuitive algorithms based on domain discrimination and distribution of k-th nearest neighbor distance to detect discrepancies between training and production data distributions. Second, we propose two algorithms based on the self-training paradigm to alleviate the impact of sampling bias. Our approaches are inspired by domain adaptation and judiciously harness the unlabeled data for enhancing the generalizability of ML classifiers. Critically, our approach does not require any modifications to the classifiers themselves, thus ensuring seamless integration into existing deployments. We conducted extensive experiments on four diverse datasets from malware, web domains, and intrusion detection. In an adversarial setting with large sampling bias, our proposed algorithms can improve the F-score by as much as 10-16 percentage points. Concretely, the F-score of a malware classifier on AndroZoo dataset increases from 0.83 to 0.937.
=>
Note that HC aims to reduce concept drift,
whereas this paper aims to reduce sampling bias.

Concept drift is a gradual change in the data distribution,
while sampling bias is a case where the distribution itself changes completely, e.g. the training data is AndroZoo but the test data is AndroZoo + EMBER.

The paper reduces sampling bias in two stages.
(1) Bias detection: domain discrimination, k-th nearest neighbor distance distribution

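Domain discrimination compares the embeddings of the training data with the embeddings of the real (production) data: if the two distributions differ a lot, there is bias, and if they do not differ, there is no bias.
A classifier judges this difference. Simply label the training data 0 and the test data 1; if the classifier's accuracy at telling 0 from 1 is close to 0.5, there is no bias, and if it is close to 1, there is bias. Being close to 1 means the two distributions are clearly distinguishable, i.e. the training and test distributions differ, which is the same as saying the data is biased.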
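A minimal sketch of this domain-discrimination check, assuming a plain logistic regression as the discriminator and held-out accuracy as the score (the paper's actual discriminator and statistic are not described in these notes). The helper names and the synthetic Gaussian data are hypothetical.

```python
import math
import random

def train_logreg(xs, ys, epochs=300, lr=0.1):
    """Fit a logistic regression by batch gradient descent."""
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(dim):
                gw[j] += (p - y) * x[j]
            gb += p - y
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def domain_accuracy(train_set, prod_set, seed=0):
    """Label training data 0 and production data 1, fit a discriminator on
    half of the pooled data, and return its accuracy on the other half."""
    rng = random.Random(seed)
    data = [(x, 0) for x in train_set] + [(x, 1) for x in prod_set]
    rng.shuffle(data)
    half = len(data) // 2
    fit, held_out = data[:half], data[half:]
    w, b = train_logreg([x for x, _ in fit], [y for _, y in fit])
    correct = 0
    for x, y in held_out:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += int((z >= 0) == (y == 1))
    return correct / len(held_out)

def gaussian_cloud(rng, center, n=200):
    """Synthetic embeddings: unit-variance Gaussian around `center`."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

rng = random.Random(42)
train = gaussian_cloud(rng, [0.0, 0.0])
same = gaussian_cloud(rng, [0.0, 0.0])      # same distribution as training
shifted = gaussian_cloud(rng, [3.0, 3.0])   # clearly different distribution

acc_same = domain_accuracy(train, same)        # near 0.5: no detectable bias
acc_shifted = domain_accuracy(train, shifted)  # near 1.0: strong bias
```

Scoring on a held-out half keeps the discriminator from looking good just by memorizing the pooled data.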
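For the k-th nearest neighbor distance distribution: if samples in the training data are about 1 apart while test samples are 2 or 3 away, there is a problem. If the test distribution matched the training distribution, the distances would be the same as those between training samples.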
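The distance check can be sketched like this. Comparing medians is a stand-in assumption (the paper's exact test on the distance distributions is not given here), and `k=5`, the helper names, and the synthetic data are hypothetical.

```python
import math
import random
import statistics

def kth_nn_distance(point, reference, k, skip_self=False):
    """Distance from `point` to its k-th nearest neighbour in `reference`."""
    dists = sorted(math.dist(point, r) for r in reference)
    # If the point is itself part of `reference`, dists[0] is its own 0.
    return dists[k] if skip_self else dists[k - 1]

def kth_nn_profile(queries, reference, k=5, skip_self=False):
    """k-th nearest neighbour distance of every query against `reference`."""
    return [kth_nn_distance(q, reference, k, skip_self) for q in queries]

rng = random.Random(0)

def cloud(center, n=150):
    """Synthetic embeddings: unit-variance Gaussian around `center`."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

train = cloud([0.0, 0.0])
prod_same = cloud([0.0, 0.0])
prod_shifted = cloud([4.0, 4.0])

# Baseline: how far apart training samples are from each other.
baseline = statistics.median(kth_nn_profile(train, train, skip_self=True))
# Production samples measured against the training set.
same = statistics.median(kth_nn_profile(prod_same, train))
shifted = statistics.median(kth_nn_profile(prod_shifted, train))
```

Here `same` stays close to `baseline`, while `shifted` is several times larger, which is the signal described above.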

(2) Bias mitigation: using self-training

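Sampling bias is mitigated with CONL-BM (Contrastive Bias Mitigation) and CYC-BM (Cycle-Consistency Bias Mitigation).
(1) CONL-BM
First, build a per-class prototype (centroid) from D_T. Compare each D_U sample against the prototypes and give it the pseudo-label of the nearest class. Then run contrastive learning on these pseudo-labels to adjust the embedding space.
(2) CYC-BM
The classifier C_T trained on D_T pseudo-labels the D_U samples. The classifier C_U trained on D_U then pseudo-labels the D_T samples in turn. The two classifiers' predictions are tied together by a cycle-consistency condition, which is used to update the encoder.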
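The prototype-based pseudo-labeling step of CONL-BM can be sketched as below. Only the centroid and nearest-prototype parts are shown; the contrastive update is omitted. Euclidean distance, the function names, and the toy embeddings are assumptions made for illustration.

```python
import math

def class_prototypes(samples, labels):
    """Mean embedding (centroid) per class in the labelled set D_T."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def pseudo_label(sample, prototypes):
    """Assign the label of the nearest prototype (Euclidean distance)."""
    return min(prototypes, key=lambda y: math.dist(sample, prototypes[y]))

# Toy labelled embeddings (D_T) and unlabelled embeddings (D_U).
d_t = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.8, 2.9]]
d_t_labels = ["benign", "benign", "malware", "malware"]
d_u = [[0.3, 0.2], [2.5, 3.0]]

protos = class_prototypes(d_t, d_t_labels)
pseudo = [pseudo_label(x, protos) for x in d_u]
print(pseudo)  # ['benign', 'malware']
```

In the full method these pseudo-labels would then define the positive and negative pairs for contrastive learning.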

Combating Concept Drift with Explanatory Detection and Adaptation for Android Malware Classification, arXiv, ACM CCS

Machine learning-based Android malware classifiers achieve high accuracy in stationary environments but struggle with concept drift. The rapid evolution of malware, especially with new families, can depress classification accuracy to near-random levels. Previous research has largely centered on detecting drift samples, with expert-led label revisions on these samples to guide model retraining. However, these methods often lack a comprehensive understanding of malware concepts and provide limited guidance for effective drift adaptation, leading to unstable detection performance and high human labeling costs.
=>
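Prior work on concept drift focused on finding drifting samples (without explanation).
If, as in prior work, drifting samples are found and the analyst only assigns a simple label such as 0 or 1,
the analyst's guidance is not used as effectively as it could be.
The model does train on such a sample as a black box, but if the analyst also gave an explicit definition, a corresponding feature or a corresponding cluster would emerge, which should be much more efficient.
So the core of what this paper proposes seems to be:
having the analyst only label drifting samples is too inefficient.
Since the sample has been analyzed anyway, the analyst should also define the malware sample's behavior pattern.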

To combat concept drift, we propose Dream, a novel system that improves drift detection and establishes an explanatory adaptation process. Our core idea is to integrate classifier and expert knowledge within a unified model. To achieve this, we embed malware explanations (or concepts) within the latent space of a contrastive autoencoder, while constraining sample reconstruction based on classifier predictions. This approach enhances classifier retraining in two key ways: 1) capturing the target classifier’s characteristics to select more effective samples in drift detection and 2) enabling concept revisions that extend the classifier’s semantics to provide stronger guidance for adaptation. Additionally, Dream eliminates reliance on training data during real-time drift detection and provides a behavior-based drift explainer to support concept revision. Our evaluation shows that Dream effectively improves the drift detection accuracy and reduces the expert analysis effort in adaptation across different malware datasets and classifiers. Notably, when updating a widely-used Drebin classifier, Dream achieves the same accuracy with 76.6% fewer newly labeled samples compared to the best existing methods.
=>
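To address these limitations, Dream is proposed (covering both drift detection performance and an explanatory adaptation process).
It combines the classifier and expert knowledge in a single model so that classification improves.

To achieve this, it embeds malware explanations (in the CAE embedding space),
and constrains sample reconstruction (this probably just means a reconstruction loss is used).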

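Note that the explanation this paper keeps referring to differs from CADE's explanation.
CADE's explanation is post hoc (it explains why sample x was classified as family C),
whereas in this paper it is a priori (during training, an expert injects information about the malware sample's behavior into the embedding).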

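In particular, thanks to these explanations, the paper needed 76.6% fewer newly labeled samples than the best existing methods to reach the same accuracy. (The human-defined behavioral information appears to have helped a lot.)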

Revisiting Concept Drift in Windows Malware Detection: Adaptation to Real Drifted Malware with Minimal Samples, arXiv, NDSS 2025

In applying deep learning for malware classification, it is crucial to account for the prevalence of malware evolution, which can cause trained classifiers to fail on drifted malware. Existing solutions to address concept drift use active learning. They select new samples for analysts to label and then retrain the classifier with the new labels. Our key finding is that the current retraining techniques do not achieve optimal results. These techniques overlook that updating the model with scarce drifted samples requires learning features that remain consistent across pre-drift and post-drift data. The model should thus be able to disregard specific features that, while beneficial for the classification of pre-drift data, are absent in post-drift data, thereby preventing prediction degradation. In this paper, we propose a new technique for detecting and classifying drifted malware that learns drift-invariant features in malware control flow graphs by leveraging graph neural networks with adversarial domain adaptation. We compare it with existing model retraining methods in active learning-based malware detection systems and other domain adaptation techniques from the vision domain. Our approach significantly improves drifted malware detection on publicly available benchmarks and real-world malware databases reported daily by security companies in 2024. We also tested our approach in predicting multiple malware families drifted over time. A thorough evaluation shows that our approach outperforms the state-of-the-art approaches.
=>
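When existing active learning retrains the model with a small number of drifted samples,
it overlooks the need to learn features that stay consistent across pre-drift and post-drift data.
The model should therefore be able to disregard certain features (features that were useful for classifying pre-drift data but are absent in post-drift data).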

In other words: suppose feature D was a key feature of class A in 2020, but by 2023 class A has undergone concept drift and feature D no longer matters. The sample is then classified as class C or D instead of class A. To generalize, feature D is removed so that the sample can be classified as class A again.

This paper proposes a new technique for detecting and classifying drifted malware that learns drift-invariant features (in malware control flow graphs, using a GNN (graph neural network) together with adversarial domain adaptation).

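The domain adaptation mechanism here works as follows:
through a min-max game, the goal is to find invariant features that make it hard for the discriminator D to tell whether a sample is drifted, and those features are then used for domain adaptation.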
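One common way to write such a min-max game is the domain-adversarial (DANN-style) objective below. Whether the paper uses exactly this form is an assumption; the symbols F (feature extractor, here the GNN), C (malware classifier), D (domain discriminator), and the trade-off weight \lambda are introduced here only for illustration.

```latex
\min_{\theta_F,\,\theta_C}\;\max_{\theta_D}\;
\mathcal{L}_{\mathrm{cls}}(\theta_F,\theta_C)
\;-\;\lambda\,\mathcal{L}_{\mathrm{dom}}(\theta_F,\theta_D)
```

Here \mathcal{L}_{\mathrm{cls}} is the classification loss on labeled samples and \mathcal{L}_{\mathrm{dom}} is the discriminator's loss at telling pre-drift from post-drift samples. D tries to make \mathcal{L}_{\mathrm{dom}} small, while F is pushed to make it large, so the features F keeps are the ones that carry no information about whether a sample is drifted.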

Towards Explainable Drift Detection and Early Retrain in ML-Based Malware Detection Pipelines, Springer Lecture Notes in Computer Science (Springer DIMVA 2025)

The current largest challenge in ML-based malware detection is maintaining high detection rates while samples evolve. Although multiple works have proposed drift detectors and retraining-aware pipelines that work with reasonable efficiency, none of these detectors and pipelines are currently explainable, which limits our understanding of the threats’ evolution and the detector’s efficiency. Despite previous works that presented taxonomies of concept drift events, no practical solution for explainable drift detection in malware pipelines existed until this work. Our insight to change this scenario is to split the classifier knowledge into two: (1) the knowledge about the frontier between Malware (M) and Goodware (G); and (2) the knowledge about the concept of the (M and G) classes. Thus, we can understand whether the concept or the classification frontier changed by measuring the variations in these two domains. We make this approach practical by deploying a pipeline with meta-classifiers to measure these sub-classes of the main malware detector. We demonstrate via 5K+ experiment runs the viability of our solution by (1) illustrating how it explains every drift point of the DREBIN and AndroZoo datasets and (2) how an explainable drift detector makes online retraining to achieve higher rates and requires fewer retraining points.
=>
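Existing concept drift detection models had no explanation capability.
To address this, the paper introduces a new framework.
The framework
separates M (malware) and G (goodware), and on top of that
makes frontier drift and class drift explainable.
It is placed on top of an existing classifier.
CADE is in fact explainable too, but CADE only explains class drift,
while the approach proposed here can also explain frontier drift.
Still, the paper feels weak in that it seems to explain only the binary classes rather than each family.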

MAGIC: Detecting Advanced Persistent Threats via Masked Graph Representation Learning, USENIX Security Symposium

Advance Persistent Threats (APTs), adopted by most delicate attackers, are becoming increasing common and pose great threat to various enterprises and institutions. Data provenance analysis on provenance graphs has emerged as a common approach in APT detection. However, previous works have exhibited several shortcomings: (1) requiring attack-containing data and a priori knowledge of APTs, (2) failing in extracting the rich contextual information buried within provenance graphs and (3) becoming impracticable due to their prohibitive computation overhead and memory consumption.

In this paper, we introduce MAGIC, a novel and flexible self-supervised APT detection approach capable of performing multi-granularity detection under different level of supervision. MAGIC leverages masked graph representation learning to model benign system entities and behaviors, performing efficient deep feature extraction and structure abstraction on provenance graphs. By ferreting out anomalous system behaviors via outlier detection methods, MAGIC is able to perform both system entity level and batched log level APT detection. MAGIC is specially designed to handle concept drift with a model adaption mechanism and successfully applies to universal conditions and detection scenarios. We evaluate MAGIC on three widely-used datasets, including both real-world and simulated attacks. Evaluation results indicate that MAGIC achieves promising detection results in all scenarios and shows enormous advantage over state-of-the-art APT detection approaches in performance overhead.
=>
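Prior work does not sufficiently extract the contextual and structural information in the provenance graph (audit logs converted into a graph), the data used for APT detection, and is also inefficient in cost.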

To address these limitations, about three things are proposed.
(1) Masked Graph Representation Learning

CANDICE: An explainable and intelligent framework for network intrusion detection, Future Generation Computer Systems

In recent years, Deep Learning-based Network Intrusion Detection System (DL-NIDS) have demonstrated remarkable performance in detecting cyberattacks in network traffic. However, the lack of explainability for DL-NIDSs prevents end-users from trusting and understanding the detection results, thereby limiting their applications in practice. Although several approaches have been proposed to explain DL-NIDS, they run the risk of providing unfaithful explanations. In addition, existing methods merely output a set of important features as explanation, which is insufficient for end-users to thoroughly understand the attack. In this paper, we propose CANDICE, an explainable and intelligent framework for detecting and explaining intrusions in network traffic. Differing from existing works, CANDICE is highlighted by: (i) providing faithful explanation by disentangling the traffic representations and generating counterfactual explanations, and (ii) offering end-users a comprehensive view of the attack by generating an intrusion profile based on the explanation. We conduct experiments on four representative traffic datasets to evaluate the effectiveness of CANDICE. The results demonstrate that CANDICE surpasses existing methods in terms of explanation fidelity, sparsity, stability, and efficiency, while achieving high accuracy of above 96.10% in detecting intrusions.
=>
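There were prior papers on explanation, but many of them merely explained feature differences (CADE etc.; actually, probably not, since this paper is not about concept drift).
This paper:
(1) provides faithful explanations.
(2) gives the end-user an understandable view of the attack.
Caveat:
the explanations are not about concept drift. They simply explain attacks.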

Explainable Malware Analysis: Concepts, Approaches and Challenges

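What stood out to me from the paper above is the difference between CADE's and SHAP's explanations (from an XAI perspective).
SHAP explains why a sample was assigned to class b, usually via the softmax classifier's prediction score: which feature contributed most to the class-b decision.

CADE, by contrast, explains why an incoming sample that falls outside the known distribution (OOD) was flagged as a drifting sample. If changing feature a or b moves the sample closer to a specific class and back inside the distribution, then the difference in that feature is the explanation for the drift.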
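The CADE-style reasoning above can be illustrated with a much simplified sketch. It is not CADE's algorithm: it works directly in feature space with a single class centroid and substitutes one feature at a time, and `drift_explanation`, the centroid, the sample, and the threshold are hypothetical values.

```python
import math

def drift_explanation(sample, centroid, threshold):
    """Rank features by how much replacing each one with the centroid's
    value shrinks the distance to the centroid.

    Returns tuples (feature_index, distance_reduction, back_in_distribution).
    """
    base = math.dist(sample, centroid)
    report = []
    for i in range(len(sample)):
        patched = list(sample)
        patched[i] = centroid[i]
        new_dist = math.dist(patched, centroid)
        report.append((i, base - new_dist, new_dist <= threshold))
    # Largest distance reduction first.
    return sorted(report, key=lambda item: item[1], reverse=True)

centroid = [1.0, 1.0, 1.0]   # centre of the nearest known class
sample = [1.1, 5.0, 0.9]     # flagged as drifting: far from the centroid
threshold = 1.0              # samples farther than this count as OOD

ranking = drift_explanation(sample, centroid, threshold)
top_feature, reduction, back_inside = ranking[0]
print(top_feature, back_inside)  # 1 True
```

Feature 1 is the only one whose substitution brings the sample back inside the threshold, so it is reported as the reason for the drift. A SHAP-style explanation would instead attribute the classifier's score for the predicted class to individual features.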

