DeepSAD

TaeJong Kim · April 8, 2024

Anomaly Detection Semi-supervised learning

Paper Abstract Translations


Paper link: https://arxiv.org/abs/1906.02694

Original

Deep approaches to anomaly detection have recently shown promising results over shallow methods on large and complex datasets. Typically anomaly detection is treated as an unsupervised learning problem. In practice however, one may have---in addition to a large set of unlabeled samples---access to a small pool of labeled samples, e.g. a subset verified by some domain expert as being normal or anomalous. Semi-supervised approaches to anomaly detection aim to utilize such labeled samples, but most proposed methods are limited to merely including labeled normal samples. Only a few methods take advantage of labeled anomalies, with existing deep approaches being domain-specific. In this work we present Deep SAD, an end-to-end deep methodology for general semi-supervised anomaly detection. We further introduce an information-theoretic framework for deep anomaly detection based on the idea that the entropy of the latent distribution for normal data should be lower than the entropy of the anomalous distribution, which can serve as a theoretical interpretation for our method. In extensive experiments on MNIST, Fashion-MNIST, and CIFAR-10, along with other anomaly detection benchmark datasets, we demonstrate that our method is on par or outperforms shallow, hybrid, and deep competitors, yielding appreciable performance improvements even when provided with only little labeled data.

Translation

Deep approaches to anomaly detection have recently shown promising results over shallow methods on large and complex datasets.
-> Compared with shallow methods, deep learning approaches to anomaly detection have recently shown promising results on large, complex datasets.
Typically anomaly detection is treated as an unsupervised learning problem.
-> Anomaly detection is typically treated as an unsupervised learning problem.
In practice however, one may have---in addition to a large set of unlabeled samples---access to a small pool of labeled samples, e.g. a subset verified by some domain expert as being normal or anomalous.
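-> In practice, however, alongside a large set of unlabeled samples one may also have a small pool of labeled samples, for example a subset that a domain expert has verified as normal or anomalous.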
Semi-supervised approaches to anomaly detection aim to utilize such labeled samples, but most proposed methods are limited to merely including labeled normal samples.
-> Semi-supervised approaches to anomaly detection aim to make use of such labeled samples, but most proposed methods go no further than including labeled normal samples.
Only a few methods take advantage of labeled anomalies, with existing deep approaches being domain-specific.
-> Only a few methods make use of labeled anomalies, and the existing deep learning approaches among them are domain-specific.
In this work we present Deep SAD, an end-to-end deep methodology for general semi-supervised anomaly detection.
-> This work presents Deep SAD, an end-to-end deep learning method for general semi-supervised anomaly detection.
We further introduce an information-theoretic framework for deep anomaly detection based on the idea that the entropy of the latent distribution for normal data should be lower than the entropy of the anomalous distribution, which can serve as a theoretical interpretation for our method.
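-> The authors also introduce an information-theoretic framework for deep anomaly detection. It rests on the idea that the entropy of the latent distribution of normal data should be lower than that of anomalous data, and it can serve as a theoretical interpretation of their method.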
In extensive experiments on MNIST, Fashion-MNIST, and CIFAR-10, along with other anomaly detection benchmark datasets, we demonstrate that our method is on par or outperforms shallow, hybrid, and deep competitors, yielding appreciable performance improvements even when provided with only little labeled data.
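-> In extensive experiments on MNIST, Fashion-MNIST, and CIFAR-10, as well as other anomaly detection benchmark datasets, the authors show that their method matches or outperforms shallow, hybrid, and deep competitors, and that it yields appreciable performance gains even when only a small amount of labeled data is provided.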
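
The entropy argument in the abstract can be made concrete with a toy calculation. The sketch below is only an illustration, under an assumption that is not in the abstract: both latent distributions are modeled as isotropic Gaussians, whose differential entropy has the closed form (d/2) ln(2πe σ²). The function name and the σ values are made up for the example. Under that assumption, a tightly clustered "normal" latent distribution has lower entropy than a widely spread "anomalous" one.

```python
import math


def gaussian_entropy(sigma: float, dim: int) -> float:
    """Differential entropy (in nats) of an isotropic Gaussian N(mu, sigma^2 I).

    Assumption for illustration only: the latent distribution is Gaussian.
    """
    return 0.5 * dim * math.log(2 * math.pi * math.e * sigma ** 2)


# Hypothetical spreads: normal data concentrated, anomalies dispersed.
normal_entropy = gaussian_entropy(sigma=0.5, dim=32)
anomaly_entropy = gaussian_entropy(sigma=2.0, dim=32)

print(f"normal:  {normal_entropy:.2f} nats")
print(f"anomaly: {anomaly_entropy:.2f} nats")
print("normal has lower entropy:", normal_entropy < anomaly_entropy)
```

Because the entropy here grows with ln σ², shrinking the spread of normal samples in latent space lowers their entropy, which matches the intuition the abstract describes.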

Vocabulary

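promising : showing signs of future success
be on par : to be equal to, at the same level as
yielding : producing, resulting in (here the participle of the verb "yield", not the adjective meaning "flexible, pliable")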

Thoughts

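Starting today, I plan to translate one abstract a day, because I think I need to shore up my weak English for it to help my research. Deep SAD is a method I intend to try for an anomaly detection task at my current company. It is a well-known choice for cases where labeled data is scarce but not entirely absent. One advantage in particular seems to be that the encoder part can be swapped for, or combined with, a variety of models. I also plan to post my implementation of this model and an analysis of the full paper soon.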
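
As a rough sketch of how the labeled samples enter training, the function below follows my reading of the objective in the linked paper: unlabeled samples are pulled toward a center c in latent space, labeled normal samples (label +1) likewise, and labeled anomalies (label -1) are pushed away through the inverse of the squared distance. The encoder is passed in as a plain callable, which is where a different model could be swapped in. The names `deep_sad_loss`, `eta`, and `eps`, the toy identity encoder, and the data are all illustrative assumptions; weight decay and the actual network training are omitted.

```python
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(z: Vector, c: Vector) -> float:
    """Squared Euclidean distance between a latent vector and the center."""
    return sum((zi - ci) ** 2 for zi, ci in zip(z, c))


def deep_sad_loss(
    encoder: Callable[[Vector], Vector],
    center: Vector,
    unlabeled: Sequence[Vector],
    labeled: Sequence[Tuple[Vector, int]],
    eta: float = 1.0,
    eps: float = 1e-6,
) -> float:
    """Sketch of a Deep SAD style objective (weight decay omitted).

    labeled holds (sample, y) pairs with y = +1 for normal, y = -1 for anomaly.
    eps is an assumed stabilizer so the inverse distance cannot divide by zero.
    """
    n, m = len(unlabeled), len(labeled)
    # Unlabeled samples are treated as mostly normal: pull them to the center.
    total = sum(squared_distance(encoder(x), center) for x in unlabeled)
    for x, y in labeled:
        d = squared_distance(encoder(x), center)
        # y = +1 penalizes distance; y = -1 penalizes closeness (1 / distance).
        total += eta * (d + eps) ** y
    return total / (n + m)


# Toy usage with an identity "encoder" standing in for a real network.
identity = lambda x: x
center = (0.0, 0.0)
unlabeled = [(0.1, 0.0), (0.0, 0.2)]
labeled = [((0.1, 0.1), +1), ((3.0, 4.0), -1)]
print(deep_sad_loss(identity, center, unlabeled, labeled))
```

At test time the anomaly score of a sample would simply be `squared_distance(encoder(x), center)`: the farther from the center, the more anomalous.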

TaeJong Kim

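I'm TaeJong Kim, an AI engineer. I'm interested in recommender systems, anomaly detection, and LLMs. On this blog I post about technologies and papers I've studied, as well as personal experiences.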
