As the title suggests, this post summarizes a survey paper on techniques for detecting anomalies in networks.
I only wrote down the parts that seemed important or worth rereading later.
Identifying the relationship among the attacks and anomalies
DoS - collective anomalies
Probe - contextual anomalies
For network anomaly detection, a neural network has been merged with other techniques, such as a statistical approach and variants of it.
The outlier factor is defined using the trained replicator neural network (RNN) as follows.
x_ij: the input value
o_ij: the output value
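The definition can be sketched in a few lines of Python. The formula assumed here is the mean squared reconstruction error commonly used with replicator neural networks, OF_i = (1/n) * sum_j (x_ij - o_ij)^2, where n is the number of features. The function name `outlier_factor` and the sample vectors are illustrative and not taken from the paper.

```python
def outlier_factor(inputs, outputs):
    """Mean squared difference between a record and its reconstruction.

    inputs  -- the x_ij values of record i (one per feature j)
    outputs -- the o_ij values the trained network produced for that record
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    n = len(inputs)
    return sum((x - o) ** 2 for x, o in zip(inputs, outputs)) / n


# Hypothetical record with four features.
x_i = [0.5, 1.0, 0.0, 0.25]
o_good = [0.5, 1.0, 0.0, 0.25]  # reconstructed perfectly
o_poor = [0.0, 0.0, 1.0, 0.25]  # reconstructed badly

score_good = outlier_factor(x_i, o_good)
score_poor = outlier_factor(x_i, o_poor)
```

A record the network reconstructs well gets a score near zero, while a poorly reconstructed record gets a larger score and is a stronger outlier candidate.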
Self-organizing Maps (SOM) are used for network anomaly detection.
Ramadas et al. (2003) suggested that network traffic can be classified in real time using SOM.
chi-square theory
X_i: the observed value of the ith variable
E_i: the expected value of the ith variable
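As a rough sketch, the two symbols above fit the usual chi-square distance, X^2 = sum_i (X_i - E_i)^2 / E_i, which is the form assumed here. The function name `chi_square_distance` and the traffic counts are made up for illustration.

```python
def chi_square_distance(observed, expected):
    """Sum of (X_i - E_i)^2 / E_i over all variables.

    observed -- the X_i values
    expected -- the E_i values (must be positive)
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected values must be positive")
    return sum((x - e) ** 2 / e for x, e in zip(observed, expected))


# Hypothetical per-variable counts: a normal profile and one observation.
expected_profile = [10, 30, 10]
observation = [12, 30, 8]

distance = chi_square_distance(observation, expected_profile)
```

A large distance means the observation deviates strongly from the expected profile; where to place the threshold is a separate design choice.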
Shyu et al. (2003) presented an easier way to analyze high-dimensional network traffic datasets using PCA. Principal components are linear combinations of p random variables (A1, A2, ..., Ap) and can be characterized:
A brief mathematical formulation of PCA:
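One standard way to write this down, which may differ in notation from the paper, is given here. It assumes A = (A_1, ..., A_p)^T has covariance matrix \Sigma with orthonormal eigenvectors e_i; the symbols Y_i, \lambda_i and e_i are introduced for this sketch.

```latex
% Assumed notation: A = (A_1, ..., A_p)^T, covariance matrix \Sigma,
% eigenvalue-eigenvector pairs (\lambda_i, e_i) with orthonormal e_i.
\begin{align*}
\Sigma e_i &= \lambda_i e_i, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0, \\
Y_i &= e_i^{\top} A = e_{i1} A_1 + e_{i2} A_2 + \dots + e_{ip} A_p, \qquad i = 1, \dots, p, \\
\operatorname{Var}(Y_i) &= e_i^{\top} \Sigma e_i = \lambda_i, \\
\operatorname{Cov}(Y_i, Y_k) &= e_i^{\top} \Sigma e_k = 0 \quad (i \ne k), \\
\sum_{i=1}^{p} \operatorname{Var}(Y_i) &= \sum_{i=1}^{p} \lambda_i = \sum_{j=1}^{p} \operatorname{Var}(A_j).
\end{align*}
```

In words: the components Y_i are uncorrelated, their variances are sorted from largest to smallest, and together they carry the same total variance as the original variables.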
An anomaly detection technique based on PCA (Shyu et al., 2003) has the benefits of:
Information-theoretic measures can be used to create an appropriate anomaly detection model.
Definitions of several measures:
Entropy is a basic concept of information theory which measures the uncertainty of a collection of data items.
What does it mean to measure the uncertainty of data...?
where P(x) is the probability of x in D
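A small example helps make "uncertainty" concrete. The sketch below assumes the usual Shannon form H(D) = sum_x P(x) * log2(1 / P(x)), with base-2 logarithms as an assumption, and estimates P(x) from item frequencies in D. The function name `entropy` and the protocol labels are illustrative.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a collection of hashable items."""
    total = len(items)
    counts = Counter(items)
    # P(x) = count / total, so log2(1 / P(x)) = log2(total / count).
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical protocol fields taken from three traffic samples.
all_same = ["tcp"] * 8                         # nothing to guess
two_way = ["tcp", "udp"] * 4                   # a fair coin flip
four_way = ["tcp", "udp", "icmp", "arp"] * 2   # four equally likely values

h_same = entropy(all_same)
h_two = entropy(two_way)
h_four = entropy(four_way)
```

When every item is identical there is nothing left to guess and the entropy is 0 bits. The more evenly the items spread over distinct values, the harder it is to predict a randomly drawn item and the higher the entropy.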
Conditional entropy is the entropy of D given Y, that is, the entropy of the conditional probability distribution P(x|y).
Information gain measures how much knowing an attribute or feature A reduces the entropy of a dataset D.
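Conditional entropy and information gain can be sketched together. The forms assumed here are the common ones: H(D|A) = sum_a P(a) * H(D | A = a) and Gain(D, A) = H(D) - H(D|A), with probabilities estimated from frequencies and base-2 logarithms. All function names and the toy records are illustrative.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of a collection of hashable items."""
    total = len(items)
    return sum((c / total) * math.log2(total / c)
               for c in Counter(items).values())


def conditional_entropy(labels, attribute):
    """H(labels | attribute): entropy left after splitting on the attribute."""
    groups = defaultdict(list)
    for label, value in zip(labels, attribute):
        groups[value].append(label)
    total = len(labels)
    # Weight each subset's entropy by the fraction of records it holds.
    return sum(len(g) / total * entropy(g) for g in groups.values())


def information_gain(labels, attribute):
    """Reduction in label entropy obtained by knowing the attribute."""
    return entropy(labels) - conditional_entropy(labels, attribute)


# Hypothetical records: class labels plus two candidate features.
labels = ["normal", "normal", "attack", "attack"]
protocol = ["tcp", "tcp", "icmp", "icmp"]  # separates the classes perfectly
flag = ["S0", "SF", "S0", "SF"]            # tells us nothing about the class

gain_protocol = information_gain(labels, protocol)
gain_flag = information_gain(labels, flag)
```

In this toy data, `protocol` removes all uncertainty about the label, so its gain equals the full label entropy, while `flag` leaves the uncertainty unchanged and has zero gain.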
Unlike other clustering algorithms, co-clustering defines a clustering criterion and then optimizes it.
It simultaneously finds the subsets of rows and columns of a data matrix using a specified criterion.
The benefits of co-clustering over regular clustering are the following: