https://arxiv.org/pdf/1708.03665

Google uses continuous streams of data from
industry partners in order to deliver accurate results to users.
Unexpected drops in traffic can be an indication of an underlying
issue and may be an early warning that remedial action may be
necessary. Detecting such drops is non-trivial because streams
are variable and noisy, with roughly regular spikes (in many
different shapes) in traffic data. We investigated the question of
whether or not we can predict anomalies in these data streams.
Our goal is to utilize Machine Learning and statistical
approaches to classify anomalous drops in periodic, but noisy,
traffic patterns. Since we do not have a large body of labeled
examples to directly apply supervised learning for anomaly
classification, we approached the problem in two parts. First we
used TensorFlow to train our various models including DNNs,
RNNs, and LSTMs to perform regression and predict the
expected value in the time series. Secondly we created anomaly
detection rules that compared the actual values to predicted
values. Since the problem requires finding sustained anomalies,
rather than just short delays or momentary inactivity in the data,
our two detection methods focused on continuous sections of
activity rather than just single points. We tried multiple
combinations of our models and rules and found that using the
intersection of our two anomaly detection methods proved to be
an effective method of detecting anomalies on almost all of our
models. In the process we also found that not all data fell within
our experimental assumptions, as one data stream had no
periodicity, and therefore no time based model could predict it.Google uses continuous streams of data from
industry partners in order to deliver accurate results to users.
Unexpected drops in traffic can be an indication of an underlying
issue and may be an early warning that remedial action may be
necessary. Detecting such drops is non-trivial because streams
are variable and noisy, with roughly regular spikes (in many
different shapes) in traffic data. We investigated the question of
whether or not we can predict anomalies in these data streams.
Our goal is to utilize Machine Learning and statistical
approaches to classify anomalous drops in periodic, but noisy,
traffic patterns. Since we do not have a large body of labeled
examples to directly apply supervised learning for anomaly
classification, we approached the problem in two parts. First we
used TensorFlow to train our various models including DNNs,
RNNs, and LSTMs to perform regression and predict the
expected value in the time series. Secondly we created anomaly
detection rules that compared the actual values to predicted
values. Since the problem requires finding sustained anomalies,
rather than just short delays or momentary inactivity in the data,
our two detection methods focused on continuous sections of
activity rather than just single points. We tried multiple
combinations of our models and rules and found that using the
intersection of our two anomaly detection methods proved to be
an effective method of detecting anomalies on almost all of our
models. In the process we also found that not all data fell within
our experimental assumptions, as one data stream had no
periodicity, and therefore no time based model could predict it.
Google uses continuous streams of data from industry partners in order to deliver accurate results to users.
-> 구글은 연속적으로 계속나오는 데이터를 사용한다. 실제 산업 파트너에서, 유저에게 정확한 결과를 전달하기 위해서
Unexpected drops in traffic can be an indication of an underlying issue and may be an early warning that remedial action may be necessary.
-> 트레픽에서 예상치 못한 드랍은 주요한 이슈의 지표가 될 수 있다. 그리고 초기 경보일 수 있다. 행동 수정이 필요할 수 있다는
Detecting such drops is non-trivial because streams are variable and noisy, with roughly regular spikes (in many different shapes) in traffic data.
-> 그러한 드랍을 찾는 것은 중요하다. 서비스중에 다양한고 노이지하기 때문이다. 거의 정기적으로 스파이크 친다. 트래픽 데이터에서
We investigated the question of whether or not we can predict anomalies in these data streams.
-> 우리는 질문을 조사한다. 우리가 데이터 스트림에서 이상치를 예측할 수 있을지 없을지에 대한
Our goal is to utilize Machine Learning and statistical approaches to classify anomalous drops in periodic, but noisy, traffic patterns.
-> 우리 목표는 머신러닝과 통계적 접근을 활용하는 것이다. 특정 기간에서(하지만 노이지하고 데이터가 몰려오는 상황에서) 이상한 드랍을 분류하기 위해,
First we used TensorFlow to train our various models including DNNs, RNNs, and LSTMs to perform regression and predict the expected value in the time series.
-> 먼저 우리는 우리의 다양한 모델인 DNNs, RNNs 그리고 LSTMs를 학습하기 위해 tensorflow를 사용했다. 시계열에서 회귀와 예측을 기대값으로 뽑아내기 위해
Secondly we created anomaly detection rules that compared the actual values to predicted values.
-> 두 번째고 우리는 이상치 탐지 규칙을 만들었다. 그 규칙은 예측값과 실제값을 비교한다.