According to the Prometheus documentation, a Counter is a cumulative metric that represents a single, monotonically increasing value.
Because it is monotonically increasing, its value can only go up, usually in steps of one (or be reset to zero when the process restarts).
Examples:
The total number of HTTP requests
The total number of log messages
The total number of job executions
A metric that can go both up and down is a gauge, not a counter.
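To make the counter/gauge distinction concrete, here is a minimal sketch in plain Python. The `Counter` and `Gauge` classes below are hypothetical toy classes written for these notes, not the API of any Prometheus client library.

```python
class Counter:
    """Toy counter (hypothetical): the value may only grow."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        # Rejecting negative increments is what keeps the value monotonic.
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """Toy gauge (hypothetical): the value may move in either direction."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


# Three handled requests: the counter only ever goes up.
requests_total = Counter()
for _ in range(3):
    requests_total.inc()

# Two requests start and one finishes: the gauge goes up and back down.
in_flight = Gauge()
in_flight.inc(2)
in_flight.dec()
```

The total number of requests fits a counter; the number of requests currently in flight fits a gauge.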
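rate() returns the per-second average increase of a counter: if an endpoint was called 60 times in 1 minute, the result is roughly 1 TPS.
It is best suited for defining alerts and for graphing slowly changing counters.
Example: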
rate(http_requests_total{job="api-server"}[5m])
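The arithmetic behind a per-second average rate can be sketched in Python as follows. This is a simplified illustration, not Prometheus's actual implementation: it handles counter resets but skips the extrapolation to the window boundaries that the real rate() performs. The function name `simple_rate` and the sample layout are assumptions made for this sketch.

```python
def simple_rate(samples):
    """Average per-second increase over a window (simplified sketch).

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Returns None when fewer than two samples are available.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # The counter dropped, so assume a reset to zero:
            # everything counted since the reset is new increase.
            increase += value
        else:
            increase += value - prev
        prev = value
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    return increase / elapsed


# 60 calls spread evenly over 60 seconds, scraped every 15 seconds.
steady = [(0, 0), (15, 15), (30, 30), (45, 45), (60, 60)]
steady_rate = simple_rate(steady)

# Same idea with a counter reset between t=15 and t=30.
with_reset = [(0, 100), (15, 115), (30, 5), (45, 20)]
reset_rate = simple_rate(with_reset)
```

With the steady series, the result is 1.0 per second, matching the "60 calls in 1 minute is about 1 TPS" rule of thumb.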
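The example expression below looks back up to 5 minutes for each time series in the range vector, finds the two most recent data points, and returns the per-second rate of HTTP requests.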
irate(http_requests_total{job="api-server"}[5m])
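The "two most recent data points" idea can be sketched in Python as well. As before, this is a simplified illustration under assumed names (`simple_irate`, a list of timestamp/value pairs), not Prometheus's code.

```python
def simple_irate(samples):
    """Per-second rate from the last two samples only (simplified sketch).

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Returns None when fewer than two samples are available.
    """
    if len(samples) < 2:
        return None
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    elapsed = t_last - t_prev
    if elapsed <= 0:
        return None
    # If the counter dropped, assume a reset and count from zero.
    increase = v_last if v_last < v_prev else v_last - v_prev
    return increase / elapsed


# Steady traffic, then a burst in the final 15 seconds.
bursty = [(0, 0), (15, 15), (30, 30), (45, 90)]
instant = simple_irate(bursty)                            # last two points only
window_avg = (bursty[-1][1] - bursty[0][1]) / (45 - 0)    # whole-window average
```

Here the last two points give 4.0 per second while the whole-window average is 2.0, which shows why a rate based on only two samples reacts faster to spikes but is also noisier.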
https://godekdls.github.io/Prometheus/querying.functions/#irate
https://blog.voidmainvoid.net/449
https://levelup.gitconnected.com/prometheus-counter-metrics-d6c393d86076
https://www.innoq.com/en/blog/prometheus-counters/
One big advantage of histograms is that they can be aggregated. The following query returns the 99th percentile of response time across all APIs and instances:
histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
The histogram_quantile() function estimates the latency at a given percentile from the cumulative bucket counts of a histogram.
To compute the 99th percentile (0.99 quantile) of response time for the add_product API running on host1.domain.com, you would use the following query:
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com"}[5m]))
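How a quantile can be estimated from cumulative buckets is sketched below in Python. It assumes linear interpolation inside the bucket that contains the target rank, and a lower bound of 0 for the first bucket. The function name `estimate_quantile` and the bucket values are made up for illustration; this is a sketch of the idea, not the Prometheus source.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets (sketch).

    buckets: list of (upper_bound, cumulative_count) pairs that includes
    a final (math.inf, total_count) bucket, like the `le` label series.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least one finite bucket plus +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # The rank falls in the +Inf bucket: the best available answer
        # is the highest finite upper bound.
        return buckets[-2][0]
    # Assumption: the first bucket starts at 0 (latencies are non-negative).
    lower, prev_count = (0.0, 0.0) if i == 0 else buckets[i - 1]
    # Linear interpolation inside the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


# Hypothetical cumulative bucket counts for request durations in seconds.
example = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
p95 = estimate_quantile(0.95, example)
p99 = estimate_quantile(0.99, example)
```

Because the result is interpolated within a bucket, its accuracy depends on how the bucket boundaries are chosen; in this example the 99th percentile lands in the +Inf bucket, so the estimate is capped at the highest finite bound of 1.0 seconds.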
https://engineering.statefarm.com/blog/observing-latency-tail/
https://promlabs.com/blog/2020/09/25/metric-types-in-prometheus-and-promql
https://www.timescale.com/blog/four-types-prometheus-metrics-to-collect/