[TIL] HTTP : The Definitive Guide "p187 ~ p191"

시윤 · January 14, 2025

TIL http

[TIL] Two Pages Per Day

Chapter 7. Caching

(If any part of my translation or understanding is wrong, please feel free to let me know in the comments.)

✏️ Translation of the Original Text

Detailed Algorithms

The HTTP specification provides a detailed, but slightly obscure and often confusing, algorithm for computing document aging and cache freshness. In this section, we’ll discuss the HTTP freshness computation algorithms in detail (the “Fresh enough?” diamond in Figure 7-12) and explain the motivation behind them.

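The HTTP specification provides an algorithm for computing a document's age and cache freshness. The algorithm is detailed, yet somewhat obscure and confusing in places.
In this section, we look closely at the HTTP freshness computation algorithm and explain the motivation behind it.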

This section will be most useful to readers working with cache internals. To help illustrate the wording in the HTTP specification, we will make use of Perl pseudocode. If you aren’t interested in the gory details of cache expiration formulas, feel free to skip this section.

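This section will be especially useful to readers who work with cache internals.
From here on, Perl pseudocode is used to express the wording of the HTTP specification.
If you are not interested in the detailed explanation of cache expiration formulas, you can skip this section.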

Age and Freshness Lifetime

To tell whether a cached document is fresh enough to serve, a cache needs to compute only two values: the cached copy’s age and the cached copy’s freshness lifetime. If the age of a cached copy is less than the freshness lifetime, the copy is fresh enough to serve. In Perl:
$is_fresh_enough = ($age < $freshness_lifetime);

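To decide whether a cached document is fresh enough to serve, a cache needs to compute only two values.
One is the age of the cached copy, and the other is the freshness lifetime of the cached copy.
If the age of the cached copy is less than its freshness lifetime, the copy is considered fresh enough to serve.
In Perl pseudocode, this can be expressed as follows.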
```
	$is_fresh_enough = ($age < $freshness_lifetime);
```

The age of the document is the total time the document has “aged” since it was sent from the server (or was last revalidated by the server). Because a cache might not know if a document response is coming from an upstream cache or a server, it can’t assume that the document is brand new. It must determine the document’s age, either from an explicit Age header (preferred) or by processing the server-generated Date header.

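The age of a document is the total time that has passed since it was sent from the server (or since it was last revalidated by the server).
Because a cache cannot know whether a response came from an upstream cache or from the server, it cannot assume that the document is brand new.
The cache must determine the document's age either from an explicit Age header (the preferred way) or by processing the server-generated Date header.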

The freshness lifetime of a document tells how old a cached copy can get before it is no longer fresh enough to serve to clients. The freshness lifetime takes into account the expiration date of the document and any freshness overrides the client might request.

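The freshness lifetime of a document is how old a cached copy can get while still being fresh enough to serve to clients.
The freshness lifetime takes into account the document's expiration date and any freshness overrides the client may request.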

Some clients may be willing to accept slightly stale documents (using the Cache-Control: max-stale header). Other clients may not accept documents that will become stale in the near future (using the Cache-Control: min-fresh header). The cache combines the server expiration information with the client freshness requirements to determine the maximum freshness lifetime.

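Some clients may accept slightly stale documents by using the Cache-Control: max-stale header.
Other clients may refuse documents that will become stale in the near future by using the Cache-Control: min-fresh header.
The cache combines the server's expiration information with the client's freshness requirements to determine the maximum freshness lifetime.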
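
One way to picture how the client overrides enter the freshness check is the Python sketch below (Python is used here instead of the book's Perl pseudocode). The function name `is_fresh_enough` and its parameters are hypothetical. How max-stale and min-fresh are folded in is an assumption based on what the directives mean (max-stale tolerates extra staleness, min-fresh demands remaining freshness); it is not a formula quoted from this passage, and a bare max-stale with no value is not handled.

```python
from typing import Optional


def is_fresh_enough(age: int, freshness_lifetime: int,
                    max_stale: Optional[int] = None,
                    min_fresh: Optional[int] = None) -> bool:
    """Return True if a cached copy may be served without revalidation.

    All values are in seconds. max_stale and min_fresh stand for the
    client's Cache-Control request directives (None means "not sent").
    """
    effective_lifetime = freshness_lifetime
    if max_stale is not None:
        # Assumption: the client tolerates copies up to max_stale seconds past expiry.
        effective_lifetime += max_stale
    if min_fresh is not None:
        # Assumption: the client wants at least min_fresh seconds of freshness left.
        effective_lifetime -= min_fresh
    return age < effective_lifetime
```

With a 300-second lifetime, a copy aged 320 seconds fails the plain check but passes for a client that sent max-stale=60, while a copy aged 280 seconds passes the plain check but fails for a client that sent min-fresh=60.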

Age Computation

The age of the response is the total time since the response was issued from the server (or revalidated from the server). The age includes the time the response has floated around in the routers and gateways of the Internet, the time stored in intermediate caches, and the time the response has been resident in your cache. Example 7-1 provides pseudocode for the age calculation.

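The age of a response is the total time that has passed since the response was issued by the server (or revalidated by the server).
The age includes the time the response spent drifting through the routers and gateways of the Internet, the time it was stored in intermediate caches, and the time it has been resident in your cache.
Example 7-1 shows pseudocode for the age computation.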

The particulars of HTTP age calculation are a bit tricky, but the basic concept is simple. Caches can tell how old the response was when it arrived at the cache by examining the Date or Age headers. Caches also can note how long the document has been sitting in the local cache. Summed together, these values are the entire age of the response. HTTP throws in some magic to attempt to compensate for clock skew and network delays, but the basic computation is simple enough:
$age = $age_when_document_arrived_at_our_cache +
$how_long_copy_has_been_in_our_cache;

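The details of the HTTP age computation are a little tricky, but the concept itself is simple.
By examining the Date or Age headers, a cache can work out how old the response was when it arrived at the cache.
It can also tell how long the document has been sitting in the local cache.
Adding these values together gives the total age of the response.
HTTP applies some magic to compensate for clock skew and network delays, but the basic computation is simple enough.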
```
	$age = $age_when_document_arrived_at_our_cache +
	$how_long_copy_has_been_in_our_cache;
```

A cache can pretty easily determine how long a cached copy has been cached locally (a matter of simple bookkeeping), but it is harder to determine the age of a response when it arrives at the cache, because not all servers have synchronized clocks and because we don’t know where the response has been. The complete age-calculation algorithm tries to remedy this.

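A cache can determine quite easily how long a copy has been stored locally (it is a matter of simple bookkeeping).
Determining the age of a response at the moment it arrives at the cache is harder, because not all servers have synchronized clocks and because we do not know where the response has been.
The complete age computation algorithm tries to remedy this problem.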

[1] Apparent Age is Based on the Date Header

If all computers shared the same, exactly correct clock, the age of a cached document would simply be the “apparent age” of the document—the current time minus the time when the server sent the document. The server send time is simply the value of the Date header. The simplest initial age calculation would just use the apparent age:
$apparent_age = $time_got_response - $Date_header_value;
$age_when_document_arrived_at_our_cache = $apparent_age;

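If all computers shared exactly the same, correct clock, the age of a cached document would simply be its "apparent age" (the current time minus the time the server sent the document).
The server's send time is simply the value of the Date header.

So the simplest initial age computation just uses the apparent age.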

	$apparent_age = $time_got_response - $Date_header_value;
	$age_when_document_arrived_at_our_cache = $apparent_age;

Unfortunately, not all clocks are well synchronized. The client and server clocks may differ by many minutes, or even by hours or days when clocks are set improperly.

Unfortunately, not all clocks are well synchronized.
When clocks are set improperly, the client and server clocks may differ by many minutes, or even by hours or days.

Web applications, especially caching proxies, have to be prepared to interact with servers with wildly differing clock values. The problem is called clock skew—the difference between two computers’ clock settings. Because of clock skew, the apparent age sometimes is inaccurate and occasionally is negative.

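Web applications, caching proxies in particular, must be prepared to interact with servers whose clock values differ wildly.
This problem, the difference between two computers' clock settings, is called clock skew.
Because of clock skew, the apparent age can be inaccurate and is occasionally negative.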

If the age is ever negative, we just set it to zero. We also could sanity check that the apparent age isn’t ridiculously large, but large apparent ages might actually be correct. We might be talking to a parent cache that has cached the document for a long time (the cache also stores the original Date header):
$apparent_age = max(0, $time_got_response - $Date_header_value);
$age_when_document_arrived_at_our_cache = $apparent_age;

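If the age ever comes out negative, we simply set it to zero.
We could also sanity-check that the apparent age is not absurdly large, but a large apparent age may actually be correct.
The cache may be talking to a parent cache that has stored the document for a long time (the parent cache also stores the original Date header).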

Be aware that the Date header describes the original origin server date. Proxies and caches must not change this date!

Note that the Date header describes the date of the origin server.
Proxies and caches must not change this date.
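
To make the apparent-age step concrete, here is a minimal Python sketch (Python instead of the book's Perl pseudocode). It assumes the Date header arrives as an HTTP date string and parses it with the standard library's `email.utils.parsedate_to_datetime`. The function name `apparent_age` and the sample timestamps are made up for illustration.

```python
from email.utils import parsedate_to_datetime


def apparent_age(time_got_response: float, date_header_value: str) -> float:
    """Seconds between the server's Date header and the local arrival time.

    time_got_response is the local clock reading (epoch seconds) when the
    response arrived. The result is clamped at zero, because clock skew can
    make the server's Date look like it lies in the future.
    """
    server_sent = parsedate_to_datetime(date_header_value).timestamp()
    return max(0.0, time_got_response - server_sent)


# Hypothetical values: the Date header below corresponds to epoch 1736848800.
date_header = "Tue, 14 Jan 2025 10:00:00 GMT"
arrived_later = apparent_age(1736848830.0, date_header)    # local clock 30 s after Date
arrived_earlier = apparent_age(1736848700.0, date_header)  # local clock behind the server
```

In the first call the apparent age is 30 seconds. In the second call the raw difference is negative, so the clamp turns it into 0.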

[2] Hop-by-Hop Age Calculations

So, we can eliminate negative ages caused by clock skew, but we can’t do much about overall loss of accuracy due to clock skew. HTTP/1.1 attempts to work around the lack of universal synchronized clocks by asking each device to accumulate relative aging into an Age header, as a document passes through proxies and caches. This way, no cross-server, end-to-end clock comparisons are needed.

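In [1], we eliminated the negative ages caused by clock skew.
However, we could do little about the overall loss of accuracy that clock skew causes.
HTTP/1.1 tries to work around the lack of universally synchronized clocks by asking each device to accumulate relative aging into an Age header as a document passes through proxies and caches.
As a result, no cross-server, end-to-end clock comparisons are needed.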

The Age header value increases as the document passes through proxies. HTTP/1.1-aware applications should augment the Age header value by the time the document sat in each application and in network transit. Each intermediate application can easily compute the document’s resident time by using its local clock.

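The value of the Age header grows as the document passes through proxies.
HTTP/1.1-aware applications should increase the Age header value by the time the document spent in each application and in network transit.
Each intermediate application can easily compute the document's resident time using its local clock.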

However, any non-HTTP/1.1 device in the response chain will not recognize the Age header and will either proxy the header unchanged or remove it. So, until HTTP/1.1 is universally adopted, the Age header will be an underestimate of the relative age.

However, a non-HTTP/1.1 device in the response chain does not recognize the Age header and will either forward it unchanged or remove it.
So until HTTP/1.1 is universally adopted, the Age header will underestimate the relative age.

The relative age values are used in addition to the Date-based age calculation, and the most conservative of the two age estimates is chosen, because either the cross-server Date value or the Age-computed value may be an underestimate (the most conservative is the oldest age). This way, HTTP tolerates errors in Age headers as well, while erring on the side of fresher content:
$apparent_age = max(0, $time_got_response - $Date_header_value);
$corrected_apparent_age = max($apparent_age, $Age_header_value);
$age_when_document_arrived_at_our_cache = $corrected_apparent_age;

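The relative age value is used in addition to the Date-based age computation.
Because either the cross-server Date value or the Age-computed value may be an underestimate, the most conservative of the two age estimates (the oldest age) is chosen.

In this way, HTTP tolerates errors in Age headers as well, while erring on the side of fresher content.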

	$apparent_age = max(0, $time_got_response - $Date_header_value);
	$corrected_apparent_age = max($apparent_age, $Age_header_value);
	$age_when_document_arrived_at_our_cache = $corrected_apparent_age;
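
The "take the older of the two estimates" rule can be tried out with the Python sketch below (Python instead of the book's Perl pseudocode). The helper `parse_age_header` and the decision to treat a missing or malformed Age header as 0 are assumptions made for this example. Times are plain epoch seconds.

```python
from typing import Optional


def parse_age_header(raw: Optional[str]) -> int:
    """Return the Age header as whole seconds; 0 if absent or malformed (assumption)."""
    if raw is None:
        return 0
    try:
        seconds = int(raw)
    except ValueError:
        return 0
    return max(0, seconds)


def corrected_apparent_age(time_got_response: int,
                           date_header_value: int,
                           age_header: Optional[str]) -> int:
    """Pick the more conservative (older) of the Date-based and Age-based estimates."""
    apparent_age = max(0, time_got_response - date_header_value)
    return max(apparent_age, parse_age_header(age_header))
```

Two cases show why both estimates are kept. If the server clock runs 200 seconds ahead, the Date-based estimate collapses to 0, but an `Age: 120` header from a parent cache still yields 120. If a non-HTTP/1.1 proxy stripped the Age header, the Date-based estimate (say 90 seconds) is used instead.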

[3] Compensating for Network Delays

Transactions can be slow. This is the major motivation for caching. But for very slow networks, or overloaded servers, the relative age calculation may significantly under-estimate the age of documents if the documents spend a long time stuck in network or server traffic jams.

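Transactions can be slow. This is the major motivation for caching in the first place.
But when a very slow network or an overloaded server leaves a document stuck in network or server congestion for a long time, the relative age computation can significantly underestimate the document's age.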

The Date header indicates when the document left the origin server, but it doesn’t say how long the document spent in transit on the way to the cache. If the document came through a long chain of proxies and parent caches, the network delay might be significant.

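The Date header only indicates when the document left the origin server; it does not say how long the document spent in transit on its way to the cache.
If the document travels through a long chain of proxies and parent caches, the network delay may be significant.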

There is no easy way to measure one-way network delay from server to cache, but it is easier to measure the round-trip delay. A cache knows when it requested the document and when it arrived. HTTP/1.1 conservatively corrects for these network delays by adding the entire round-trip delay. This cache-to-server-to-cache delay is an over-estimate of the server-to-cache delay, but it is conservative. If it is in error, it will only make the documents appear older than they really are and cause unnecessary revalidations. Here’s how the calculation is made:
$apparent_age = max(0, $time_got_response - $Date_header_value);
$corrected_apparent_age = max($apparent_age, $Age_header_value);
$response_delay_estimate = ($time_got_response - $time_issued_request);
$age_when_document_arrived_at_our_cache = $corrected_apparent_age + $response_delay_estimate;

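There is no easy way to measure the one-way network delay from server to cache.
Measuring the round-trip delay, on the other hand, is easier.
A cache knows when it requested the document and when the document arrived.
HTTP/1.1 conservatively corrects for network delay by adding the entire round-trip delay.
This cache-to-server-to-cache delay overestimates the server-to-cache delay (it is larger than the delay we actually wanted to measure), but it is conservative.
If it is in error, it only makes documents appear older than they really are and causes unnecessary revalidations.

The computation is as follows.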

	$apparent_age = max(0, $time_got_response - $Date_header_value);
	$corrected_apparent_age = max($apparent_age, $Age_header_value);
	$response_delay_estimate = ($time_got_response - $time_issued_request);
	$age_when_document_arrived_at_our_cache = $corrected_apparent_age + $response_delay_estimate;
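
A small worked example may help show why adding the whole round trip is an overestimate. The Python sketch below mirrors the four pseudocode lines above (Python instead of Perl). The function name `age_on_arrival` and the sample timestamps are hypothetical, and all times are epoch seconds.

```python
def age_on_arrival(time_issued_request: int,
                   time_got_response: int,
                   date_header_value: int,
                   age_header_value: int) -> int:
    """Age of a response at the moment it reaches our cache, in seconds."""
    apparent_age = max(0, time_got_response - date_header_value)
    corrected_apparent_age = max(apparent_age, age_header_value)
    # The full round trip stands in for the unknown one-way delay.
    response_delay_estimate = time_got_response - time_issued_request
    return corrected_apparent_age + response_delay_estimate


# Hypothetical slow transaction: request sent at t=1000, origin stamps
# Date at t=1003, response arrives at t=1008, no Age header (treated as 0).
slow_direct = age_on_arrival(1000, 1008, 1003, 0)
# Same timing, but a parent cache reported Age: 20.
slow_via_parent = age_on_arrival(1000, 1008, 1003, 20)
```

If the clocks agree, the response in the first case is really 5 seconds old on arrival, yet the computation reports 5 + 8 = 13. The error only makes the document look older, which costs an unnecessary revalidation at worst. The second case reports 20 + 8 = 28.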

✏️ Summary

HTTP Freshness Calculation

isFresh = Age < FreshnessLifetime

isFresh = whether the cached document is fresh
Age = total time that has passed since the response for the document was sent from the server
FreshnessLifetime = how long the cached copy is considered fresh

Age Computation

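apparentAge = max(0, timeGotResponse - dateHeaderValue)
correctedApparentAge = max(apparentAge, ageHeaderValue)
responseDelayEstimate = timeGotResponse - timeIssuedRequest
ageOnArrival = correctedApparentAge + responseDelayEstimate
residentTime = now - timeGotResponse

Age = ageOnArrival + residentTime

apparentAge : the time the response arrived minus the time the server sent it
(floored at 0 to prevent negative values caused by clock skew)
correctedApparentAge : the more conservative (older) of apparentAge and the Age header value accumulated hop by hop
(hop-by-hop accumulation needs no cross-server clock comparison, which sidesteps clock skew)
responseDelayEstimate : the time the response arrived minus the time the request was issued
ageOnArrival : correctedApparentAge plus responseDelayEstimate
(the apparent age ignores network delay, so responseDelayEstimate is added)
residentTime : the current time minus the time the response arrived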

** Conservative values are used when computing Age because it is better to overestimate the age and judge a fresh document stale than to underestimate the age and serve a stale document.
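
Putting the whole summary together, a cache entry could carry the four recorded values and derive everything else on demand. The Python sketch below is one possible arrangement, not the book's code: the class name `CachedResponse`, its field names, and the idea of storing the freshness lifetime on the entry are all assumptions made for illustration. All times are epoch seconds.

```python
from dataclasses import dataclass


@dataclass
class CachedResponse:
    time_issued_request: float  # local clock when the request left the cache
    time_got_response: float    # local clock when the response arrived
    date_header_value: float    # origin server's Date header
    age_header_value: float     # Age header in seconds (0 if absent)
    freshness_lifetime: float   # seconds the copy may be served

    def age_on_arrival(self) -> float:
        """Conservative age of the response when it reached this cache."""
        apparent_age = max(0.0, self.time_got_response - self.date_header_value)
        corrected_apparent_age = max(apparent_age, self.age_header_value)
        response_delay_estimate = self.time_got_response - self.time_issued_request
        return corrected_apparent_age + response_delay_estimate

    def current_age(self, now: float) -> float:
        """Arrival age plus the time the copy has been resident locally."""
        resident_time = now - self.time_got_response
        return self.age_on_arrival() + resident_time

    def is_fresh_enough(self, now: float) -> bool:
        return self.current_age(now) < self.freshness_lifetime


entry = CachedResponse(time_issued_request=1000, time_got_response=1002,
                       date_header_value=990, age_header_value=30,
                       freshness_lifetime=60)
```

For this entry the arrival age is max(12, 30) + 2 = 32 seconds. At `now=1020` the copy has been resident for 18 seconds, so its age is 50 and it is still fresh. At `now=1040` its age is 70 and it needs revalidation.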

✏️ Reflections

My Brain, Derelict in Its Duties

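I had no idea that the way a cache computes freshness would be this complicated. I thought it would just take a bit of fiddling with the time the response arrived and the Date header, but I had overlooked the fact that nothing in this world is that simple....lol When I read the line saying that a cache cannot know where a response is coming from, it felt like a hard smack to the head. Ah, I thought, clock synchronization is going to cause trouble again. And on top of that, network delay has to be compensated for too, which struck me as no small feat. Error correction is always the hardest part, they say.. it gave me Kalman Filter PTSD.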

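Just making sense of the English text was hard enough ~~(today's range was way too big a helping...)~~, and understanding the content was another story entirely. I have been writing these posts with my brain switched off, but today I had to actually put it to work. Maybe I got a tiny bit smarter...hehe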
