lesskorrect.log

Series

Mechanistic-Interpretability

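1. [Mechanistic Interpretability Basics] Understanding A Mathematical Framework for Transformer Circuits / Paper Review

Introduces the mathematical foundations for the mechanistic interpretation of Transformers.

August 18, 2024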

2. [Mechanistic Interpretability Basics] Understanding In-context Learning and Induction Heads / Paper Review

An induction head is a circuit inside a Transformer that performs a “copy + paste” role.

August 18, 2024

3. [Mechanistic Interpretability Basics] Understanding Causal Scrubbing / Paper Review

This post covers “causal scrubbing”, a systematic methodology designed to test the validity of a given mechanistic interpretation.

August 18, 2024

4. Introduction to Mechanistic Interpretability (A General Overview of Mechanistic Interpretability)

Hi there! We're a small group from South Korea diving into the fascinating field of mechanistic interpretability (MI).

September 14, 2024

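5. [Mechanistic Interpretability Basics] Polysemanticity and Sparse Autoencoders Explained

Mechanistic interpretability is a line of research that aims to recast deep learning models in terms humans can understand; it is the field that analyzes and explains in detail how a model works.

September 22, 2024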