Series

Papers

1. Attention Is All You Need

Transformer

April 1, 2025

2. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

RAG

May 19, 2025

3. On the Biology of a Large Language Model

1. Introduction In many respects, an LLM looks like a black-box model. > Our goal is to reverse engineer how these models work on the inside, so we may better understand them and assess their fitness for purpose. The paper...

October 18, 2025

4. In-N-Out: A Parameter-Level API Graph Dataset for Tool Agent

API-Graph Benchmark Dataset

January 6, 2026

5. PPO

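The following is a QA set compiled from questions I asked Gemini while reading the PPO paper (for review). 1. Basic Concepts of PPO Q. What exactly do the "Policy Gradient Methods" mentioned in the paper update? As the name suggests, they are methods that directly update the policy model (PM, the policy neural network). Value...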

February 9, 2026

6. Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

The following is a QA set, organized for review, based on questions I asked Claude while reading the paper.

February 10, 2026