BERT - Masked Language Modeling (MLM)

Ann Jongmin · March 14, 2025


Masked Language Modeling (MLM)

We use the `pipeline` API provided by the Transformers library.


GitHub

The full code is available on GitHub:

git clone https://github.com/MachuEngine/BERT-TextAnalysis.git

Fill-Mask code

from transformers import pipeline

def fill_mask():
    """
    Demo of a language model (BERT) predicting a masked token.
    - Uses the bert-base-multilingual-cased model
    """
    mask_filler = pipeline("fill-mask", model="bert-base-multilingual-cased")
    masked_text = "I drank [MASK] today."
    predictions = mask_filler(masked_text)
    print(f"Input: {masked_text}")
    print("Predictions:")
    for pred in predictions:
        print(
            f"- {pred['sequence']} "
            f"(score={pred['score']:.4f}, token={pred['token_str']})"
        )

Top-5 [MASK] candidates

Input: I drank [MASK] today.
Predictions:
- I drank it today. (score=0.1909, token=it)
- I drank you today. (score=0.0367, token=you)
- I drank things today. (score=0.0223, token=things)
- I drank water today. (score=0.0188, token=water)
- I drank in today. (score=0.0178, token=in)
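To see where these scores come from, the `fill-mask` pipeline can also be reproduced by hand: run the text through the model, take the logits at the `[MASK]` position, and apply a softmax over the vocabulary. The sketch below (an illustration, assuming `torch` and `transformers` are installed) does exactly that with the same model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Same checkpoint as the pipeline example above
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

text = "I drank [MASK] today."
inputs = tok(text, return_tensors="pt")

# Find the position of the [MASK] token in the input ids
mask_idx = (inputs["input_ids"][0] == tok.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the vocabulary at the masked position -> probabilities
probs = logits[0, mask_idx].softmax(dim=-1)
top = probs.topk(5)

for score, token_id in zip(top.values[0], top.indices[0]):
    print(f"{tok.decode([token_id])} (score={score:.4f})")
```

The `score` printed by the pipeline is this softmax probability; the pipeline simply sorts the vocabulary by it and returns the top candidates (the count is controlled by its `top_k` argument).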