Natural Language Processing with Probabilistic Models - Week 2

HO SEUNG YOON · July 8, 2024

Hidden Markov Models

  • transition matrix is $(N+1) \times N$; N = number of hidden states (see the shape sketch below)

  • Because a word's tag can differ depending on context, the probability is not zero.
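
A minimal sketch of the matrix shapes involved, assuming N hidden states and a vocabulary of V words (all names here are illustrative):

```python
import numpy as np

N = 3   # number of hidden states (POS tags)
V = 10  # vocabulary size (illustrative)

# Transition matrix A: (N+1) x N. The extra row stands for the
# initial state (start token), so each tag also has a probability
# of beginning a sentence.
A = np.zeros((N + 1, N))

# Emission matrix B: N x V, one row of word probabilities per tag.
B = np.zeros((N, V))
```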

Calculating Probabilities


  • Markov model: to compute the transition probabilities, you have to count all occurrences of tag pairs in the training corpus (sketched below)

  • prepend a start token to each sentence so the first tag has a transition context

  • transform all words to lowercase
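
A minimal sketch of counting tag pairs to estimate transition probabilities, assuming the corpus is preprocessed as above (start token prepended, words lowercased); the function and token names are illustrative:

```python
from collections import defaultdict

def transition_probs(tagged_sentences):
    """Estimate P(t_i | t_{i-1}) from (word, tag) sentences."""
    pair_counts = defaultdict(int)  # C(t_{i-1}, t_i)
    tag_counts = defaultdict(int)   # C(t_{i-1})
    for sentence in tagged_sentences:
        prev = "--s--"  # start token, so the first tag has a context
        for word, tag in sentence:
            pair_counts[(prev, tag)] += 1
            tag_counts[prev] += 1
            prev = tag
    return {pair: count / tag_counts[pair[0]]
            for pair, count in pair_counts.items()}

# Example: one toy sentence, already lowercased.
sents = [[("the", "DT"), ("cat", "NN"), ("sleeps", "VB")]]
print(transition_probs(sents))  # {('--s--', 'DT'): 1.0, ('DT', 'NN'): 1.0, ...}
```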

Populating the Transition Matrix

Broken, needs fixing.
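
In the meantime, here is a sketch of how the transition matrix can be populated from the tag-pair counts, assuming add-alpha smoothing so that unseen tag pairs keep a nonzero probability (the alpha value and helper names are assumptions, not from the original notes):

```python
import numpy as np

def populate_transition_matrix(pair_counts, tag_counts, tags, alpha=0.001):
    """A[i, j] = (C(t_i, t_j) + alpha) / (C(t_i) + alpha * N).

    Add-alpha smoothing keeps every transition probability nonzero,
    even for tag pairs never seen in the training corpus.
    """
    N = len(tags)
    A = np.zeros((N, N))
    for i, prev in enumerate(tags):
        for j, tag in enumerate(tags):
            A[i, j] = ((pair_counts.get((prev, tag), 0) + alpha)
                       / (tag_counts.get(prev, 0) + alpha * N))
    return A
```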

The Viterbi Algorithm

  • auxiliary matrices C and D
    • n rows; one per part-of-speech tag (hidden state) in the model
    • k columns; one per word in the given sequence
  • Viterbi path

Viterbi: Initialization

  • In the initialization step, the first column of the C and D matrices is populated (sketched after this list)

  • The first-column entries of C are products of the transition probabilities from the initial state and the respective emission probabilities

  • The first column of D is set to 0, since there is no preceding POS tag
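
A minimal sketch of this initialization, assuming A is the (N+1) x N transition matrix with the start-token row at index 0, B is the emission matrix, and vocab maps each word to its column in B (all names are assumptions for illustration):

```python
import numpy as np

def viterbi_init(A, B, corpus, vocab):
    """Populate the first column of C (probabilities) and D (back-pointers)."""
    n = B.shape[0]               # number of POS tags / hidden states
    k = len(corpus)              # number of words in the sequence
    C = np.zeros((n, k))
    D = np.zeros((n, k), dtype=int)
    for i in range(n):
        # transition from the start state times emission of the first word
        C[i, 0] = A[0, i] * B[i, vocab[corpus[0]]]
        D[i, 0] = 0              # no preceding POS tag
    return C, D
```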

Viterbi: Forward Pass

  • calculate $c_{1,2}$ (see the sketch after this list)
    • $b_{1,\mathrm{cindex}(w_2)}$ : emission probability from tag $t_1$ to word $w_2$
    • $a_{k,1}$ : transition probability from the POS tag $t_k$ to the current tag $t_1$
    • $c_{k,1}$ : probability of the best path ending in tag $t_k$ at the previous word, so $c_{1,2} = \max_{k} c_{k,1} \, a_{k,1} \, b_{1,\mathrm{cindex}(w_2)}$
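
A sketch of the forward pass under the same assumptions as the initialization above: each remaining cell takes the maximum over all previous tags of path probability times transition probability times emission probability.

```python
def viterbi_forward(A, B, corpus, vocab, C, D):
    n, k = C.shape
    for j in range(1, k):                        # each remaining word w_j
        for i in range(n):                       # each current tag t_i
            # probability of reaching t_i from every previous tag t_k;
            # A rows are offset by 1 because row 0 is the start token
            probs = C[:, j - 1] * A[1:, i] * B[i, vocab[corpus[j]]]
            C[i, j] = probs.max()                # best path probability
            D[i, j] = probs.argmax()             # index of best previous tag
    return C, D
```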

Viterbi: Backward Pass
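
A minimal sketch of the trace-back step, assuming the C and D matrices produced above: pick the most probable tag in the last column of C, then follow the back-pointers in D from right to left.

```python
def viterbi_backward(C, D, tags):
    k = C.shape[1]
    path = [0] * k
    path[-1] = int(C[:, -1].argmax())   # best tag for the last word
    for j in range(k - 1, 0, -1):       # walk the back-pointers leftwards
        path[j - 1] = int(D[path[j], j])
    return [tags[i] for i in path]
```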
