[Paper] NMT Training Method for Korean-English Idiom Machine Translation

Judy · October 11, 2022

Paper

Strategy for effective training of Korean-English idioms in NMT

1. Abstract

  • A training method that uses a special token (tag) and the KISS dataset
  • Follow-up to the study 'A Dataset and Evaluating Method for Korean-English Idiom Machine Translation'

2. Related Research

A Dataset and Evaluating Method for Korean-English Idiom Machine Translation

3. Dataset for experiment

  • Four datasets, distinguished by how the special token (tag) is applied
      1. Non-idiom sentence dataset
      2. Idiom sentence dataset
      3. Idiom sentence dataset
      • Attach ‘’ tag in front of the idiom
      4. Idiom sentence dataset
      • Attach ‘’ tag in front of the idiom and ‘’ tag after the idiom
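The tagging step for dataset types 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the code used in the experiment: the function name `tag_idiom` is hypothetical, the tag strings `<idiom>` and `</idiom>` are placeholders chosen for illustration, and the idiom is assumed to appear in the sentence as an exact substring (conjugated Korean idioms would need a morphological match instead).

```python
def tag_idiom(sentence, idiom, open_tag="<idiom>", close_tag=None):
    """Insert open_tag before the first occurrence of idiom.

    If close_tag is given, it is also inserted after the idiom.
    The tag strings are illustrative placeholders (assumption).
    """
    start = sentence.find(idiom)
    if start == -1:
        # Idiom not found: return the sentence unchanged.
        return sentence
    end = start + len(idiom)
    tagged = sentence[:start] + open_tag + " " + idiom
    if close_tag is not None:
        tagged += " " + close_tag
    return tagged + sentence[end:]


# Type 3: tag only in front of the idiom.
front_only = tag_idiom("그는 발이 넓다", "발이 넓다")

# Type 4: tags in front of and after the idiom.
front_and_back = tag_idiom("그는 발이 넓다", "발이 넓다", close_tag="</idiom>")
```

In practice the tag would also need to be kept as a single token by the tokenizer or vocabulary, otherwise the model sees it as ordinary characters.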

4. Experiment (Training)

Models were trained with OpenNMT-py in the following configurations:

  • 2 Layer LSTM
  • 4 Layer LSTM
  • Transformer

Best result: dataset type 3 (‘’ tag in front of the idiom) with the 4-layer Transformer

5. Conclusion

  • In general, idiom sentences get a lower BLEU score than non-idiom sentences
  • To train an NMT model on Korean idioms efficiently, attach the special token only in front of the idiom.
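The idiom versus non-idiom comparison can be made by scoring the two groups of test sentences separately. The sketch below is one way to do that with a plain corpus-level BLEU (single reference, whitespace tokenization, no smoothing). It is an assumption about the setup, not the evaluation script from the project, and the names `corpus_bleu` and `bleu_by_subset` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (single reference, no smoothing)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Counter intersection gives clipped n-gram matches.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


def bleu_by_subset(rows):
    """rows: iterable of (subset_name, hypothesis, reference) triples."""
    groups = {}
    for subset, hyp, ref in rows:
        hyps, refs = groups.setdefault(subset, ([], []))
        hyps.append(hyp)
        refs.append(ref)
    return {name: corpus_bleu(h, r) for name, (h, r) in groups.items()}
```

Calling `bleu_by_subset` with rows labeled, for example, `"idiom"` and `"non-idiom"` returns one score per group, which makes the gap between the two directly visible. For reported numbers, an established implementation with standard tokenization is the safer choice.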

Reference Project 🌳🦜

https://github.com/itisused/2021_NLP_Project
