While studying NLG (Natural Language Generation), I put together a short summary of Extractive Summarization, one of NLG's subtasks.
Extractive Summarization can produce a more limited range of outputs than Abstractive Summarization, but it has the advantage of being relatively easy to implement.
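To illustrate why the extractive approach is easy to implement, here is a minimal sketch of my own (not a method from any particular paper) that scores each sentence by its average TF-IDF weight and copies the top-k sentences out as the summary; it assumes scikit-learn and NumPy are installed.

import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def extractive_summary(article, k=2):
    # naive sentence split on sentence-ending punctuation followed by whitespace
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', article) if s.strip()]
    if len(sentences) <= k:
        return ' '.join(sentences)
    # score each sentence by its mean TF-IDF weight over the article's own vocabulary
    tfidf = TfidfVectorizer(stop_words='english').fit_transform(sentences)
    scores = np.asarray(tfidf.mean(axis=1)).ravel()
    # keep the k highest-scoring sentences, in their original order
    top = sorted(np.argsort(scores)[-k:])
    return ' '.join(sentences[i] for i in top)

By construction the output consists of sentences copied verbatim from the input, which is exactly the "limited output, simple implementation" trade-off described above.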
For example, a single data point (a source article paired with its reference highlights) looks like this:

ex = {
    'id': '0054d6d30dbcad772e20b22771153a2a9cbeaf62',
    'article': "(CNN) -- An American woman died aboard a cruise ship that docked at Rio de Janeiro on Tuesday, the same ship on which 86 passengers previously fell ill, according to the state-run Brazilian news agency, Agencia Brasil. The American tourist died aboard the MS Veendam, owned by cruise operator Holland America. Federal Police told Agencia Brasil that forensic doctors were investigating her death. The ship's doctors told police that the woman was elderly and suffered from diabetes and hypertension, according the agency. The other passengers came down with diarrhea prior to her death during an earlier part of the trip, the ship's doctors said. The Veendam left New York 36 days ago for a South America tour.",
    'highlights': "The elderly woman suffered from diabetes and hypertension, ship's doctors say .\nPreviously, 86 passengers had fallen ill on the ship, Agencia Brasil says ."
}
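The id / article / highlights fields match the CNN/DailyMail summarization dataset as distributed through the Hugging Face datasets library; assuming that library (this is only my guess at where the example comes from), an equivalent record can be loaded like this:

from datasets import load_dataset

# "cnn_dailymail" / "3.0.0" is an assumption about the source of the sample
# above; each record is a dict with 'id', 'article', and 'highlights' keys.
dataset = load_dataset("cnn_dailymail", "3.0.0", split="validation")
ex = dataset[0]
print(ex["highlights"])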
Semantic Text Matching is similar to Text Similarity analysis in that it rests on the assumption that the better a summary is, the higher its similarity to the source document will be in the semantic space. The semantic representations come from a pretrained model based on Seq2Seq with a Denoising Autoencoder method applied.
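As a sketch of this matching idea: the text above does not name the pretrained model, so I assume a BART checkpoint (facebook/bart-base) here only because BART is a Seq2Seq model pretrained with a denoising-autoencoder objective; mean-pooling the encoder states into a single vector is also my own choice. It assumes PyTorch and the Hugging Face transformers library.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# facebook/bart-base is an assumed stand-in for "a pretrained Seq2Seq model
# with a denoising autoencoder method applied"; no specific model is named above.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModel.from_pretrained("facebook/bart-base").eval()

def embed(text):
    # mean-pool the encoder hidden states, ignoring padding positions
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        encoder_out = model.get_encoder()(**inputs)
    hidden = encoder_out.last_hidden_state          # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)   # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Under the assumption above, a better summary should score higher
# against the source article than a worse one.
score = F.cosine_similarity(embed(ex["article"]), embed(ex["highlights"]))
print(score.item())

In a matching-based setup, several candidate summaries would be scored this way against the source document and the highest-scoring one selected.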