Knowledge Distillation

1. Distilling the Knowledge in a Neural Network [Hinton et al., 2015] (a minimal sketch of the distillation loss follows the list)

2. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results [Tarvainen & Valpola, 2017] (an EMA-update sketch also follows the list)

3. Noise as a Resource for Learning in Knowledge Distillation [2019]

4. Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation [Xu et al., 2020]

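Paper 1's core recipe is to train the student on the teacher's temperature-softened output distribution alongside the usual hard-label loss. Below is a minimal PyTorch sketch of that loss; the temperature `T` and mixing weight `alpha` are illustrative defaults, not values tuned in the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD loss: temperature-softened KL term plus hard-label CE."""
    # Soften both distributions with temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # Scale the soft term by T^2 so its gradient magnitude stays
    # comparable to the hard-label term as T changes (noted in the paper).
    soft_loss = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```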
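Paper 2 needs no separately pretrained teacher: the teacher's weights are an exponential moving average (EMA) of the student's weights, and a consistency loss pulls the student's predictions toward the teacher's on perturbed inputs. A minimal sketch of the weight-averaging step, assuming two architecturally identical `nn.Module`s; the decay value is illustrative.

```python
import torch

@torch.no_grad()
def mean_teacher_update(teacher, student, decay=0.999):
    """After each optimizer step, move each teacher parameter toward
    an exponential moving average of the matching student parameter."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)
```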