Label-synchronous = soft alignment = implicit alignment
e.g. AED (attention-based encoder-decoder model)
Frame-synchronous = hard alignment = explicit alignment
e.g. CTC (Connectionist Temporal Classification)
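
The difference can be sketched with a toy decoder for each style. This is only an illustrative sketch, not the procedure from the reference paper: the symbols BLANK and EOS, the function names, and the toy_step function standing in for a trained attention decoder are all assumptions made for the example. The frame-synchronous function consumes one label per input frame and then merges repeats and drops blanks (CTC-style greedy collapse); the label-synchronous function runs one step per output label until an end-of-sequence symbol appears.

```python
BLANK = "_"    # assumed blank symbol for the CTC-style path
EOS = "</s>"   # assumed end-of-sequence symbol for the AED-style path


def frame_sync_decode(frame_labels):
    """Frame-synchronous: one label per input frame, then collapse.

    Consecutive repeats are merged and blanks are dropped, so the
    frame-to-label alignment is explicit in the input sequence.
    """
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != BLANK:
            out.append(lab)
        prev = lab
    return out


def label_sync_decode(step_fn, max_len=50):
    """Label-synchronous: one decoder step per output label until EOS.

    step_fn is a hypothetical stand-in for an attention decoder; it
    receives the labels emitted so far and returns the next label.
    """
    out = []
    for _ in range(max_len):
        lab = step_fn(out)
        if lab == EOS:
            break
        out.append(lab)
    return out


# Toy data: 10 input frames that spell "cat" once collapsed.
frames = ["_", "c", "c", "_", "a", "a", "a", "_", "t", "_"]

# Toy decoder that emits a fixed target, then EOS.
target = ["c", "a", "t"]


def toy_step(prefix):
    return target[len(prefix)] if len(prefix) < len(target) else EOS


print(frame_sync_decode(frames))
print(label_sync_decode(toy_step))
```

In this toy setup the frame-synchronous path takes 10 steps (one per frame), while the label-synchronous path takes 4 steps (three labels plus EOS), regardless of how many frames the input has.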
Reference paper
[Dong et al., 2020] A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition
https://arxiv.org/pdf/2005.10113.pdf