BAI, Jinze, et al. Qwen technical report. arXiv preprint arXiv:2309.16609, 2023.
WANG, Peng, et al. Qwen2-vl: Enhancing vision-language model's perception of the world at any resolution. arXiv preprint arXiv:2409.12191, 2024.
CHU, Yunfei, et al. Qwen-audio: Advancing universal audio understanding via unified large-scale audio-language models. arXiv preprint arXiv:2311.07919, 2023.
WU, Chenfei, et al. Qwen-image technical report. arXiv preprint arXiv:2508.02324, 2025.