
On the paper Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation...

On the paper Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA (Relaxed RT). Abstract, Contribution: starting from an existing weight-unshared model...
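The entry above mentions parameter sharing relaxed by layer-wise LoRA. Below is a minimal sketch of that idea, assuming a single shared block looped over several "virtual" depths, each with its own low-rank correction; the dimensions, rank, and use of a plain linear block are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class SharedBlockWithLayerwiseLoRA(nn.Module):
    """One shared block looped over several virtual depths; each depth adds its
    own low-rank (LoRA-style) correction, relaxing strict weight tying."""
    def __init__(self, dim: int = 256, n_virtual_layers: int = 4, rank: int = 8):
        super().__init__()
        self.shared = nn.Linear(dim, dim)  # weights reused at every depth
        self.lora_A = nn.ParameterList(
            [nn.Parameter(torch.randn(dim, rank) * 0.01) for _ in range(n_virtual_layers)]
        )
        self.lora_B = nn.ParameterList(
            [nn.Parameter(torch.zeros(rank, dim)) for _ in range(n_virtual_layers)]
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        for A, B in zip(self.lora_A, self.lora_B):
            # shared weight plus a depth-specific low-rank delta (zero at init)
            h = torch.relu(self.shared(h) + h @ A @ B)
        return h

block = SharedBlockWithLayerwiseLoRA()
print(block(torch.randn(2, 256)).shape)  # torch.Size([2, 256])
```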

On the paper Early-Exit Deep Neural Network - A Comprehensive Survey...

On TRM: TRM is a much simpler recursive reasoning approach than HRM, using a single tiny network with only two layers, yet it achieves much better generalization than HRM. HRM: 27M parameters; TRM: 7M parameters. 1) recursive hierarchical reasoning...
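To make the "single tiny network applied recursively" idea concrete, here is a minimal sketch under stated assumptions: the hidden size, the number of recursion steps, the use of a 2-layer MLP instead of attention blocks, and the exact update rule for the latent state and answer are all illustrative, not TRM's actual formulation.

```python
import torch
import torch.nn as nn

class TinyRecursiveReasoner(nn.Module):
    """A single tiny (2-layer) network applied recursively: it repeatedly refines
    a latent reasoning state z and a current answer y from the input x."""
    def __init__(self, dim: int = 128):
        super().__init__()
        # the one small network reused at every recursion step
        self.net = nn.Sequential(
            nn.Linear(3 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )
        self.to_answer = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, n_steps: int = 6) -> torch.Tensor:
        y = torch.zeros_like(x)  # current answer estimate
        z = torch.zeros_like(x)  # latent reasoning state
        for _ in range(n_steps):
            # refine the latent state from (input, current answer, latent state)
            z = self.net(torch.cat([x, y, z], dim=-1))
            # update the answer from the refined latent state
            y = self.to_answer(z)
        return y

model = TinyRecursiveReasoner()
print(model(torch.randn(4, 128)).shape)  # torch.Size([4, 128])
```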

On the paper RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers...

On the paper APQ-ViT: Towards Accurate Post-Training Quantization for Vision Transformer...

On the paper FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer...

On the paper PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization...

On the paper Post-Training Quantization for Vision Transformer...

On PACT (Parameterized Clipping Activation)...

On Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance...

On the paper Explaining Nonlinear Classification Decisions with Deep Taylor Decomposition...

On the paper Mix-QViT: Mixed-Precision Vision Transformer Quantization Driven by Layer Importance and Quantization Sensitivity...

On the paper SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting...

On the paper BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models...

On the paper A Survey of Quantization Methods for Efficient Neural Network Inference...

On the paper SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models...