
About the paper Each Complexity Deserves a Pruning Policy...

PACT code. About the paper PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models...

About the paper An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Acceleration for VLLM Inference...

About the paper Activation Quantization of Vision Encoders Needs Prefixing Registers...

About the paper Towards Understanding Best Practices for Quantization of Vision-Language Models...

About the paper Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation...

Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA. Notes on Relaxed RT. Abstract. Contribution: with an existing weight-unshared model...

About the paper Early-Exit Deep Neural Network - A Comprehensive Survey...

Notes on TRM. TRM is a much simpler recursive reasoning approach than HRM: it uses a single tiny network with only 2 layers, yet achieves substantially better generalization than HRM. HRM: 27M; TRM: 7M. 1) recursive hierarchical rea

About the paper RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers...

About the paper APQ-ViT: Towards Accurate Post-Training Quantization for Vision Transformer...

About the paper FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer...

About the paper PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization...

About the paper Post-Training Quantization for Vision Transformer...

About PACT...

About Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance...

About the paper Explaining NonLinear Classification Decisions with Deep Taylor Decomposition...