Latest Paper Tracking

ODD · August 29, 2024

Papers


https://huggingface.co/papers
https://arxiv.org/list/cs.CV/recent
https://arxiv.org/list/cs.AI/recent

Key topics: Quantization, Pruning, Object detection, Transformer, Mamba

2024.08.30-2024.09.04

SA-MLP: Enhancing Point Cloud Classification with Efficient Addition and Shift Operations in MLP Architectures

  • follow-up study of ShiftAddNet
  • ShiftAddNet doubles the number of layers and has limited representational capacity because its shift weights are frozen
  • SA-MLP keeps the original number of layers and does not freeze the shift weights (see the sketch after this list)
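
To make the operation swap concrete, here is a minimal PyTorch sketch of a shift layer and an adder layer; the class names `ShiftLinear`/`AddLinear` and the straight-through rounding are my own assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class ShiftLinear(nn.Module):
    """Linear layer whose weights are rounded to signed powers of two, so each
    multiply can be realized as a bit shift on integer hardware (sketch only)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features).uniform_(-0.5, 0.5))

    def forward(self, x):
        w = self.weight
        p = torch.round(torch.log2(w.abs().clamp(min=1e-8)))   # shift amounts
        w_q = torch.sign(w) * torch.pow(2.0, p)                 # signed power-of-two weights
        w_ste = w + (w_q - w).detach()                          # straight-through estimator keeps w trainable
        return x @ w_ste.t()

class AddLinear(nn.Module):
    """AdderNet-style layer: the output is the negative L1 distance between the
    input and each weight template, i.e. only additions/subtractions are used."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, x):
        # output[b, o] = -sum_i |x[b, i] - weight[o, i]|
        return -(x.unsqueeze(1) - self.weight.unsqueeze(0)).abs().sum(dim=-1)
```

The straight-through rounding is only there so the power-of-two weights remain trainable without freezing, which is the point the paper makes against ShiftAddNet.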

One-Index Vector Quantization Based Adversarial Attack on Image Classification

  • a one-index attack in the VQ domain that generates adversarial images with a differential evolution algorithm
  • modifies a single index in the compressed data stream so that the decompressed image is misclassified (toy sketch after this list)
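
A toy sketch of the single-index idea, assuming hypothetical `decode`/`classify` callables and using random search in place of the paper's differential evolution:

```python
import numpy as np

def one_index_attack(indices, codebook, decode, classify, true_label, iters=200, rng=None):
    """Toy sketch of a one-index attack on a VQ-compressed image.

    indices  : 1-D array of codebook indices for the compressed image
    codebook : (K, d) array of codewords
    decode   : maps an index array back to an image (hypothetical callable)
    classify : returns the predicted label for an image (hypothetical callable)
    A real attack would search (position, new_index) with differential evolution;
    random search is used here only for brevity.
    """
    rng = np.random.default_rng() if rng is None else rng
    K = codebook.shape[0]
    for _ in range(iters):
        cand = indices.copy()
        pos = rng.integers(len(cand))          # which single index to modify
        cand[pos] = rng.integers(K)            # replacement codeword index
        if classify(decode(cand)) != true_label:
            return cand                        # adversarial index stream found
    return None
```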

VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers

  • VQ decomposes each model weight into a codebook and assignments
  • previous VQ methods calibrate only the codebook without calibrating the assignments => different weight sub-vectors get stuck with the same incorrect assignment => inconsistent gradients during calibration
  • VQ4DiT keeps a candidate assignment set per sub-vector and reconstructs the sub-vector as a weighted average over the candidate codewords
  • using a zero-data, block-wise calibration method, the optimal assignment is efficiently selected from the set (sketch after this list)
    (In prior PTQ work, only the quantization error is reduced while the cluster assignments are kept fixed => the assignment distribution also needs to be optimized!)
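
A rough sketch of the two pieces involved: a plain k-means VQ decomposition of a weight matrix, and a weighted-average reconstruction over candidate assignments. The names and the k-means step are my simplifications, not the VQ4DiT algorithm itself.

```python
import torch

def vq_decompose(weight, subvec_len=4, num_codewords=256, iters=10):
    """Sketch: split a weight matrix into sub-vectors (numel must be divisible
    by subvec_len) and fit a codebook + hard assignments with plain k-means."""
    subvecs = weight.reshape(-1, subvec_len)                   # (N, d)
    idx = torch.randperm(subvecs.shape[0])[:num_codewords]
    codebook = subvecs[idx].clone()                            # (K, d)
    for _ in range(iters):
        dist = torch.cdist(subvecs, codebook)                  # (N, K)
        assign = dist.argmin(dim=1)                            # hard assignments
        for k in range(num_codewords):
            mask = assign == k
            if mask.any():
                codebook[k] = subvecs[mask].mean(dim=0)
    return codebook, assign

def reconstruct_soft(codebook, candidates, weights):
    """VQ4DiT-style idea in sketch form: each sub-vector keeps a small candidate
    assignment set and is reconstructed as a weighted average of the corresponding
    codewords. `candidates` is (N, C) indices, `weights` is (N, C) scores that a
    block-wise calibration would tune before picking the final assignment."""
    w = torch.softmax(weights, dim=1)                           # (N, C) ratios
    return (codebook[candidates] * w.unsqueeze(-1)).sum(dim=1)  # (N, d)
```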

Dreaming Is All You Need

  • SleepNet seamlessly integrates supervised learning with unsupervised "sleep" stages using pre-trained encoder models
  • DreamNet employs full encoder-decoder frameworks to reconstruct the hidden states, mimicking the human "dreaming" process (rough sketch after this list)
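
A heavily simplified sketch of the "dream" reconstruction idea, assuming a hidden-state tensor taken from a frozen pre-trained encoder; the module name and shapes are my own, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DreamBlock(nn.Module):
    """Sketch only: compress the hidden state of a pre-trained encoder and train
    a small encoder-decoder to reconstruct it as an unsupervised objective."""
    def __init__(self, hidden_dim=768, latent_dim=128):
        super().__init__()
        self.compress = nn.Linear(hidden_dim, latent_dim)   # "fall asleep"
        self.expand = nn.Linear(latent_dim, hidden_dim)      # "dream" the state back

    def forward(self, hidden):
        recon = self.expand(torch.relu(self.compress(hidden)))
        recon_loss = nn.functional.mse_loss(recon, hidden)   # reconstruction objective
        return recon, recon_loss
```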

2024.08.22-2024.08.29 (Hugging Face)

Quantization
MobileQuant: Mobile-friendly Quantization for On-device Language Models
(Samsung AI Center, Cambridge)

  • PTQ
  • on-device deployment of LLMs with integer-only quantization
  • jointly optimizes the weight transformation and the activation range parameters in an end-to-end manner (sketch after this list)
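
A sketch of what jointly learning a weight-equivalent transform and an activation clipping range could look like; `QuantLinear`, `s`, and `alpha` are assumed names for illustration, not the MobileQuant implementation.

```python
import torch
import torch.nn as nn

def fake_quant(x, scale, num_bits=8):
    """Uniform integer fake-quantization with a straight-through estimator."""
    qmax = 2 ** (num_bits - 1) - 1
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    x_q = q * scale
    return x + (x_q - x).detach()

class QuantLinear(nn.Module):
    """Sketch: a per-channel transform `s` divides the activations and multiplies
    the weights (preserving the float function), while `alpha` learns the
    activation clipping range; both are optimized end-to-end against the
    quantized output."""
    def __init__(self, weight):
        super().__init__()
        self.weight = nn.Parameter(weight.detach().clone())
        self.s = nn.Parameter(torch.ones(weight.shape[1]))          # per-input-channel transform
        self.alpha = nn.Parameter(weight.detach().abs().max().clone())  # activation range

    def forward(self, x):
        x_t = x / self.s                                             # transformed activations
        x_t = torch.minimum(torch.maximum(x_t, -self.alpha), self.alpha)  # learned clipping
        w_t = self.weight * self.s                                   # inverse transform on weights
        x_q = fake_quant(x_t, self.alpha / 127)
        w_q = fake_quant(w_t, w_t.abs().max() / 127)
        return x_q @ w_q.t()
```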

Pruning
LLM Pruning and Distillation in Practice: The Minitron Approach
(NVIDIA)

  • instead of training each small model from scratch, the small models are obtained by pruning a larger model and then applying knowledge distillation
  • lack of access to the original training data => fine-tune the teacher model on their own dataset first (teacher correction); a generic distillation-loss sketch follows this list
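
For reference, a generic logit-distillation loss in the spirit of the recipe: teacher correction happens before this step by fine-tuning the teacher on the distillation data, and the actual Minitron loss also uses intermediate states, which this sketch omits.

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic KL-divergence distillation loss on logits (sketch, not the exact
    Minitron objective). Applied after the teacher has been 'corrected', i.e.
    fine-tuned on the same dataset used for distillation."""
    t = temperature
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)
```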

Mamba
ReMamba: Equip Mamba with Effective Long-Sequence Modeling

  • selective compression and adaptation techniques within a two-stage re-forward process (rough sketch after this list)
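
My rough reading of the two-stage idea as a PyTorch sketch; the `model(tokens)` API returning per-position hidden states is hypothetical, and this is not the ReMamba implementation.

```python
import torch

def two_stage_reforward(model, tokens, keep_ratio=0.25):
    """Sketch of a selective-compression re-forward: stage 1 scores prompt
    positions from an ordinary forward pass, stage 2 re-runs the model on the
    compressed (selected) prompt only."""
    # Stage 1: forward pass to obtain per-position hidden states (hypothetical API).
    hidden = model(tokens)                      # (seq_len, d)
    query = hidden[-1]                          # use the final state as a query
    scores = hidden @ query                     # importance score per position
    k = max(1, int(keep_ratio * len(tokens)))
    keep = torch.topk(scores, k).indices.sort().values   # keep original order
    # Stage 2: re-forward on the selected positions only.
    return model(tokens[keep])
```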
