[ML] Hyperparameter Tuning: Learning rate and Batch size

박제연 · October 13, 2022

GDSC-ML series · 2/5

Batch Size and Learning Rate

Don't Decay the Learning Rate, Increase the Batch Size (Smith et al., ICLR 2018): https://openreview.net/pdf?id=B1Yy1BxCZ

1. Batch Size

  • small: training converges quickly, at the cost of noisy gradient estimates and a noisier training process
  • large: training converges more slowly, but each update is based on an accurate estimate of the error gradient

2. Learning Rate

The most popular form of learning rate annealing is step decay, where the learning rate is reduced by some percentage after a set number of training epochs.
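
A minimal sketch of step decay, assuming PyTorch (the model, optimizer, and the step_size/gamma values are illustrative choices, not taken from the sources linked below):

```python
import torch

# Illustrative model and optimizer; step_size and gamma are example values.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Step decay: multiply the learning rate by gamma every step_size epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):
    # ... run one epoch of training here ...
    scheduler.step()  # lr: 0.1 -> 0.01 (epoch 30) -> 0.001 (epoch 60)
```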

https://www.jeremyjordan.me/nn-learning-rate/

https://www.baeldung.com/cs/learning-rate-batch-size

https://inhovation97.tistory.com/32

Bag of Tricks for Image Classification with Convolutional Neural Networks

https://arxiv.org/abs/1812.01187

  • Increase batch size
  • Linear scaling learning rate (see the sketch after this list)
  • Model Architecture Tweaks

    “A model tweak is a minor adjustment to the network architecture, such as changing the stride of a particular convolution layer. Such a tweak often barely changes the computational complexity but might have a non-negligible effect on the model accuracy.”

  • Training Refinements (each is sketched after this list)
    • Cosine learning rate decay
    • Label smoothing
    • Mixup training
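
A minimal sketch of the linear scaling rule: the paper starts from an initial learning rate of 0.1 at batch size 256 and scales it proportionally as the batch size grows (the helper name here is my own):

```python
def linear_scaled_lr(batch_size, base_lr=0.1, base_batch_size=256):
    """Linear scaling rule: scale the initial learning rate in proportion
    to the batch size, starting from the paper's baseline of 0.1 at 256."""
    return base_lr * batch_size / base_batch_size

print(linear_scaled_lr(256))   # 0.1
print(linear_scaled_lr(1024))  # 0.4
```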
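Cosine learning rate decay follows the paper's schedule, decaying from the initial rate to zero over training; a sketch (initial_lr = 0.1 is just the paper's ResNet baseline, and PyTorch ships an equivalent as torch.optim.lr_scheduler.CosineAnnealingLR):

```python
import math

def cosine_lr(epoch, total_epochs, initial_lr=0.1):
    # eta_t = 1/2 * (1 + cos(t * pi / T)) * eta, decaying smoothly from
    # initial_lr at epoch 0 down to 0 at the final epoch T.
    return 0.5 * (1 + math.cos(epoch * math.pi / total_epochs)) * initial_lr
```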
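Label smoothing, as described in the paper, replaces the one-hot target with 1 - epsilon for the true class and epsilon / (K - 1) for the other classes; a sketch assuming PyTorch (the helper name is mine, epsilon = 0.1 as in the paper):

```python
import torch

def smooth_labels(targets, num_classes, epsilon=0.1):
    """True class gets 1 - epsilon; the other K - 1 classes share epsilon."""
    q = torch.full((targets.size(0), num_classes), epsilon / (num_classes - 1))
    q.scatter_(1, targets.unsqueeze(1), 1.0 - epsilon)
    return q

print(smooth_labels(torch.tensor([0, 2]), num_classes=3))
# tensor([[0.9000, 0.0500, 0.0500],
#         [0.0500, 0.0500, 0.9000]])
```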
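Mixup training blends random pairs of examples and their labels with a weight lambda drawn from Beta(alpha, alpha); a sketch assuming PyTorch and one-hot labels (alpha = 0.2 is the value the paper uses):

```python
import torch

def mixup(x, y_onehot, alpha=0.2):
    """Blend each example (and its label) with a randomly chosen partner."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    index = torch.randperm(x.size(0))  # random pairing within the batch
    mixed_x = lam * x + (1 - lam) * x[index]
    mixed_y = lam * y_onehot + (1 - lam) * y_onehot[index]
    return mixed_x, mixed_y
```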

https://norman3.github.io/papers/docs/bag_of_tricks_for_image_classification.html

https://medium.com/analytics-vidhya/bag-of-tricks-for-image-classification-with-convolutional-neural-networks-99f00a9b9565

https://phil-baek.tistory.com/entry/CNN-꿀팁-Bag-of-Tricks-for-Image-Classification-with-Convolutional-Neural-Networks-논문-리뷰
