읏차 웃자

[Paper Review] Attention is all you need

Before Transformers: RNNs and LSTMs are slow to train. Can we parallelize sequential data? Transformers: the input sequence can be passed in **parallel**. There is no concept of a time step: all the words are passed simultaneously and their word embeddings are determined simultaneously (an RNN passes input words one after ano

November 17, 2022 · 0 comments

[Paper Review] CoCa: Contrastive Captioners are Image-Text Foundation Models

"CoCa: Contrastive Captioners are Image-Text Foundation Models". History of vision and language training. Vision pretraining: pretrain ConvNets or Transformers on large-scale data such as ImageNet or Instagram to solve visual recognition problems. These models only learn modes for the vision modality, so they are not applicable to joint reasoning tasks over both image and text inputs. Vision-Lan

November 3, 2022 · 0 comments

[GDSC-ML] Practice: Improve accuracy of ImageNet Classification project

- Models: 1. Initial “mymodel.yaml” 2. “convnext_base.yaml”. I followed the same hyperparameters as convnext-base-224finetunedonImageInannotations. Learning rat

October 13, 2022 · 0 comments

[ML] Hyperparameter Tuning: Learning rate and Batch size

Batch size and learning rate (https://openreview.net/pdf?id=B1Yy1BxCZ). 1. Batch size. Small: converges quickly at the cost of noise in the training process. Large: converges slowly with accurate estimates of the error gradient. 2. Learning

October 13, 2022 · 0 comments

[ML] Various ways for Hyperparameter Tuning in Machine Learning

Hyperparameter tuning: the process of finding the right combination of hyperparameters to maximize model performance. Hyperparameter tuning methods: Random Search. Grid Search: each iteration tries a combination of hyperparameters in a specific order. It fits the model on each combination, records the model performance, and returns the best model with the best hyperparameters. Bayesian Optimization. Tree-structured Parzen estimators (TPE). Hype
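
The grid search loop described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the code from the post: `grid_search`, `toy_score`, and the grid values are hypothetical names chosen here, and `toy_score` stands in for "fit the model and measure validation performance".

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid in a fixed order and keep the best one.

    param_grid: dict mapping a hyperparameter name to a list of candidate values.
    evaluate:   callable taking a dict of hyperparameters and returning a score
                (higher is better). Hypothetical stand-in for fit + validate.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    history = []  # record of (params, score) for every combination tried

    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        history.append((params, score))
        if score > best_score:
            best_params, best_score = params, score

    return best_params, best_score, history


def toy_score(params):
    # Assumed toy objective: peaks at lr=0.01 and batch_size=64.
    return -(params["lr"] - 0.01) ** 2 - 1e-6 * (params["batch_size"] - 64) ** 2


param_grid = {"lr": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}
best_params, best_score, history = grid_search(param_grid, toy_score)
print(best_params, len(history))
```

The cost grows with the product of the candidate list sizes (3 x 3 = 9 evaluations here), which is the usual reason to consider random search or Bayesian methods for larger grids.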

October 13, 2022 · 0 comments

[GDSC-ML] Apply PyTorch template to MNIST classification

The second GDSC-ML session was about converting a Jupyter Notebook of an MNIST CNN model into Python scripts. Like most people, I was used to doing ML projects in Jupyter Notebook. It has a big advantage: I can validate and check the code easily by just pressing Shift + Enter. But there are some drawbacks of Jupyter Notebooks in data science projects. Unorganized: hard to keep track of what I wr

October 6, 2022 · 0 comments