None. That said, the referenced article "Generative AI Must Disclose Its Training Data" (https://www.hani.co.kr/arti/economy/economy_general/1128825.html) is worth reading.
VGGNet: https://arxiv.org/abs/1409.1556
ResNet: https://arxiv.org/abs/1512.03385
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization: https://arxiv.org/abs/1610.02391
Grad-CAM++: https://arxiv.org/abs/1710.11063
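The Grad-CAM computation in the two papers above is compact enough to sketch. A minimal numpy version, assuming the feature maps and their gradients have already been extracted from a network (both arrays here are placeholders, not a real model):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from precomputed tensors.
    activations, gradients: (K, H, W) last-conv feature maps and
    the gradients of the target class score w.r.t. them."""
    weights = gradients.mean(axis=(1, 2))                  # alpha_k: global-average-pooled gradients
    cam = np.tensordot(weights, activations, axes=(0, 0))  # weighted sum over the K channels
    cam = np.maximum(cam, 0.0)                             # ReLU: keep positive evidence only
    if cam.max() > 0:
        cam /= cam.max()                                   # normalize to [0, 1] for visualization
    return cam
```

In practice the activations and gradients come from a forward/backward hook on the last convolutional layer; Grad-CAM++ replaces the plain gradient average with higher-order weighting.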
mixup: Beyond Empirical Risk Minimization: https://arxiv.org/abs/1710.09412
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features: https://arxiv.org/abs/1905.04899
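The mixup idea above is simple enough to sketch directly; a minimal numpy version, assuming a pair of inputs with one-hot labels (function and argument names are illustrative):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two examples and their one-hot labels with a Beta-sampled weight."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)      # mixing coefficient lambda in [0, 1]
    x = lam * x1 + (1.0 - lam) * x2   # convex combination of the inputs
    y = lam * y1 + (1.0 - lam) * y2   # same combination of the (soft) labels
    return x, y, lam
```

CutMix uses the same soft-label idea but pastes a rectangular patch of one image into the other instead of blending pixel values, with lambda set to the patch's area ratio.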
Fully Convolutional Networks: https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf
R-CNN : https://arxiv.org/abs/1311.2524
Focal Loss (RetinaNet): https://arxiv.org/abs/1708.02002
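The focal loss from the RetinaNet paper is a one-liner worth remembering: FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). A binary numpy sketch (illustrative, not the paper's reference code):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: the (1 - p_t)^gamma factor down-weights easy examples.
    p: predicted probability of the positive class; y: 0/1 label."""
    p_t = np.where(y == 1, p, 1.0 - p)              # probability assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
```

With gamma = 0 it falls back to (alpha-weighted) cross-entropy; larger gamma suppresses the loss of well-classified examples, which is the point for dense detectors with extreme foreground/background imbalance.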
Panoptic Segmentation: https://arxiv.org/abs/1801.00868
Uni-DVPS: (not publicly available -> can be obtained via school library access.)
Grounded SAM : https://arxiv.org/abs/2401.14159
Real-World Single Image Super-Resolution: A New Benchmark and A New Model (https://arxiv.org/abs/1904.00523)
Real-World Blur Dataset for Learning and Benchmarking Deblurring Algorithms (https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123700188.pdf)
Blind Super-Resolution Kernel Estimation using an Internal-GAN (https://arxiv.org/abs/1909.06581)
(Recent motion-estimation paper) SpatialTracker: Tracking Any 2D Pixels in 3D Space (https://arxiv.org/abs/2404.04319)
SRGAN : https://arxiv.org/abs/1609.04802
Learning Blind Video Temporal Consistency: https://openaccess.thecvf.com/content_ECCV_2018/papers/Wei-Sheng_Lai_Real-Time_Blind_Video_ECCV_2018_paper.pdf
CLIP huggingface implementation: https://github.com/huggingface/transformers/blob/main/src/transformers/models/clip/modeling_clip.py
ImageBIND official implementation: https://github.com/facebookresearch/ImageBind
LanguageBIND: https://arxiv.org/abs/2310.01852
ZeroCap (zero-shot image captioning): https://arxiv.org/abs/2111.14447
StyleGAN: https://arxiv.org/pdf/1812.04948
ImageBind: https://arxiv.org/abs/2305.05665
DALL-E 2: https://arxiv.org/abs/2204.06125
Flamingo pytorch implementation: https://github.com/lucidrains/flamingo-pytorch/blob/main/flamingo_pytorch/flamingo_pytorch.py
Flamingo : https://arxiv.org/abs/2204.14198
LLaVA: https://llava-vl.github.io/
Supplementary papers on Q-Former training (recommended to read in order)
-- BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (https://arxiv.org/abs/2201.12086)
-- BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models (https://arxiv.org/abs/2301.12597)
Visual Programming: https://arxiv.org/abs/2211.11559
LDM (Stable Diffusion): https://arxiv.org/abs/2112.10752
PixelRNN: https://arxiv.org/abs/1601.06759
ControlNet: https://arxiv.org/abs/2302.05543
Marigold(Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation):
https://arxiv.org/abs/2312.02145
[3D MACHINE LEARNING] - 3D DATA REPRESENTATIONS: https://www.antoinetlc.com/blog-summary/3d-data-representations
Mesh R-CNN : https://arxiv.org/abs/1906.02739
3DGS (3D Gaussian Splatting): https://arxiv.org/abs/2308.04079
DreamFusion: https://arxiv.org/abs/2209.14988
Structure-from-Motion Revisited:
https://openaccess.thecvf.com/content_cvpr_2016/papers/Schonberger_Structure-From-Motion_Revisited_CVPR_2016_paper.pdf
shapeNet : https://arxiv.org/abs/1512.03012
Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
https://arxiv.org/abs/1804.01654
Loper et al., SMPL: A Skinned Multi-Person Linear Model: SIGGRAPH 2015.
https://dl.acm.org/doi/10.1145/2816795.2818013
Bogo et al., Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image: ECCV 2016.
https://arxiv.org/abs/1607.08128
Anguelov et al., SCAPE: Shape Completion and Animation of People: SIGGRAPH 2005.
https://dl.acm.org/doi/10.1145/1073204.1073207
Fast R-CNN
https://arxiv.org/abs/1504.08083
Faster R-CNN
https://arxiv.org/abs/1506.01497
Mask R-CNN
https://arxiv.org/abs/1703.06870
Live2Diff
https://arxiv.org/pdf/2407.08701
Meta-learning (MAML)
https://arxiv.org/pdf/1703.03400
Gaussian Mixture
https://arxiv.org/pdf/1711.06929
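The standard way to fit a Gaussian mixture is EM, which is worth having in muscle memory before reading mixture-model papers. A 1-D sketch of one EM update (generic textbook EM, not any specific paper's method; names are illustrative):

```python
import numpy as np

def em_step(x, pi, mu, var):
    """One EM update for a 1-D Gaussian mixture.
    x: (N,) data; pi, mu, var: (K,) mixture weights, means, variances."""
    # E-step: responsibilities r[n, k] proportional to pi_k * N(x_n | mu_k, var_k)
    dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    r = pi * dens
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the responsibilities
    nk = r.sum(axis=0)
    pi_new = nk / len(x)
    mu_new = (r * x[:, None]).sum(axis=0) / nk
    var_new = (r * (x[:, None] - mu_new) ** 2).sum(axis=0) / nk
    return pi_new, mu_new, var_new
```

Iterating this monotonically increases the data log-likelihood; real implementations work in log space and add a variance floor for numerical stability.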
Multilabel Image Classification Using Deep Learning
https://www.mathworks.com/help/deeplearning/ug/multilabel-image-classification-using-deep-learning.html
Fine-Grained Image Analysis with Deep Learning: A Survey
https://arxiv.org/abs/2111.06119
Few-shot Classification
https://arxiv.org/pdf/2303.07502
One-shot Classification
https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf
Skin color-based classification
https://arxiv.org/pdf/1708.02694
A survey on Image Data Augmentation for Deep learning
https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0
AutoAugment: Learning Augmentation Policies from Data
https://arxiv.org/abs/1805.09501
RandAugment: Practical automated data augmentation with a reduced search space
https://arxiv.org/abs/1909.13719
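RandAugment's whole point is collapsing AutoAugment's policy search to two scalars: how many ops to apply (N) and one global magnitude (M). A toy numpy sketch with a placeholder op pool (the real method uses ~14 standard image ops):

```python
import numpy as np

def rand_augment(img, ops, n=2, magnitude=0.5, rng=None):
    """Apply n randomly chosen ops at one global magnitude (RandAugment's two knobs)."""
    rng = rng or np.random.default_rng()
    for i in rng.choice(len(ops), size=n):
        img = ops[i](img, magnitude)
    return img

# toy op pool for images with values in [0, 1]; illustrative only
ops = [
    lambda im, m: np.flip(im, axis=1),          # horizontal flip (magnitude unused)
    lambda im, m: np.clip(im * (1 + m), 0, 1),  # brightness-like scaling
    lambda im, m: np.rot90(im),                 # 90-degree rotation
]
```

N and M are then tuned with a plain grid search instead of the RL-based policy search that AutoAugment needs.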
A ConvNet for the 2020s
https://arxiv.org/abs/2201.03545
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
https://arxiv.org/abs/2010.11929
CoAtNet: Marrying Convolution and Attention for All Data Sizes
https://arxiv.org/abs/2106.04803
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
https://arxiv.org/abs/2103.10697
CNN for NLP
https://emnlp2014.org/papers/pdf/EMNLP2014181.pdf
DataPerf: Benchmarks for Data-Centric AI Development
https://arxiv.org/abs/2207.10062
Expectation-Maximization Attention Networks for Semantic Segmentation
https://openaccess.thecvf.com/content_ICCV_2019/papers/Li_Expectation-Maximization_Attention_Networks_for_Semantic_Segmentation_ICCV_2019_paper.pdf
Generalized Video Deblurring for Dynamic Scenes
https://arxiv.org/abs/1507.02438
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
https://arxiv.org/abs/1701.06538
From Sparse to Soft Mixtures of Experts
https://arxiv.org/abs/2308.00951
It's hard to review every one of these papers, but they are all excellent, so try to read them all if at all possible..!
The further I go, the harder things are to understand without reading the papers..? Lectures alone probably aren't enough, haha.
Grad-CAM
https://arxiv.org/abs/1610.02391
Grad-CAM++
https://arxiv.org/abs/1710.11063
Saliency map
https://arxiv.org/abs/1312.6034
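The vanilla saliency map from the Simonyan et al. paper above is just the input gradient. A numpy sketch, assuming the gradient of the class score w.r.t. the input image has already been computed by backprop (the array here is a placeholder):

```python
import numpy as np

def saliency_map(input_grad):
    """Vanilla saliency: per-pixel max of |d(score)/d(input)| over color channels.
    input_grad: (C, H, W) gradient of the class score w.r.t. the input image."""
    return np.abs(input_grad).max(axis=0)
```

This is the baseline that Grad-CAM and perturbation methods like IGOS++ improve on; raw input gradients tend to be noisy and scattered.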
IGOS++
https://arxiv.org/abs/2012.15783
Code: https://github.com/khorrams/IGOS_pp