- Reference blog: https://wegonnamakeit.tistory.com/46
dropout = torch.nn.Dropout(p=drop_prob)
model = torch.nn.Sequential(linear1, relu, dropout, linear2, relu, dropout, linear3).to(device)  # no activation/dropout after the output layer
# L1/L2 penalties on the first two linear layers (lambda1, lambda2 and cross_entropy_loss are defined elsewhere)
all_linear1_params = torch.cat([x.view(-1) for x in linear1.parameters()])
all_linear2_params = torch.cat([x.view(-1) for x in linear2.parameters()])
l1_regularization = lambda1 * torch.norm(all_linear1_params, 1)
l2_regularization = lambda2 * torch.norm(all_linear2_params, 2)
loss = cross_entropy_loss + l1_regularization + l2_regularization
# weight_decay adds an L2 penalty on all parameters directly inside the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
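Putting these pieces together, a minimal end-to-end training sketch with dropout and the L1/L2 penalties (the 784/512/10 layer sizes, penalty weights, and the data_loader of MNIST mini-batches are assumptions for illustration, not values from the referenced post):

import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
drop_prob, lambda1, lambda2 = 0.3, 1e-4, 1e-4    # assumed hyperparameters

linear1 = torch.nn.Linear(784, 512, bias=True)
linear2 = torch.nn.Linear(512, 512, bias=True)
linear3 = torch.nn.Linear(512, 10, bias=True)
relu = torch.nn.ReLU()
dropout = torch.nn.Dropout(p=drop_prob)
model = torch.nn.Sequential(linear1, relu, dropout,
                            linear2, relu, dropout,
                            linear3).to(device)

criterion = torch.nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)

model.train()                                    # enable dropout during training
for X, Y in data_loader:                         # data_loader: MNIST mini-batches (assumed to exist)
    X = X.view(-1, 28 * 28).to(device)           # flatten 28x28 images to 784-dim vectors
    Y = Y.to(device)

    cross_entropy_loss = criterion(model(X), Y)
    l1 = lambda1 * torch.norm(torch.cat([p.view(-1) for p in linear1.parameters()]), 1)
    l2 = lambda2 * torch.norm(torch.cat([p.view(-1) for p in linear2.parameters()]), 2)
    loss = cross_entropy_loss + l1 + l2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

model.eval()                                     # disable dropout for evaluation/inference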
Xavier initialization (suited for sigmoid/tanh)
torch.nn.init.xavier_uniform_(linear1.weight)  # samples weights from a uniform distribution
torch.nn.init.xavier_normal_(linear1.weight)   # samples weights from a normal (Gaussian) distribution
He initialization (suited for ReLU)
torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
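A minimal sketch of applying these initializers to concrete layers (the 784/256/10 shapes are an assumption; note that PyTorch's in-place initializers carry a trailing underscore):

import torch

# assumed MNIST-style layer shapes for illustration
linear1 = torch.nn.Linear(784, 256, bias=True)
linear2 = torch.nn.Linear(256, 256, bias=True)
linear3 = torch.nn.Linear(256, 10, bias=True)

# He (Kaiming) initialization for the ReLU hidden layers
torch.nn.init.kaiming_normal_(linear1.weight, mode='fan_in', nonlinearity='relu')
torch.nn.init.kaiming_normal_(linear2.weight, mode='fan_in', nonlinearity='relu')

# Xavier initialization would be the choice for sigmoid/tanh layers
torch.nn.init.xavier_uniform_(linear3.weight)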
Reference: https://wegonnamakeit.tistory.com/47
bn1 = torch.nn.BatchNorm1d(32)  # BatchNorm1d for Linear outputs (BatchNorm2d expects 4-D conv feature maps)
bn2 = torch.nn.BatchNorm1d(32)
bn_model = torch.nn.Sequential(linear1, bn1, relu, linear2, bn2, relu, linear3).to(device)
Batch normalization was introduced to address the problem that each layer's input distribution keeps changing during training:
- Internal Covariate Shift: the distribution of the inputs to each layer/activation shifts as the preceding layers are updated
- Whitening: transforming inputs to zero mean and unit variance with decorrelated features; batch normalization instead applies a cheaper per-mini-batch normalization (see the usage sketch below)
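A minimal usage sketch for the batch-norm model (the 784/32/10 layer shapes are assumed for illustration; batch normalization keeps running statistics, so switching between train and eval mode matters):

import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

linear1 = torch.nn.Linear(784, 32, bias=True)
linear2 = torch.nn.Linear(32, 32, bias=True)
linear3 = torch.nn.Linear(32, 10, bias=True)
relu = torch.nn.ReLU()
bn1 = torch.nn.BatchNorm1d(32)
bn2 = torch.nn.BatchNorm1d(32)

bn_model = torch.nn.Sequential(linear1, bn1, relu,
                               linear2, bn2, relu,
                               linear3).to(device)

bn_model.train()   # normalize with mini-batch statistics and update the running mean/variance
# ... training loop ...
bn_model.eval()    # normalize with the accumulated running statistics instead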
Model architecture
# Layer 1: convolutional layer
Convolution(in_channels=1, out_channels=32, kernel_size=3, stride=1, padding=1) + ReLU activation
Max pooling(kernel_size=2, stride=2)
# Layer 2: convolutional layer
Convolution(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1) + ReLU activation
Max pooling(kernel_size=2, stride=2)
# Layer 3: fully-connected layer
Flatten the feature map  # batch_size × 7 × 7 × 64 → batch_size × 3136
Fully-connected layer (10 neurons) + Softmax activation
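A minimal PyTorch sketch of this architecture (assuming 1×28×28 MNIST inputs; the model returns raw logits because torch.nn.CrossEntropyLoss applies the softmax internally):

import torch

class CNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Layer 1: conv (1 -> 32 channels) + ReLU + 2x2 max pooling  => 32 x 14 x 14
        self.layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))
        # Layer 2: conv (32 -> 64 channels) + ReLU + 2x2 max pooling => 64 x 7 x 7
        self.layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))
        # Layer 3: fully-connected layer, 7*7*64 = 3136 features -> 10 classes
        self.fc = torch.nn.Linear(7 * 7 * 64, 10, bias=True)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)   # flatten to (batch_size, 3136)
        return self.fc(out)               # logits; softmax happens inside CrossEntropyLoss

model = CNN().to('cuda' if torch.cuda.is_available() else 'cpu')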