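Studied the task of classifying object type and state on MVTec data, from the Computer Vision Anomaly Detection Algorithm Competition hosted by Dacon.
Goal: extend from anomaly detection on computer vision data to anomaly detection on time-series data.
The data info is provided pre-organized as CSV.
train.csv gives the class and state for each file_name, so the task is supervised classification.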
index | file_name | class | state | label |
---|---|---|---|---|
0 | 10000.png | transistor | good | transistor-good |
1 | 10001.png | capsule | good | capsule-good |
...
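Judging from the rows above, the label column looks like class and state joined with a hyphen. A minimal sketch of how such labels could be turned into integer targets for supervised classification follows. The toy DataFrame stands in for train.csv, its third row is hypothetical, and `label_to_id` and `label_id` are illustrative names that do not appear in the original notes.

```python
import pandas as pd

# Toy stand-in for train.csv. The first two rows come from the table above;
# the third is a hypothetical row added only to show a non-"good" state.
train_y = pd.DataFrame({
    "file_name": ["10000.png", "10001.png", "10002.png"],
    "class": ["transistor", "capsule", "capsule"],
    "state": ["good", "good", "crack"],
})

# Rebuild the label column as "<class>-<state>"
train_y["label"] = train_y["class"] + "-" + train_y["state"]

# Map each distinct label to an integer id (sorted so the order is stable)
label_to_id = {
    label: idx for idx, label in enumerate(sorted(train_y["label"].unique()))
}
train_y["label_id"] = train_y["label"].map(label_to_id)

print(train_y[["file_name", "label", "label_id"]])
```

Sorting the labels before enumerating keeps the mapping reproducible across runs, which matters when predictions have to be decoded back into label strings.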
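import matplotlib.pyplot as plt

# train_y is assumed to be the DataFrame loaded from train.csv.
# Bar chart of the number of images per object class.
f, axs = plt.subplots(1, 1, figsize=(15, 8))
for i, col in enumerate(['class']):
    object_cnt = train_y[col].value_counts().sort_values(ascending=False)
    axs.bar(object_cnt.index, object_cnt.values)
    # Annotate each bar with its count and its percentage share
    for x, y, z in zip(object_cnt.index, object_cnt.values, object_cnt.values / object_cnt.sum() * 100):
        axs.annotate('%d\n(%d%%)' % (int(y), z), xy=(x, y + 10), textcoords='data', ha='center')
    axs.axis(ymin=0, ymax=int(max(object_cnt) * 1.1))
    # Rotate tick labels without calling set_xticklabels, which warns when ticks are not fixed
    axs.tick_params(axis='x', labelrotation=75)
    axs.set_title(col)
f.tight_layout()
plt.show()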
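The plots that follow use `train_class` and `class_cnt`, which these notes never define. One plausible reading, offered only as an assumption, is that `train_class` is the class column and `class_cnt` maps each class to the counts of its states. A small sketch on hypothetical rows:

```python
import pandas as pd

# Toy stand-in for train.csv (hypothetical rows, for illustration only)
train_y = pd.DataFrame({
    "class": ["capsule", "capsule", "capsule", "transistor", "transistor"],
    "state": ["good", "good", "crack", "good", "misplaced"],
})

# Assumed definition: the class column as a Series
train_class = train_y["class"]

# Assumed definition: for every class, the number of images per state
class_cnt = {
    col: train_y.loc[train_class == col, "state"].value_counts()
    for col in train_class.unique()
}

print(class_cnt["capsule"])
```

With this shape, `class_cnt[col].index` lists the states of a class and `class_cnt[col].values` gives their counts, which matches how the plotting code indexes it.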
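# train_class and class_cnt are not defined in these notes; presumably train_class is
# train_y['class'] and class_cnt maps each class to the value_counts() of its states.
# One panel per class (3 x 5 grid) showing the state distribution within that class.
f, axs = plt.subplots(3, 5, figsize=(20, 8))
axs = axs.flatten()
for i, col in enumerate(train_class.unique()):
    axs[i].bar(class_cnt[col].index, class_cnt[col].values)
    # Annotate each bar with its count and its percentage share within the class
    for x, y, z in zip(class_cnt[col].index, class_cnt[col].values, class_cnt[col].values / class_cnt[col].sum() * 100):
        axs[i].annotate('%d\n(%d%%)' % (int(y), z), xy=(x, y + 10), textcoords='data', ha='center')
    axs[i].tick_params(axis='x', labelrotation=45)
    axs[i].set_title(col)
f.tight_layout()
plt.show()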
import cv2

# Show one randomly sampled image per state, one figure per class.
for col in train_class.unique():
    # squeeze=False keeps axs a 2-D array even if a class has a single state
    fig, axs = plt.subplots(1, len(class_cnt[col].index), figsize=(20, 20), squeeze=False)
    axs = axs.flatten()
    for i, state in enumerate(class_cnt[col].index):
        sample_row = train_y.loc[(train_y['class'] == col) & (train_y['state'] == state)].sample(1)
        img = cv2.imread('/content/open/train/' + sample_row['file_name'].iloc[0])
        # OpenCV loads images as BGR; convert to RGB for matplotlib
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        axs[i].imshow(img)
        axs[i].set_title(str(col) + ' ' + str(state))
    plt.tight_layout()
    plt.show()