🌷 Iris Multiclass Classification 🌷

parkeu · October 1, 2022

ABC Bootcamp

🐼 Problem Definition

  • λ…λ¦½λ³€μˆ˜ : 4개 λ³€μˆ˜(꽃받침길이, κ½ƒλ°›μΉ¨λ„ˆλΉ„, κ½ƒμžŽκΈΈμ΄, κ½ƒμžŽλ„ˆλΉ„)
  • μ’…μ†λ³€μˆ˜ : μ•„μ΄λ¦¬μŠ€ ν’ˆμ’… 3가지(Iris Setosa, Iris Versicolour, Iris Virginica)
  • 데이터셋 총 수 150 (각 ν’ˆμ’… 별 50개 데이터)
  • 4개 λ³€μˆ˜(각 κ½ƒμ˜ 길이 λ„ˆλΉ„ λ“±)λ₯Ό λ…λ¦½λ³€μˆ˜λ‘œ 보고 μ•„μ΄λ¦¬μŠ€ ν’ˆμ’…μ„ λΆ„λ₯˜ν•˜λŠ” 닀쀑 λΆ„λ₯˜ 문제둜 μ •μ˜
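To sanity-check the numbers above, here is a minimal sketch that inspects the dataset's shape and class balance. It assumes scikit-learn's bundled copy of Iris (`load_iris`) rather than the UCI file used later in this post; the names `X_all` and `y_all` are just for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled Iris copy instead of the UCI download
iris = load_iris()
X_all, y_all = iris.data, iris.target

print(X_all.shape)         # (samples, features)
print(iris.feature_names)  # the 4 independent variables
# Number of samples per species
print(dict(zip(iris.target_names, np.bincount(y_all).tolist())))
```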

Library Imports

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import RMSprop

from sklearn import preprocessing
import warnings
warnings.filterwarnings('ignore')

from sklearn.model_selection import train_test_split

Data Preparation

dataset_path = tf.keras.utils.get_file("iris.data", "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data")
- The raw file has no header row, so the column names are passed explicitly when loading it
column_names = ['sepal length','sepal width','petal length','petal width','class']
raw_dataset = pd.read_csv(dataset_path, names=column_names)

dataset = raw_dataset.copy()
  • dataset

Data Preprocessing

Iris-setosa -> 0
Iris-versicolor -> 1
Iris-virginica -> 2

label_encoder = preprocessing.LabelEncoder()

l_class = label_encoder.fit_transform(dataset['class'])

dataset['class'] = l_class
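The 0/1/2 mapping comes from `LabelEncoder` numbering the class names in sorted order. A minimal NumPy sketch of the same idea, using a made-up four-row sample of the `class` column (the `labels` array is hypothetical, not the real data):

```python
import numpy as np

# Hypothetical mini-sample of the 'class' column, spelled as in iris.data
labels = np.array(["Iris-virginica", "Iris-setosa", "Iris-versicolor", "Iris-setosa"])

# np.unique sorts the distinct names; return_inverse gives each row's index
# into that sorted list, which is the integer code for the row
classes, encoded = np.unique(labels, return_inverse=True)
mapping = {str(name): i for i, name in enumerate(classes)}

print(mapping)
print(encoded)
```

After fitting, `label_encoder.classes_` holds the same sorted names, so it can be used to turn predicted integers back into species names.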

Dataset Creation

X = dataset[['sepal length','sepal width','petal length','petal width']]
y = dataset['class']
# Training data: 120 samples (80%), test data: 30 samples (20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=7)
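A quick way to confirm the 120/30 split is to check the shapes it returns. The sketch below uses placeholder arrays (`X_demo`, `y_demo`) instead of the real DataFrame, and adds `stratify`, which the code above does not use; that is an optional variation that keeps the three species equally represented in the test set.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same shape as Iris: 150 rows, 4 features, 50 rows per class
X_demo = np.arange(150 * 4).reshape(150, 4)
y_demo = np.repeat([0, 1, 2], 50)

# stratify=y_demo is an addition for illustration; it preserves class proportions
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=7, stratify=y_demo
)

print(X_tr.shape, X_te.shape)  # training and test set sizes
print(np.bincount(y_te))       # test samples per class
```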

One-Hot Encoding the class Labels

from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
  • Printing y_train confirms that each label is now a one-hot vector of length 3.
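For intuition, one-hot encoding can be reproduced in plain NumPy by indexing the rows of an identity matrix. This is only a sketch of the idea with a made-up label vector (`y_int`), not what `to_categorical` does internally:

```python
import numpy as np

# Hypothetical integer labels (0 = setosa, 1 = versicolor, 2 = virginica)
y_int = np.array([0, 2, 1, 2])

# Row i of the 3x3 identity matrix is the one-hot vector for class i
one_hot = np.eye(3)[y_int]
print(one_hot)

# argmax along each row recovers the original integer labels
recovered = one_hot.argmax(axis=1)
print(recovered)
```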

λͺ¨λΈ ꡬ성

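# Seed both NumPy and TensorFlow; Keras weight initialization uses TensorFlow's RNG
np.random.seed(7)
tf.random.set_seed(7)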

model = Sequential()
# Hidden layer: 4 input features, 16 units, ReLU activation
model.add(Dense(16, input_shape=(4, ), activation='relu'))
# Output layer: 3 units (one per species), softmax activation
model.add(Dense(3, activation='softmax'))
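To see what this 4 → 16 → 3 network computes, here is a rough NumPy sketch of its forward pass. The weights are random stand-ins (Keras uses its own initializers and learns the values during training), so only the shapes and the parameter count carry over to the real model.

```python
import numpy as np

rng = np.random.default_rng(7)

# Random stand-in weights with the same shapes as the two Dense layers
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)  # hidden layer
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)   # output layer

def forward(x):
    h = np.maximum(0, x @ W1 + b1)                        # ReLU
    logits = h @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)           # softmax

x = rng.normal(size=(5, 4))  # 5 made-up samples with 4 features each
probs = forward(x)

# (4*16 + 16) + (16*3 + 3) trainable parameters
n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, probs.sum(axis=1), n_params)
```

Each output row is a probability distribution over the three species, and the parameter count should match the total reported by `model.summary()`.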

λͺ¨λΈ ν•™μŠ΅

# Loss function: categorical_crossentropy, optimizer: Adam, metric: accuracy
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=200, batch_size=10)
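As a reference for what the loss measures, below is a small NumPy sketch of categorical cross-entropy on two made-up predictions. The function name and the `eps` clipping value are illustrative choices, not Keras's exact implementation. The last lines also work out how many weight updates one epoch performs with 120 training samples and `batch_size=10`.

```python
import math
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then average -sum(y_true * log(y_pred)) over samples
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

# Two made-up samples: one-hot targets and predicted probabilities
y_true = np.array([[1, 0, 0], [0, 1, 0]])
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])

loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))

# 120 training samples in batches of 10
steps_per_epoch = math.ceil(120 / 10)
print(steps_per_epoch)
```

Only the probability assigned to the true class contributes to the loss, so more confident correct predictions push it toward 0.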

λͺ¨λΈ 평가

scores = model.evaluate(X_test, y_test)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
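The accuracy that `evaluate` reports can be reproduced by hand: take the argmax of each predicted probability row and compare it with the argmax of the one-hot target. A minimal sketch with made-up probabilities (in practice `probs` would come from `model.predict(X_test)`):

```python
import numpy as np

# Made-up softmax outputs for 3 samples and their one-hot targets
probs = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.60, 0.30],
                  [0.20, 0.50, 0.30]])
y_true_onehot = np.array([[1, 0, 0],
                          [0, 1, 0],
                          [0, 0, 1]])

pred = probs.argmax(axis=1)          # predicted class index per sample
true = y_true_onehot.argmax(axis=1)  # true class index per sample
accuracy = float((pred == true).mean())

class_names = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
print([class_names[i] for i in pred])
print("accuracy: %.2f%%" % (accuracy * 100))
```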


👩🏻‍🎓 [Reference] Professor's Code

📊 Chart

f, ax = plt.subplots(1,2, figsize=(12,6))
dataset['class'].value_counts().plot.pie(explode=None, autopct='%1.2f%%', ax=ax[0])
ax[0].set_title('iris class pie chart')
ax[0].set_ylabel('')

sns.countplot(x='class', data=dataset, ax=ax[1])
ax[1].set_title('Count of iris class')
ax[1].set_ylabel('')
plt.show()

🔧 Converting Labels to Categorical Form

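from sklearn.preprocessing import LabelEncoder
# keras.utils.np_utils is unavailable in recent Keras releases; to_categorical does the same job
from tensorflow.keras.utils import to_categorical

encoder = LabelEncoder()
encoder.fit(y)
Y_encoded = encoder.transform(y)  # integer-encode the labels
Y = to_categorical(Y_encoded)     # one-hot encode the integer labels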