1. Scatterplot
1.1 seaborn.scatterplot
seaborn.scatterplot(data=None, x=None, y=None, hue=None, size=None, sizes=None, legend='auto')
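- data: the dataset to plot (for example, a pandas DataFrame)
- x, y: the variables plotted on the x and y axes
- hue: the variable that determines point color
- size: the variable that determines point size
- sizes: the range of marker sizes, given as (min, max)
- legend: whether and how the legend is drawn ('auto' by default, False to hide it)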
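To make the role of sizes concrete, here is a minimal sketch of how a numeric size variable can be mapped onto a (min, max) range. The helper scale_sizes is hypothetical and written only for illustration. It assumes a plain linear mapping, which approximates what seaborn does for numeric data, and the library's internals may differ in detail.

```python
def scale_sizes(values, sizes=(10, 200)):
    """Hypothetical helper: linearly map values onto the (min, max) size range."""
    lo, hi = sizes
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All values are equal; using the smallest size is an arbitrary choice here.
        return [float(lo)] * len(values)
    # Normalize each value to [0, 1], then interpolate between lo and hi.
    return [lo + (v - vmin) / (vmax - vmin) * (hi - lo) for v in values]


body_mass = [3000, 4500, 6000]  # made-up values
print(scale_sizes(body_mass))   # [10.0, 105.0, 200.0]
```

With sizes=(10, 200), the smallest value gets the smallest marker and the largest value gets the largest marker, with everything else in between.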
1.2 plotly.express.scatter
plotly.express.scatter(data_frame=None, x=None, y=None, color=None, size=None)
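- data_frame: the DataFrame to plot
- x, y: the columns plotted on the x and y axes
- color: the column that determines point color
- size: the column that determines point size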
2. Scatterplot examples
Example 1: penguins dataset
import seaborn as sns
import matplotlib.pyplot as plt

df = sns.load_dataset('penguins')
df

numerical_columns = df.columns[2:6]
numerical_columns
sns.scatterplot(data=df, x=numerical_columns[0], y=numerical_columns[1], hue='species', size=numerical_columns[3], sizes=(10, 200))
plt.show()

sns.scatterplot(data=df, x=numerical_columns[0], y=numerical_columns[1], hue='sex', size=numerical_columns[3], sizes=(10, 200))
plt.show()

import plotly.express as px
fig = px.scatter(df.dropna(), x=numerical_columns[0], y=numerical_columns[1], color='species', size=numerical_columns[3])
fig.update_layout(width=1000, height=600)
fig.show()
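The call above passes df.dropna() because px.scatter needs a numeric value for every marker size, and a missing value in the size column typically raises an error. Note that dropna() with no arguments also drops rows that are only missing a column that is not plotted. The sketch below uses a made-up toy DataFrame and a hypothetical helper, rows_for_scatter, to show one way of dropping only the rows that lack a plotted value.

```python
import pandas as pd


def rows_for_scatter(df, columns):
    """Hypothetical helper: keep rows that have a value in every plotted column."""
    return df.dropna(subset=list(columns))


# Toy data (made up) with the same column names as the penguins dataset.
toy = pd.DataFrame({
    'bill_length_mm': [39.1, None, 46.5, 40.3],
    'bill_depth_mm': [18.7, 17.4, 14.8, 18.0],
    'body_mass_g': [3750.0, 3800.0, None, 3250.0],
    'sex': ['Male', 'Female', 'Female', None],
})

plotted = ['bill_length_mm', 'bill_depth_mm', 'body_mass_g']
plot_df = rows_for_scatter(toy, plotted)

print(len(plot_df))       # 2: the row missing only 'sex' is kept
print(len(toy.dropna()))  # 1: a plain dropna() also drops that row
```

If a column such as sex is used for color, it would need to be added to the subset as well.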

# Same plot, colored by sex to mirror the second seaborn plot above
fig = px.scatter(df.dropna(), x=numerical_columns[0], y=numerical_columns[1], color='sex', size=numerical_columns[3])
fig.update_layout(width=1000, height=600)
fig.show()

Example 2: iris dataset
df = sns.load_dataset('iris')
df

numerical_columns = df.columns[:4]
numerical_columns
sns.scatterplot(data=df, x=numerical_columns[0], y=numerical_columns[1], hue='species', size=numerical_columns[3], sizes=(10, 200))
plt.show()

fig = px.scatter(df, x=numerical_columns[0], y=numerical_columns[1], color='species', size=numerical_columns[3])
fig.update_layout(width=1000, height=600)
fig.show()

Example 3: time series data - airline passengers (flights) dataset
df = sns.load_dataset('flights')
df

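def monthToNum(month):
    # Map a three-letter month abbreviation (e.g. 'Jan') to its number (1-12).
    month_dict = {
        'Jan': 1,
        'Feb': 2,
        'Mar': 3,
        'Apr': 4,
        'May': 5,
        'Jun': 6,
        'Jul': 7,
        'Aug': 8,
        'Sep': 9,
        'Oct': 10,
        'Nov': 11,
        'Dec': 12
    }
    # Labels that are not in the dictionary map to None.
    return month_dict.get(month)

# load_dataset returns 'month' as a categorical column;
# cast the result to int so it is plotted on a numeric axis.
df['month'] = df['month'].apply(monthToNum).astype(int)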
df
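As an alternative to typing out the month dictionary by hand, the same lookup can be built from the standard library. This is a minimal sketch, not the method used above: MONTH_TO_NUM and month_to_num are names introduced here for illustration, and calendar.month_abbr follows the current locale, so the abbreviations are 'Jan' to 'Dec' only under an English (or default C) locale.

```python
import calendar

# calendar.month_abbr[0] is an empty string, so it is skipped.
# Assumes an English or default C locale for the abbreviations.
MONTH_TO_NUM = {abbr: num for num, abbr in enumerate(calendar.month_abbr) if abbr}


def month_to_num(month):
    """Return 1-12 for a three-letter month abbreviation, or None if unknown."""
    return MONTH_TO_NUM.get(month)


print(month_to_num('Jan'), month_to_num('Dec'))
```

It can be applied to the column in the same way as the hand-written version, for example with df['month'].apply(month_to_num).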

sns.scatterplot(data=df, x='year', y='month', hue='passengers', size='passengers', sizes=(10, 200), legend=False, palette='RdYlBu_r')
plt.show()

fig = px.scatter(df, x='year', y='month', color='passengers', size='passengers')
fig.update_layout(width=1000, height=600)
fig.show()
