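Using a dataset from Kaggle, I will carry out data preprocessing and feature engineering for customer classification (new vs. existing), along with time-series visualization.
Beyond that, I would like to go as far as deploying the result using EC3 / EC2.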
Context
Typically e-commerce datasets are proprietary and consequently hard to find among publicly available data. However, the UCI Machine Learning Repository has released this dataset containing actual transactions from 2010 and 2011. The dataset is maintained on their site, where it can be found by the title "Online Retail".
Content
"This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers."
Acknowledgements
Per the UCI Machine Learning Repository, this data was made available by Dr Daqing Chen, Director: Public Analytics group. chend '@' lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.
Image from stocksnap.io.
Inspiration
Analyses for this dataset could include time series, clustering, classification and more.
https://www.kaggle.com/datasets/carrie1/ecommerce-data
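The data I will be working with is an E-commerce dataset that collects the purchases made by roughly 4,000 customers over about one year (2010-12-01 to 2011-12-09).
I will attempt to separate new customers from existing customers.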
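As a rough illustration of what a new/existing split could look like, here is a minimal sketch on a tiny synthetic table. The column names `CustomerID` and `InvoiceDate` are assumed to match the Kaggle CSV, and the labeling rule (a purchase counts as "new" if it falls in the same calendar month as that customer's first purchase) is only one possible definition chosen for illustration, not necessarily the one used later.

```python
import pandas as pd

# Tiny synthetic sample; column names are assumed to mirror the Kaggle CSV.
df = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 3],
    "InvoiceDate": pd.to_datetime([
        "2010-12-01", "2011-03-15", "2011-03-02", "2011-03-20", "2011-04-05",
    ]),
})

# Date of each customer's first purchase, broadcast back to every row.
df["FirstPurchase"] = df.groupby("CustomerID")["InvoiceDate"].transform("min")

# Assumed rule: a purchase is "new" when it happens in the same calendar
# month as the customer's first purchase, otherwise "existing".
same_month = (
    df["InvoiceDate"].dt.to_period("M") == df["FirstPurchase"].dt.to_period("M")
)
df["CustomerType"] = same_month.map({True: "new", False: "existing"})

print(df[["CustomerID", "InvoiceDate", "CustomerType"]])
```

With a monthly rule like this, counting distinct customers per month and `CustomerType` would give a time series of new versus existing customers that can be plotted directly.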
!pip install pandas-profiling
!pip install missingno