Modifying the data as needed
Adding data
import pandas as pd
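# Load the flight-price dataset (the file is read with cp949 encoding)
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
# Assigning to a new column name appends that column at the right edge
flight['price2'] = flight['price'] * 2
# A new column can also be computed from several existing columns
flight['price3'] = flight['price'] + flight['price2']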
flight.head()
Unnamed: 0 airline flight source_city departure_time stops arrival_time destination_city class duration days_left price price2 price3
0 0 SpiceJet SG-8709 Delhi Evening zero Night Mumbai Economy 2.17 1 5953 11906 17859
1 1 SpiceJet SG-8157 Delhi Early_Morning zero Morning Mumbai Economy 2.33 1 5953 11906 17859
2 2 AirAsia I5-764 Delhi Early_Morning zero Early_Morning Mumbai Economy 2.17 1 5956 11912 17868
3 3 Vistara UK-995 Delhi Morning zero Afternoon Mumbai Economy 2.25 1 5955 11910 17865
4 4 Vistara UK-963 Delhi Morning zero Morning Mumbai Economy 2.33 1 5955 11910 17865
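The same assignment pattern can be tried on a small frame where the result is easy to check by eye. This is a minimal sketch: the `sample` DataFrame and its values are hypothetical and are not taken from Clean_Dataset.csv.

```python
import pandas as pd

# Hypothetical three-row sample; the values are illustrative only
sample = pd.DataFrame({'airline': ['A', 'B', 'C'], 'price': [100, 200, 300]})

# Assigning to a new label appends the column at the right edge
sample['price2'] = sample['price'] * 2

# A scalar value is broadcast to every row
sample['currency'] = 'INR'

print(sample.columns.tolist())  # ['airline', 'price', 'price2', 'currency']
print(sample['price2'].tolist())  # [200, 400, 600]
```

New columns created this way always land at the end; placing a column elsewhere requires a different call.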
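Adding data at a specific position
df.insert(loc, column, value, allow_duplicates=False)
- loc: position (0-based integer) at which the column is inserted
- column: name of the inserted column
- value: values of the inserted column
- allow_duplicates: if True, a column whose name already exists may be inserted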
import pandas as pd
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
# insert() modifies flight in place; position 10 places duration2 right after duration
flight.insert(10, 'duration2', flight['duration'] * 2)
flight
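A hard-coded position such as 10 depends on the column order of the file. The sketch below, on a hypothetical `sample` frame, looks the position up with `columns.get_loc` instead, and also shows two behaviors of `insert` that are easy to miss: it returns None, and it rejects a duplicate name unless `allow_duplicates=True` is passed.

```python
import pandas as pd

# Hypothetical sample; the values are illustrative only
sample = pd.DataFrame({'duration': [2.17, 2.33],
                       'days_left': [1, 1],
                       'price': [5953, 5956]})

# Compute the slot right after 'duration' instead of hard-coding it
pos = sample.columns.get_loc('duration') + 1

# insert() changes sample in place and returns None
returned = sample.insert(pos, 'duration2', sample['duration'] * 2)
print(returned)                  # None
print(sample.columns.tolist())   # ['duration', 'duration2', 'days_left', 'price']

# Inserting a name that already exists raises ValueError by default
try:
    sample.insert(0, 'price', 0)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)        # True
```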
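Deleting data
df.drop(labels, axis=0, inplace=False)
axis = 1: delete along columns (labels are column names)
axis = 0: delete along rows (labels are index values)
inplace: if True, the deletion is applied to the original data
Deleting by column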
import pandas as pd
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
# drop() returns a new DataFrame here; flight itself still contains price
flight.drop('price', axis=1).head()
Unnamed: 0 airline flight source_city departure_time stops arrival_time destination_city class duration days_left
0 0 SpiceJet SG-8709 Delhi Evening zero Night Mumbai Economy 2.17 1
1 1 SpiceJet SG-8157 Delhi Early_Morning zero Morning Mumbai Economy 2.33 1
2 2 AirAsia I5-764 Delhi Early_Morning zero Early_Morning Mumbai Economy 2.17 1
3 3 Vistara UK-995 Delhi Morning zero Afternoon Mumbai Economy 2.25 1
4 4 Vistara UK-963 Delhi Morning zero Morning Mumbai Economy 2.33 1
Deleting by row
import pandas as pd
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
# With axis=0 the label must be an index value; 'price' is a column name and would raise KeyError
flight.drop(0, axis=0).head()
Unnamed: 0 airline flight source_city departure_time stops arrival_time destination_city class duration days_left price
1 1 SpiceJet SG-8157 Delhi Early_Morning zero Morning Mumbai Economy 2.33 1 5953
2 2 AirAsia I5-764 Delhi Early_Morning zero Early_Morning Mumbai Economy 2.17 1 5956
3 3 Vistara UK-995 Delhi Morning zero Afternoon Mumbai Economy 2.25 1 5955
4 4 Vistara UK-963 Delhi Morning zero Morning Mumbai Economy 2.33 1 5955
5 5 Vistara UK-945 Delhi Morning zero Afternoon Mumbai Economy 2.33 1 5955
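The relationship between `axis` and the labels passed to `drop` can be checked on a small frame. This is a sketch on a hypothetical `sample` DataFrame; it also uses the `index=` and `columns=` keywords, which are an alternative spelling of `axis=0` and `axis=1`.

```python
import pandas as pd

# Hypothetical sample; the values are illustrative only
sample = pd.DataFrame({'airline': ['A', 'B', 'C'],
                       'price': [100, 200, 300],
                       'stops': [0, 1, 0]})

no_price = sample.drop('price', axis=1)   # same as sample.drop(columns='price')
no_first = sample.drop(0, axis=0)         # same as sample.drop(index=0)

# Rows and columns can be removed in one call with the keyword form
trimmed = sample.drop(index=[0, 2], columns=['stops'])

# A column name is not a row label, so axis=0 cannot find it
try:
    sample.drop('price', axis=0)
    raised = False
except KeyError:
    raised = True

print(no_price.columns.tolist())  # ['airline', 'stops']
print(no_first.index.tolist())    # [1, 2]
print(trimmed.shape)              # (1, 2)
print(sample.shape)               # (3, 3), the original is unchanged
```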
Deleting from the original data
Assigning the result back to the DataFrame after deletion
import pandas as pd
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
# drop() returns a new DataFrame, so assign it back to keep the change
flight = flight.drop(index=0)
flight.head()
Using the inplace option
import pandas as pd
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
# With inplace=True, drop() returns None, so head() must be called on flight separately
flight.drop(index=0, inplace=True)
flight.head()
Unnamed: 0 airline flight source_city departure_time stops arrival_time destination_city class duration days_left price
1 1 SpiceJet SG-8157 Delhi Early_Morning zero Morning Mumbai Economy 2.33 1 5953
2 2 AirAsia I5-764 Delhi Early_Morning zero Early_Morning Mumbai Economy 2.17 1 5956
3 3 Vistara UK-995 Delhi Morning zero Afternoon Mumbai Economy 2.25 1 5955
4 4 Vistara UK-963 Delhi Morning zero Morning Mumbai Economy 2.33 1 5955
5 5 Vistara UK-945 Delhi Morning zero Afternoon Mumbai Economy 2.33 1 5955
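Because `inplace=True` changes the frame and returns None, chaining another method onto the call fails. The sketch below shows this on a hypothetical `sample` frame, and adds `reset_index(drop=True)` as one possible way to renumber the rows afterward, since the output above starts at index 1.

```python
import pandas as pd

# Hypothetical sample; the values are illustrative only
sample = pd.DataFrame({'price': [100, 200, 300]})

# The call mutates sample and returns None
returned = sample.drop(index=0, inplace=True)
print(returned)               # None
print(sample.index.tolist())  # [1, 2]

# Optional: renumber the remaining rows from 0
renumbered = sample.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1]
```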
Renaming columns
df.rename(columns={'old_name': 'new_name', ...})
import pandas as pd
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
flight = flight.rename(columns = {'airline' : 'airline_name'})
flight.head()
Unnamed: 0 airline_name flight source_city departure_time stops arrival_time destination_city class duration days_left price
0 0 SpiceJet SG-8709 Delhi Evening zero Night Mumbai Economy 2.17 1 5953
1 1 SpiceJet SG-8157 Delhi Early_Morning zero Morning Mumbai Economy 2.33 1 5953
2 2 AirAsia I5-764 Delhi Early_Morning zero Early_Morning Mumbai Economy 2.17 1 5956
3 3 Vistara UK-995 Delhi Morning zero Afternoon Mumbai Economy 2.25 1 5955
4 4 Vistara UK-963 Delhi Morning zero Morning Mumbai Economy 2.33 1 5955
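Several columns can be renamed in one call by adding more pairs to the dictionary. One detail worth knowing is that keys that match no column are ignored silently by default, so a typo goes unnoticed; passing `errors='raise'` turns that into a KeyError. A minimal sketch on a hypothetical `sample` frame (the new names `seat_class` and `x` are made up for illustration):

```python
import pandas as pd

# Hypothetical sample; the values are illustrative only
sample = pd.DataFrame({'airline': ['A'], 'class': ['Economy'], 'price': [100]})

# Rename two columns at once; rename() returns a new DataFrame
renamed = sample.rename(columns={'airline': 'airline_name', 'class': 'seat_class'})
print(renamed.columns.tolist())  # ['airline_name', 'seat_class', 'price']

# A key that matches no column is ignored by default
unchanged = sample.rename(columns={'no_such_column': 'x'})
print(unchanged.columns.tolist())  # ['airline', 'class', 'price']

# errors='raise' reports the missing key instead
try:
    sample.rename(columns={'no_such_column': 'x'}, errors='raise')
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```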
Sorting a DataFrame
df.sort_values(by='column_name', ascending=True | False)
import pandas as pd
flight = pd.read_csv('/Clean_Dataset.csv', encoding='cp949')
flight = flight.rename(columns = {'airline' : 'airline_name'}).sort_values(by='price',ascending=True)
flight.head()
Unnamed: 0 airline_name flight source_city departure_time stops arrival_time destination_city class duration days_left price
205012 205012 Indigo 6E-605 Chennai Afternoon one Evening Hyderabad Economy 4.75 31 1105
205754 205754 Indigo 6E-605 Chennai Afternoon one Night Hyderabad Economy 10.08 39 1105
205024 205024 Indigo 6E-6137 Chennai Morning one Evening Hyderabad Economy 8.83 31 1105
204736 204736 AirAsia I5-517 Chennai Morning zero Morning Hyderabad Economy 1.17 28 1105
205023 205023 Indigo 6E-6113 Chennai Afternoon one Night Hyderabad Economy 8.67 31 1105
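The output above keeps the original row labels (205012, 205754, ...) because sorting reorders rows without renumbering them. The sketch below, on a hypothetical `sample` frame, shows descending order, sorting by two columns with a direction per column, and `ignore_index=True` as one way to get a fresh 0-based index.

```python
import pandas as pd

# Hypothetical sample; the values are illustrative only
sample = pd.DataFrame({'airline': ['B', 'A', 'C', 'A'],
                       'price': [200, 100, 200, 300]})

# Highest price first
by_price_desc = sample.sort_values(by='price', ascending=False)
print(by_price_desc['price'].tolist())  # [300, 200, 200, 100]

# Price ascending, then airline descending within equal prices
multi = sample.sort_values(by=['price', 'airline'], ascending=[True, False])
print(multi.index.tolist())  # [1, 2, 0, 3]

# ignore_index=True replaces the old labels with 0..n-1
renumbered = sample.sort_values(by='price', ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```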