2877. Create a DataFrame from List

Write a solution to create a DataFrame from a 2D list called student_data. This 2D list contains the IDs and ages of some students.

The DataFrame should have two columns, student_id and age, and be in the same order as the original 2D list.

The result format is in the following example.

student_data:
[ 
[1, 15],
[2, 11],
[3, 11],
[4, 20]
]

 

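The answer is:

import pandas as pd
from typing import List

def createDataframe(student_data: List[List[int]]) -> pd.DataFrame:
    # Each inner list is one row; name the two columns explicitly.
    df = pd.DataFrame(student_data, columns=['student_id', 'age'])
    return df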
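
As a quick check, here is a small sketch that runs the same construction on the sample student_data from the problem statement. It only uses the names already shown above.

```python
import pandas as pd

# Sample input from the problem statement.
student_data = [[1, 15], [2, 11], [3, 11], [4, 20]]

# Each inner list becomes one row; `columns` names the two fields in order.
df = pd.DataFrame(student_data, columns=["student_id", "age"])

print(df.shape)          # (4, 2)
print(list(df.columns))  # ['student_id', 'age']
```

The row order matches the input list because pandas builds the rows in the order they are given.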

 

2878. Get the Size of a DataFrame

DataFrame players:

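+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| player_id   | int    |
| name        | object |
| age         | int    |
| position    | object |
| ...         | ...    |
+-------------+--------+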

Write a solution to calculate and display the number of rows and columns of players.

Return the result as an array:

[number of rows, number of columns]

The result format is in the following example.

 

The answer is:

import pandas as pd
from typing import List

def getDataframeSize(players: pd.DataFrame) -> List[int]:
    # shape is a (rows, columns) tuple.
    return [players.shape[0], players.shape[1]]
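
A minimal sketch of what shape returns, using a small made-up players table (the rows below are hypothetical and not from the problem):

```python
import pandas as pd

# Hypothetical sample data; only the dimensions matter here.
players = pd.DataFrame({
    "player_id": [1, 2, 3],
    "name": ["A", "B", "C"],
    "age": [20, 25, 30],
})

# `shape` is a (rows, columns) tuple, so list() gives the requested format.
size = list(players.shape)
print(size)  # [3, 3]
```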

 

2880. Select Data

DataFrame students

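+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+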

Write a solution to select the name and age of the student with student_id = 101.

The result format is in the following example.

 

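The answer is:

import pandas as pd

def selectData(students: pd.DataFrame) -> pd.DataFrame:
    # Filter rows with a boolean mask, then keep only the name and age columns.
    return students.loc[students['student_id'] == 101, ['name', 'age']]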
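
To see the row filter in isolation, here is a sketch on made-up data (the names and ages are hypothetical). The key point is that the comparison is applied to the column itself, students["student_id"] == 101, which produces one True/False value per row.

```python
import pandas as pd

# Hypothetical sample data.
students = pd.DataFrame({
    "student_id": [101, 53, 128],
    "name": ["Ann", "Ben", "Cara"],
    "age": [13, 10, 6],
})

# The comparison yields a boolean Series (the row mask).
mask = students["student_id"] == 101

# .loc takes the row mask first and the list of columns second.
result = students.loc[mask, ["name", "age"]]
print(result)
```

Writing the column list as ["name", "age"] keeps the result a DataFrame rather than a Series.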

 

2882. Drop Duplicate Rows

DataFrame customers

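+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| customer_id | int    |
| name        | object |
| email       | object |
+-------------+--------+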

There are some duplicate rows in the DataFrame based on the email column.

Write a solution to remove these duplicate rows and keep only the first occurrence.

The result format is in the following example.

Example 1:
Input:

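+-------------+---------+---------------------+
| customer_id | name    | email               |
+-------------+---------+---------------------+
| 1           | Ella    | emily@example.com   |
| 2           | David   | michael@example.com |
| 3           | Zachary | sarah@example.com   |
| 4           | Alice   | john@example.com    |
| 5           | Finn    | john@example.com    |
| 6           | Violet  | alice@example.com   |
+-------------+---------+---------------------+

Output:

+-------------+---------+---------------------+
| customer_id | name    | email               |
+-------------+---------+---------------------+
| 1           | Ella    | emily@example.com   |
| 2           | David   | michael@example.com |
| 3           | Zachary | sarah@example.com   |
| 4           | Alice   | john@example.com    |
| 6           | Violet  | alice@example.com   |
+-------------+---------+---------------------+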

Explanation:
Alice (customer_id = 4) and Finn (customer_id = 5) both use john@example.com, so only the first occurrence of this email is retained.

 

The answer is:

import pandas as pd

def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:
    # Keep the first row for each email and drop later repeats.
    return customers.drop_duplicates(subset='email', keep='first')
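
The sketch below replays Example 1 with drop_duplicates. It uses the non-in-place form, which returns a new DataFrame and leaves the input alone; the variable name deduped is just for illustration.

```python
import pandas as pd

# Input table from Example 1.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "name": ["Ella", "David", "Zachary", "Alice", "Finn", "Violet"],
    "email": [
        "emily@example.com", "michael@example.com", "sarah@example.com",
        "john@example.com", "john@example.com", "alice@example.com",
    ],
})

# Rows are compared on `email` only; the first row per email is kept.
deduped = customers.drop_duplicates(subset="email", keep="first")
print(deduped["customer_id"].tolist())  # [1, 2, 3, 4, 6]
```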

 

2883. Drop Missing Data

DataFrame students

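+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+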

There are some rows having missing values in the name column.

Write a solution to remove the rows with missing values.

The result format is in the following example.

Example 1:

Input:

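+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 217        | None    | 19  |
| 779        | Georgia | 20  |
| 849        | Willow  | 14  |
+------------+---------+-----+

Output:

+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 779        | Georgia | 20  |
| 849        | Willow  | 14  |
+------------+---------+-----+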

Explanation:
The student with id 217 has an empty value in the name column, so that row is removed.

 

The answer is:

import pandas as pd
def dropMissingData(students: pd.DataFrame) -> pd.DataFrame:
    df=pd.DataFrame(students)
    df.dropna(subset=['name'], inplace=True)
    return df
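
Here is a sketch of dropna on the data from Example 1. The subset argument limits the missing-value check to the name column, so a missing value elsewhere would not remove a row. The variable name cleaned is only for illustration.

```python
import pandas as pd

# Input table from Example 1; None marks the missing name.
students = pd.DataFrame({
    "student_id": [32, 217, 779, 849],
    "name": ["Piper", None, "Georgia", "Willow"],
    "age": [5, 19, 20, 14],
})

# Drop only the rows whose `name` is missing.
cleaned = students.dropna(subset=["name"])
print(cleaned["student_id"].tolist())  # [32, 779, 849]
```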

 

2885. Rename Columns

DataFrame students

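+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| id          | int    |
| first       | object |
| last        | object |
| age         | int    |
+-------------+--------+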

Write a solution to rename the columns as follows:

id to student_id
first to first_name
last to last_name
age to age_in_years
The result format is in the following example.

Example 1:
Input:

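+----+---------+----------+-----+
| id | first   | last     | age |
+----+---------+----------+-----+
| 1  | Mason   | King     | 6   |
| 2  | Ava     | Wright   | 7   |
| 3  | Taylor  | Hall     | 16  |
| 4  | Georgia | Thompson | 18  |
| 5  | Thomas  | Moore    | 10  |
+----+---------+----------+-----+

Output:

+------------+------------+-----------+--------------+
| student_id | first_name | last_name | age_in_years |
+------------+------------+-----------+--------------+
| 1          | Mason      | King      | 6            |
| 2          | Ava        | Wright    | 7            |
| 3          | Taylor     | Hall      | 16           |
| 4          | Georgia    | Thompson  | 18           |
| 5          | Thomas     | Moore     | 10           |
+------------+------------+-----------+--------------+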

Explanation:
The column names are changed accordingly.

 

The answer is:

import pandas as pd
def renameColumns(students: pd.DataFrame) -> pd.DataFrame:
    df = pd.DataFrame(students)
    df.rename(columns={'id':'student_id', 'first':'first_name', 'last':'last_name', 'age':'age_in_years'}, inplace=True)
    return df
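
A small sketch of the same rename without inplace=True, on the first two rows of Example 1. COLUMN_MAP and renamed are names made up for this example; the mapping itself comes from the problem statement.

```python
import pandas as pd

# First two rows of the Example 1 input.
students = pd.DataFrame({
    "id": [1, 2],
    "first": ["Mason", "Ava"],
    "last": ["King", "Wright"],
    "age": [6, 7],
})

# Old name -> new name, as listed in the problem.
COLUMN_MAP = {
    "id": "student_id",
    "first": "first_name",
    "last": "last_name",
    "age": "age_in_years",
}

# rename returns a new DataFrame; the original keeps its column names.
renamed = students.rename(columns=COLUMN_MAP)
print(list(renamed.columns))
# ['student_id', 'first_name', 'last_name', 'age_in_years']
```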

 

2886. Change Data Type

DataFrame students

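+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
| grade       | float  |
+-------------+--------+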

Write a solution to correct the errors:

The grade column is stored as floats; convert it to integers.

The result format is in the following example.

Example 1:
Input:
DataFrame students:

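+------------+------+-----+-------+
| student_id | name | age | grade |
+------------+------+-----+-------+
| 1          | Ava  | 6   | 73.0  |
| 2          | Kate | 15  | 87.0  |
+------------+------+-----+-------+

Output:

+------------+------+-----+-------+
| student_id | name | age | grade |
+------------+------+-----+-------+
| 1          | Ava  | 6   | 73    |
| 2          | Kate | 15  | 87    |
+------------+------+-----+-------+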

Explanation:
The data type of the grade column is converted to int.

 

The answer is:

import pandas as pd
def changeDatatype(students: pd.DataFrame) -> pd.DataFrame:
    students['grade']=students['grade'].astype(int)
    return students
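
The conversion can be checked on Example 1 with the sketch below. It passes a dict to DataFrame.astype so that only the grade column changes; converted is a name chosen for this example.

```python
import pandas as pd

# Input table from Example 1; grade is stored as float.
students = pd.DataFrame({
    "student_id": [1, 2],
    "name": ["Ava", "Kate"],
    "age": [6, 15],
    "grade": [73.0, 87.0],
})

# Convert only the `grade` column; other columns keep their dtypes.
converted = students.astype({"grade": int})
print(converted["grade"].tolist())  # [73, 87]
```

One thing to keep in mind: astype(int) raises an error if the column still contains missing values, so those need to be filled or dropped first.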

 

2887. Fill Missing Data

DataFrame products

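+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| name        | object |
| quantity    | int    |
| price       | int    |
+-------------+--------+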

Write a solution to fill in the missing value as 0 in the quantity column.

The result format is in the following example.

Example 1:
Input:

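+-----------------+----------+-------+
| name            | quantity | price |
+-----------------+----------+-------+
| Wristwatch      | None     | 135   |
| WirelessEarbuds | None     | 821   |
| GolfClubs       | 779      | 9319  |
| Printer         | 849      | 3051  |
+-----------------+----------+-------+

Output:

+-----------------+----------+-------+
| name            | quantity | price |
+-----------------+----------+-------+
| Wristwatch      | 0        | 135   |
| WirelessEarbuds | 0        | 821   |
| GolfClubs       | 779      | 9319  |
| Printer         | 849      | 3051  |
+-----------------+----------+-------+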

Explanation:
The quantities for Wristwatch and WirelessEarbuds are filled with 0.

 

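The answer is:

import pandas as pd

def fillMissingValues(products: pd.DataFrame) -> pd.DataFrame:
    # Assign the result back; fillna(inplace=True) on a selected column
    # may not update the DataFrame in newer pandas versions.
    products['quantity'] = products['quantity'].fillna(0)
    return products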
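
As an alternative sketch, DataFrame.fillna also accepts a dict that maps column names to fill values, which avoids touching a single selected column in place. The data is from Example 1; filled is a name chosen for this example.

```python
import pandas as pd

# Input table from Example 1; None marks the missing quantities.
products = pd.DataFrame({
    "name": ["Wristwatch", "WirelessEarbuds", "GolfClubs", "Printer"],
    "quantity": [None, None, 779, 849],
    "price": [135, 821, 9319, 3051],
})

# Fill missing values in `quantity` only; returns a new DataFrame.
filled = products.fillna({"quantity": 0})
print(filled["quantity"].tolist())  # [0.0, 0.0, 779.0, 849.0]
```

In this sketch the column is float because it was built with missing values in it, so the filled values print as 0.0; chaining .astype(int) afterwards would turn them into integers.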

 

2889. Reshape Data: Pivot

DataFrame weather

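+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| city        | object |
| month       | object |
| temperature | int    |
+-------------+--------+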

Write a solution to pivot the data so that each row represents temperatures for a specific month, and each city is a separate column.

The result format is in the following example.

Example 1:
Input:

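+--------------+----------+-------------+
| city         | month    | temperature |
+--------------+----------+-------------+
| Jacksonville | January  | 13          |
| Jacksonville | February | 23          |
| Jacksonville | March    | 38          |
| Jacksonville | April    | 5           |
| Jacksonville | May      | 34          |
| ElPaso       | January  | 20          |
| ElPaso       | February | 6           |
| ElPaso       | March    | 26          |
| ElPaso       | April    | 2           |
| ElPaso       | May      | 43          |
+--------------+----------+-------------+

Output:

+----------+--------+--------------+
| month    | ElPaso | Jacksonville |
+----------+--------+--------------+
| April    | 2      | 5            |
| February | 6      | 23           |
| January  | 20     | 13           |
| March    | 26     | 38           |
| May      | 43     | 34           |
+----------+--------+--------------+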

Explanation:
The table is pivoted, each column represents a city, and each row represents a specific month.

 

The answer is:

import pandas as pd
def pivotTable(weather: pd.DataFrame) -> pd.DataFrame:
    return weather.pivot(index='month', columns='city', values='temperature')
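
The sketch below rebuilds the Example 1 input and pivots it, so individual cells can be looked up by month and city. pivoted is a name chosen for this example.

```python
import pandas as pd

months = ["January", "February", "March", "April", "May"]

# Input table from Example 1.
weather = pd.DataFrame({
    "city": ["Jacksonville"] * 5 + ["ElPaso"] * 5,
    "month": months * 2,
    "temperature": [13, 23, 38, 5, 34, 20, 6, 26, 2, 43],
})

# index -> row labels, columns -> new column headers, values -> cell contents.
pivoted = weather.pivot(index="month", columns="city", values="temperature")

print(pivoted.loc["April", "ElPaso"])        # 2
print(pivoted.loc["April", "Jacksonville"])  # 5
```

pivot expects each (month, city) pair to appear only once; if the data had repeated pairs, pivot_table with an aggregation function would be the usual alternative.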

 

2890. Reshape Data: Melt

DataFrame report

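+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| product     | object |
| quarter_1   | int    |
| quarter_2   | int    |
| quarter_3   | int    |
| quarter_4   | int    |
+-------------+--------+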

Write a solution to reshape the data so that each row represents sales data for a product in a specific quarter.

The result format is in the following example.

Example 1:

Input:

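+-------------+-----------+-----------+-----------+-----------+
| product     | quarter_1 | quarter_2 | quarter_3 | quarter_4 |
+-------------+-----------+-----------+-----------+-----------+
| Umbrella    | 417       | 224       | 379       | 611       |
| SleepingBag | 800       | 936       | 93        | 875       |
+-------------+-----------+-----------+-----------+-----------+

Output:

+-------------+-----------+-------+
| product     | quarter   | sales |
+-------------+-----------+-------+
| Umbrella    | quarter_1 | 417   |
| SleepingBag | quarter_1 | 800   |
| Umbrella    | quarter_2 | 224   |
| SleepingBag | quarter_2 | 936   |
| Umbrella    | quarter_3 | 379   |
| SleepingBag | quarter_3 | 93    |
| Umbrella    | quarter_4 | 611   |
| SleepingBag | quarter_4 | 875   |
+-------------+-----------+-------+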

Explanation:
The DataFrame is reshaped from wide to long format. Each row represents the sales of a product in a quarter.

 

The answer is:

import pandas as pd
def meltTable(report: pd.DataFrame) -> pd.DataFrame:
    return pd.melt(report, id_vars=['product'], var_name='quarter', value_name='sales')
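
Here is a sketch of the same reshape on the Example 1 data, written with the DataFrame.melt method form. long_df is a name chosen for this example.

```python
import pandas as pd

# Input table from Example 1 (wide format).
report = pd.DataFrame({
    "product": ["Umbrella", "SleepingBag"],
    "quarter_1": [417, 800],
    "quarter_2": [224, 936],
    "quarter_3": [379, 93],
    "quarter_4": [611, 875],
})

# id_vars stay as they are; every other column becomes a (quarter, sales) pair.
long_df = report.melt(id_vars=["product"], var_name="quarter", value_name="sales")

print(long_df.shape)  # (8, 3)
```

Since value_vars is not given, all columns other than product are unpivoted, which is why the two products times four quarters give eight rows.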

 

2891. Method Chaining

DataFrame animals

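+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| name        | object |
| species     | object |
| age         | int    |
| weight      | int    |
+-------------+--------+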

Write a solution to list the names of animals that weigh strictly more than 100 kilograms.

Return the animals sorted by weight in descending order.

The result format is in the following example.

Example 1:

Input:
DataFrame animals:

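+----------+---------+-----+--------+
| name     | species | age | weight |
+----------+---------+-----+--------+
| Tatiana  | Snake   | 98  | 464    |
| Khaled   | Giraffe | 50  | 41     |
| Alex     | Leopard | 6   | 328    |
| Jonathan | Monkey  | 45  | 463    |
| Stefan   | Bear    | 100 | 50     |
| Tommy    | Panda   | 26  | 349    |
+----------+---------+-----+--------+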

Output:

name
Tatiana
Jonathan
Tommy
Alex

Explanation:
All animals weighing more than 100 should be included in the results table.
Tatiana's weight is 464, Jonathan's weight is 463, Tommy's weight is 349, and Alex's weight is 328.
The results should be sorted in descending order of weight.

In Pandas, method chaining enables us to perform operations on a DataFrame without breaking up each operation into a separate line or creating multiple temporary variables.

Can you complete this task in just one line of code using method chaining?

 

The answer is:

import pandas as pd
def findHeavyAnimals(animals: pd.DataFrame) -> pd.DataFrame:
    return animals[animals['weight']>100].sort_values('weight', ascending=False)[['name']]
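
For readability, the same chain can also be written across several lines inside parentheses. This is only a sketch of one possible layout, using the Example 1 data; the lambda inside .loc is a swapped-in way to filter mid-chain, and heavy is a name chosen for this example.

```python
import pandas as pd

# Input table from Example 1.
animals = pd.DataFrame({
    "name": ["Tatiana", "Khaled", "Alex", "Jonathan", "Stefan", "Tommy"],
    "species": ["Snake", "Giraffe", "Leopard", "Monkey", "Bear", "Panda"],
    "age": [98, 50, 6, 45, 100, 26],
    "weight": [464, 41, 328, 463, 50, 349],
})

heavy = (
    animals
    .loc[lambda d: d["weight"] > 100]        # keep animals heavier than 100
    .sort_values("weight", ascending=False)  # heaviest first
    .loc[:, ["name"]]                        # keep only the name column
)

print(heavy["name"].tolist())  # ['Tatiana', 'Jonathan', 'Tommy', 'Alex']
```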