[Kaggle] Courses - Pandas (3)

HO94 · July 1, 2021

Kaggle


3. Summary Functions and Maps

  • summary functions
    - describe( )
    - mean( )
    - unique( )
    - value_counts( )
  • map
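
As a quick illustration of the functions listed above, here is a minimal sketch on a small made-up DataFrame. The `toy` data and its values are hypothetical and only stand in for the course's wine reviews dataset.

```python
import pandas as pd

# Hypothetical toy data, not the actual wine reviews dataset
toy = pd.DataFrame({
    "country": ["Italy", "France", "Italy", "US"],
    "points": [87, 90, 85, 90],
})

summary = toy.points.describe()      # count, mean, std, min, quartiles, max
mean_points = toy.points.mean()      # arithmetic mean of the column
countries = toy.country.unique()     # distinct values, in order of appearance
counts = toy.country.value_counts()  # occurrences per value, most frequent first

# map applies a function to every element and returns a new Series
shifted = toy.points.map(lambda p: p - mean_points)
```

Note that `map` returns a new Series and leaves the original column unchanged.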

Question 1

What is the median of the points column in the reviews DataFrame?

median_points = reviews.points.median()

Question 2

What countries are represented in the dataset? (Your answer should not include any duplicates.)

countries = reviews.country.unique()

Question 3

How often does each country appear in the dataset? Create a Series reviews_per_country mapping countries to the count of reviews of wines from that country.

reviews_per_country = reviews.country.value_counts()

Question 4

Create variable centered_price containing a version of the price column with the mean price subtracted.

centered_price = reviews.price - reviews.price.mean()
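
The subtraction above works because pandas applies the scalar mean to every element of the column. A small sketch with hypothetical prices (one of them missing) shows the behavior; it assumes the default `skipna=True` of `mean()`.

```python
import pandas as pd

# Hypothetical prices; None becomes NaN, like a review with no listed price
prices = pd.Series([10.0, 20.0, None, 30.0])

# mean() skips NaN by default, so the mean here is 20.0
centered = prices - prices.mean()

# The missing entry stays NaN after the subtraction
n_missing = centered.isna().sum()
```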

Question 5

I'm an economical wine buyer. Which wine is the "best bargain"? Create a variable bargain_wine with the title of the wine with the highest points-to-price ratio in the dataset.

bargain_idx = (reviews.points / reviews.price).idxmax()
bargain_wine = reviews.loc[bargain_idx, "title"]
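
One detail worth spelling out: `idxmax()` returns the index label of the largest value, not its position, which is why the lookup uses `loc`. A minimal sketch with a hypothetical three-row table and a non-default index:

```python
import pandas as pd

# Hypothetical data with index labels 10, 11, 12 instead of 0, 1, 2
toy = pd.DataFrame(
    {
        "title": ["Wine A", "Wine B", "Wine C"],
        "points": [90, 85, 88],
        "price": [45.0, 10.0, 22.0],
    },
    index=[10, 11, 12],
)

ratio = toy.points / toy.price             # element-wise division
best_label = ratio.idxmax()                # index label of the highest ratio
best_title = toy.loc[best_label, "title"]  # label-based lookup
```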

Question 6

There are only so many words you can use when describing a bottle of wine. Is a wine more likely to be "tropical" or "fruity"? Create a Series descriptor_counts counting how many times each of these two words appears in the description column in the dataset.

n_trop = reviews.description.map(lambda desc: "tropical" in desc).sum()
n_fruity = reviews.description.map(lambda desc: "fruity" in desc).sum()
descriptor_counts = pd.Series([n_trop, n_fruity], index=['tropical', 'fruity'])
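
The counting above relies on two things: `"word" in desc` is a plain substring test that yields True or False, and summing booleans counts the True values. A small sketch on hypothetical descriptions:

```python
import pandas as pd

# Hypothetical descriptions, not taken from the dataset
descs = pd.Series([
    "tropical fruit aromas",
    "fruity and fresh",
    "fruity finish",
    "fruity with tropical notes",
])

# Each map produces a boolean Series; sum() counts the True entries
n_trop = descs.map(lambda d: "tropical" in d).sum()
n_fruity = descs.map(lambda d: "fruity" in d).sum()

counts = pd.Series([n_trop, n_fruity], index=["tropical", "fruity"])
```

Because this is a substring test, a description counts once per word no matter how many times the word occurs in it, and "fruit" alone does not match "fruity".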
