iterrows and enumerate look similar, so when should I use which?

소리 · October 18, 2023

🤔 Both hand back an index together with a value, so they felt very similar and I couldn't tell them apart while writing code. Let's sort it out today!

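🔎 An iterator is an object that returns the elements of an iterable one at a time, in order. A lazy iterator doesn't load every element at once; it produces each element only when it is asked for, which can save memory.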
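To make the memory point concrete, here is a small sketch of my own (the variable names are made up for illustration). It compares a list, which stores everything up front, with a generator expression, which is an iterator that computes values on demand. Note that `sys.getsizeof` only measures the container object itself, so treat the comparison as a rough indication rather than a precise measurement.

```python
import sys

# A list holds a slot for every element as soon as it is built.
numbers_list = [n * n for n in range(100_000)]

# A generator expression is an iterator: nothing is computed yet.
numbers_lazy = (n * n for n in range(100_000))

list_size = sys.getsizeof(numbers_list)
lazy_size = sys.getsizeof(numbers_lazy)
print(list_size > lazy_size)  # True

# Elements are produced only when requested.
first_three = [next(numbers_lazy) for _ in range(3)]
print(first_three)  # [0, 1, 4]
```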

enumerate

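Takes an iterable and creates an iterator that yields each element together with its index. As a loose analogy to a DataFrame, each element plays the role of one row's data, while the index is simply a counter.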

# enumerate(iterable, start=0)

alphabet = ['A', 'B', 'C']

for idx, val in enumerate(alphabet):
    print(idx, val)


-- Result
0 A
1 B
2 C
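The `start` parameter in the signature is easy to overlook, so here is a tiny sketch of it, reusing the same list. The name `numbered` is just something I picked for the example.

```python
alphabet = ['A', 'B', 'C']

# start=1 only shifts the counter; the elements themselves are unchanged.
numbered = list(enumerate(alphabet, start=1))
print(numbered)  # [(1, 'A'), (2, 'B'), (3, 'C')]
```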

iterrows()

A way to go through a DataFrame row by row with a loop and print each row.

import pandas as pd

dict_1 = {
    'col1': [4, 1, 5, 3, 2],
    'col2': [6, 7, 8, 9, 10],
    'col3': [11, 12, 13, 14, 15],
    'col4': [16, 17, 18, 19, 20]
}

df_1 = pd.DataFrame(dict_1)

for row in df_1.iterrows():
    print(row)



-- Result
(0, col1     4
col2     6
col3    11
col4    16
Name: 0, dtype: int64)

(1, col1     1
col2     7
col3    12
col4    17
Name: 1, dtype: int64)

(2, col1     5
col2     8
col3    13
col4    18
Name: 2, dtype: int64)

(3, col1     3
col2     9
col3    14
col4    19
Name: 3, dtype: int64)

(4, col1     2
col2    10
col3    15
col4    20
Name: 4, dtype: int64)

# each item has the form
(index, row_series)

Example source

As the for loop runs, each item is printed as a tuple:
the first element is the index, and the second is the row of the df that belongs to that index, held as a Series.
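Since each item is an (index, Series) tuple, it can be unpacked right in the for statement, just like with enumerate. The sketch below is my own illustration with a made-up small DataFrame and string index labels; it also shows what I believe is the key difference: `enumerate` on a DataFrame counts over the column names, while `iterrows()` walks the rows and returns the DataFrame's own index labels.

```python
import pandas as pd

# Hypothetical example data with non-numeric index labels.
df_small = pd.DataFrame(
    {'col1': [4, 1, 5], 'col2': [6, 7, 8]},
    index=['x', 'y', 'z'],
)

# iterrows(): unpack each (index label, row Series) tuple.
row_pairs = []
for idx, row in df_small.iterrows():
    row_pairs.append((idx, int(row['col1']), int(row['col2'])))
print(row_pairs)  # [('x', 4, 6), ('y', 1, 7), ('z', 5, 8)]

# enumerate(): iterating a DataFrame directly yields its column names.
col_pairs = list(enumerate(df_small))
print(col_pairs)  # [(0, 'col1'), (1, 'col2')]
```

So a rough rule of thumb: reach for `enumerate` when you want a running counter over any iterable, and for `iterrows()` when you want a DataFrame's rows along with their index labels.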

zip()

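Takes several iterables, pulls one element at a time from each of them, and creates an iterator of tuples that bundle those elements together. (The loops above split a pair into an index and a value; zip works the other way around and packs separate values into one tuple.)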

num = [1, 2, 3, 4, 5]
alp = ['a', 'b', 'c', 'd', 'e']

for item in zip(num, alp):
    print(item)

-- Result
(1, 'a')
(2, 'b')
(3, 'c')
(4, 'd')
(5, 'e')
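The tuples from zip can also be unpacked in the loop header, or passed straight to `dict()`. A quick sketch with the same two lists; `labels` and `lookup` are names I made up for the example.

```python
num = [1, 2, 3, 4, 5]
alp = ['a', 'b', 'c', 'd', 'e']

# Unpack each tuple into two names while looping.
labels = [f"{n}{a}" for n, a in zip(num, alp)]
print(labels)  # ['1a', '2b', '3c', '4d', '5e']

# dict() accepts an iterable of (key, value) pairs, which is what zip produces.
lookup = dict(zip(alp, num))
print(lookup)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
```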

next()

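Returns the next element from an iterator. When nothing is left, it raises a StopIteration exception (it does not return None).
If you'd rather get a fallback value back, pass it as the second argument: next(iterator, 'some string').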
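A minimal sketch of both behaviors side by side. The fallback string and the variable names are arbitrary choices for this example.

```python
data = iter(['x', 'y'])

first = next(data, 'no more items')   # 'x'
second = next(data, 'no more items')  # 'y'
third = next(data, 'no more items')   # iterator is exhausted, so the default comes back
print(first, second, third)

# Without a default, an exhausted iterator raises StopIteration.
exhausted = False
try:
    next(data)
except StopIteration:
    exhausted = True
print(exhausted)  # True
```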

iter()

Turns an iterable into an iterator object.

data = iter(range(1,5))

print(next(data))  # 1
print(next(data))  # 2
print(next(data))  # 3
print(next(data))  # 4
print(next(data))  # raises StopIteration
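Putting iter() and next() together, a for loop can be imitated by hand. This is only a simplified sketch of the idea, not how the interpreter is literally implemented, and the names are my own.

```python
letters = ['A', 'B', 'C']

it = iter(letters)  # get an iterator from the iterable
collected = []
while True:
    try:
        item = next(it)   # ask for the next element
    except StopIteration:
        break             # no elements left, so the loop ends
    collected.append(item)

print(collected)  # ['A', 'B', 'C']
```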
