Counter클래스

yebright·2025년 2월 12일

algorithm using Python

목록 보기

3/3

Counter Class

Counter 생성자는 인자로 iterable 혹은 매핑(mapping)형 객체를 받는다.
내부적으로, 해시 테이블 구조로 되어있는데, 각 요소와 개수를 key-value 쌍으로 가진다.
중복된 요소가 저장된 배열을 인자로 넘기면 각 요소가 나온 횟수가 저장된 객체를 얻게 된다.

>>> from collections import Counter
>>> Counter(["hi", "hey", "hi", "hi", "hello", "hey"])

Counter({'hi': 3, 'hey': 2, 'hello': 1})

Counter 생성자에 문자열을 인자로 넘기면 각 문자가 문자열에서 몇 번씩 나타나는지를 알려주는 객체가 반환된다.

>>> Counter("hello world")
Counter({'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1})

사전처럼 사용가능
Counter를 사전처럼 사용할 수도 있다.
collections 모듈의 Counter 클래스는 파이썬의 기본 자료구조인 dictionary를 확장하고 있기 때문에, dictionary에서 제공하는 API를 그대로 사용할 수가 있다.

dictionary.items()
dictionary.keys()dictionary.values()
dictionary.values()
if key in dictionary:
	print(True)

등등

dictionary에서 in 연산자는 key가 존재하는지 확인해준다.
만약 key 가 존재하면 True를 반환하고 존재하지 않으면 False를 반환한다.

다른 iterable 객체들과 다르게 dictionary 자료형에 in 연산자를 사용하면 O(1)의 시간복잡도를 가지기 때문에 탐색에 매우 효율적이다. (hash table 이용)

그 외에 다음과 같은 연산도 가능하다.

>>> counter = Counter("hello world")
>>> counter["o"], counter["l"]
(2, 3)

특정 키에 해당하는 값을 갱신 가능.

>>> counter["l"] += 1
>>> counter["h"] -= 1
>>> counter
Counter({'h': 0, 'e': 1, 'l': 4, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1})

>>> if "o" in counter:
    	print("o in counter")
>>> del counter["o"]
>>> if "o" not in counter:
    	print("o not in counter")

o in counter
o not in counter

most_common()
갯수를 센 후, 가장 많이 등장한 element를 뽑을 땐 most_common 메소드를 사용하면 된다.

>>> print(Counter('hello world').most_common())

[('l', 3), ('o', 2), ('h', 1), ('e', 1), (' ', 1), ('w', 1), ('r', 1), ('d', 1)]

이 메서드의 인자로 숫자 k를 넘기면 그 숫자 만큼만 리턴하기 때문에, 가장 빈도수가 높은 k개의 element들을 넘겨준다. 예를 들어 k=2를 넘기면,

>>> print(Counter('hello world').most_common(2))
[('l', 3), ('o', 2)]

그래서 Leetcode의 Top K frequent Elements문제는

from collections import Counter
class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        x = Counter(nums)
        listed=x.most_common(k)
        ans=[]
        #if listed is printed:
        #    output=[(index1,frequency1),(index2,frequency2)]
        for l in listed:
            ans.append(l[0])
        return ans

이렇게 간단한 답이 나올수가 있다.근데 most_common함수의 time complexity가 O(nlogn)이라고 하니, O(n)방법은 없나 찾아봐야겠다.

산술 연산자 활용
Counter는 마치 숫자처럼 산술 연산자를 사용할 수 있다.

예를 들어, 아래와 같이 2개의 카운터 객체가 있을 때,

counter1 = Counter(["A", "A", "B"])
counter2 = Counter(["A", "B", "B"])

이 두 객체를 더할 수도 있고,

>>> counter1 + counter2
Counter({'A': 3, 'B': 3})

이 두 객체를 뺄 수도 있다.

>>> counter1 - counter2
Counter({'A': 1})

defaultdict도 아래 링크 참조
https://velog.io/@mjieun/Algorithm-%ED%95%B4%EC%8B%9C%ED%85%8C%EC%9D%B4%EB%B8%94Hash-Table-Python

yebright

조합론, 강화학습 좋아하는 개발자

이전 포스트

Counter클래스

algorithm using Python

Map 함수- built-in Python

0개의 댓글