[프로그래머스] 순위검색 - python

SUN·2022년 9월 21일

algorithm

목록 보기

15/30

오늘 풀어볼 문제는 순위검색이다.

풀이 과정

처음 이 문제를 보고 생각 한 건 해시 문제인가..? 였다.
그래서 해시에

hash["java"] = [1,2]  # hash["조건"] = info에서 조건 만족하는 사람의 인덱스들
hash["senior"] = [2,3] 
hash["cpp"] = [3,4]

이런 식으로 저장하고 예를들어서 java, senior 찾으라고 하면
어떤 리스트(lst)에다가 그 조건을 만족하는 사람들을 쭉 넣는다.
즉 hash["java"]랑 hash["senior"] 값을 lst에 쭉 넣어둔다.
그럼 lst = [1,2,2,3]가 될 거고 여기서 각 사람들이 몇 번 등장했는 지 세준다.
그리고 조건 개수대로 등장한 사람이 몇명인 지 세주면 된다.
즉 2번 사람은 2번 1번 사람은 1번 3번 사람은 1번 나왔으니까 1명이 조건을 만족한다.

이 알고리즘 대로 짜보았다.

실패 코드 1

from collections import Counter

def solution(info, query):
    answer = []

    info_hash = {}
    scores = {}

    for member, info_string in enumerate(info):
        for idx, info_word in enumerate(info_string.split()):
            if idx == 4:
                scores[member] = int(info_word)
                continue

            if info_hash.get(info_word):
                info_hash[info_word].append(member)

            else:
                info_hash[info_word] = [member]

    scores = sorted(scores.items(), key=lambda x: x[1], reverse=True)

    for q in query:
        splited_q = q.replace(" and", "").split()
        min_score = int(splited_q.pop())
        answer_count = 5

        matched = []
        for member, score in scores:
            if score < min_score:
                break
            matched.append(member)

        for q_elem in splited_q:
            if q_elem == "-":
                answer_count -= 1
                continue

            matched.extend(info_hash.get(q_elem, []))

        matched_counts = Counter(matched)

        temp_count = 0
        for member, matched_count in matched_counts.items():
            if matched_count == answer_count:
                temp_count += 1

        answer.append(temp_count)

    return answer

결과는 정확성만 통과 효율성 모조리 실패...

아무래도 전체 세주는 부분에서 시간이 많이 걸리는 것 같아서
lst를 set으로 바꾸고 겹치는 사람들만 계속 lst에 넣어주는 방식으로 변경해보았다.

실패 코드2

from collections import Counter

def solution(info, query):
    answer = []

    info_hash = {}
    scores = {}

    for member, info_string in enumerate(info):
        for idx, info_word in enumerate(info_string.split()):
            if idx == 4:
                scores[member] = int(info_word)
                continue

            if info_hash.get(info_word):
                info_hash[info_word].add(member)

            else:
                info_hash[info_word] = {member}

    scores = sorted(scores.items(), key=lambda x: x[1], reverse=True)

    for q in query:
        splited_q = q.replace(" and", "").split()
        min_score = int(splited_q.pop())
        answer_count = 5

        matched = set()
        
        for member, score in scores:
            if score < min_score:
                break
            matched.add(member)

        for q_elem in splited_q:
            if q_elem == "-":
                answer_count -= 1
                continue
            
            # 둘의 교집합 계산해서 넣기
            matched = matched & info_hash.get(q_elem, [])
            
        answer.append(len(matched))

    return answer

이것도 효율성 실패

나는 도저히 모르겠다 하고 다른 사람의 풀이를 봤다.
해시문제긴 한데, 나처럼 쓰는 게 아니라
나올 수 있는 모든 조합을 만들고 이분탐색으로 찾는 식으로 푸는 거였다.

해시 문제다 라는 것까지는 접근했다는 것에... 위안을 삼는다

다른 사람의 풀이

조건들의 조합을 통해서 나올 수 있는 모든 조합을 구한다.
1번에서 구한 각 조건 조합을 key로하고 그 조건 조합의 점수 리스트를 value로 갖는 해시맵을 만든다.
2의 해시맵을 정렬한다.
각 쿼리들을 해시맵에 key로 검색한다.
4번 결과의 value(해당 조건을 만족하는 사람들의 점수 리스트)에서 쿼리에 있는 점수로 이분 탐색(lower bound)을 해서 인덱스를 찾는다.
5에서 찾은 그 인덱스 이상인 점수들의 개수가 조건을 만족하는 사람들의 수가 된다.

참고로 lower bound 알고리즘은 이분탐색에서 파생된 알고리즘으로
특정 값이상의 값이 처음 나오는 위치를 찾는 알고리즘이라고 한다.

이분탐색 구현 풀이

from collections import defaultdict
from itertools import combinations

def solution(info, query):
    answer = []

    info_hash = defaultdict(list)

    for i in info:
        splited_info = i.split()
        score = splited_info.pop()
        for r in range(5):
            combs = combinations(range(4), r)

            for comb in combs:
                key = splited_info[:]
                for elem in comb:
                    key[elem] = "-"

                info_hash[" ".join(key)].append(int(score))

    for item in info_hash:
        info_hash[item].sort()

    for q in query:
        splited_q = q.replace(" and", "").split()
        target_score = int(splited_q.pop())

        target_key = " ".join(splited_q)

        matched_score_list = info_hash[target_key]
		
        if not matched_score_list:
            answer.append(0)
            continue

        score_len = len(matched_score_list)
		
        #upper bound 알고리즘
        start = 0
        end = score_len
        mid = (start + end) // 2

        while start < end:
            mid = (start + end) // 2

            mid_score = matched_score_list[mid]

            if mid_score < target_score:
                start = mid + 1

            else:
                end = mid

        answer.append(score_len - start)

    return answer

결과는 모두 통과

참고로 밑에 upper_bound 알고리즘은 아래처럼 라이브러리 사용해서 간단히 바꿀 수 있다.

라이브러리 사용한 풀이

from collections import defaultdict
from bisect import bisect_left
from itertools import combinations

def solution(info, query):
    answer = []

    info_hash = defaultdict(list)

    for i in info:
        splited_info = i.split()
        score = splited_info.pop()
        for r in range(5):
            combs = combinations(range(4), r)

            for comb in combs:
                key = splited_info[:]
                for elem in comb:
                    key[elem] = "-"

                info_hash[" ".join(key)].append(int(score))

    for item in info_hash:
        info_hash[item].sort()

    for q in query:
        splited_q = q.replace(" and", "").split()
        target_score = int(splited_q.pop())

        target_key = " ".join(splited_q)

        matched_score_list = info_hash[target_key]

        if not matched_score_list:
            answer.append(0)
            continue

        score_len = len(matched_score_list)
		
        # upper bound 알고리즘
        start = bisect_left(matched_score_list, target_score)
        
        answer.append(score_len - start)

    return answer