[노씨데브 킬링캠프] 1주차 - 문제풀이: ★후보키★

KissNode·2024년 1월 8일

노씨데브 킬링캠프

목록 보기

13/73

★후보키★(100/100)

★다시 풀어봐야할 문제★ (3차시도만에 통과. 다음에는 비트연산자, 시프트연산자, Set 을 활용해서 다시 풀어보기)

접근 방법 [필수 작성]

제한 조건 확인

모든 튜플은 유일한 값 → 중복에 대한 우려 x
최대 20*8 의 table

아이디어

릴레이션의 속성을 조합한 후, 각 튜플들이 유일하게 구분될 수 있는지 파악한다.

속성의 조합 가능한 갯수는 2^8 (조합에 포함 될지,포함 안될지) 이며,

각 조합에 대해서 각각의 모든 튜플들이 서로 상이한지 확인해야한다. 20C2 (20명 중 2명을 뽑는 경우의 수)

따라서 완전탐색시 전체 연산 수는 2^8 20C2 8 8= 25610198*8 = 3,112,960 < 1억

즉, 완전탐색을 적용하여 문제를 접근할 수 있다.

재귀함수를 통해 특정 속성을 포함할지 말지 결정하여 subset 함수를 만든다.
해당 subset 함수를 활용하여 모든 튜플들이 고유하게 식별가능한지 확인한다.
1. 고유하게 식별가능할 경우, 후보키 결과리스트에 append하고, 이어서 하지 않고 다른 경우를 바로 탐색한다.
2. 식별불가능할 경우, 계속 진행한다.
후보키 결과리스트에서 포함관계를 가지는 조합 중 크기가 더 큰 조합을 제거한다.
최종적으로 후보키 결과리스트의 갯수를 counting 하여 return 한다.

코드 구현 [필수 작성]

첫번째 시도

# 2시 시작 -> 3시12분 종료(중도포기)

def solution(relation):
    answer = 0
    possible_relation = []
    
    row_size = len(relation)
    col_size = len(relation[0])

    curr_col = [False for _ in range(col_size)]
    
    def make_new_relation(col_info):
        nonlocal relation
        new_relation = []
        for row in range(row_size):
            new_row = []
            for col in range(col_size):
                if col_info[col]:
                    new_row.append(relation[row][col])
            new_relation.append(new_row)
        return new_relation
        
    def check_unique_relation(relation):
        for case_1_index in range(len(relation)):
            for case_2_index in range(case_1_index + 1,len(relation)):
                if relation[case_1_index] == relation[case_2_index]:
                    return False
        return True
    
    def check_contain(big_list, small_list):
        for elem in small_list:
            if elem not in big_list:
                return False
        return True
    
    def backtrack(start,col_info):
        nonlocal possible_relation
        
        new_relation = make_new_relation(col_info)
        #print(new_relation,"\n")
        #print(check_unique_relation(new_relation),"\n")
        if check_unique_relation(new_relation):
            possible_relation.append(new_relation)
            return
        for i in range(start,len(col_info)):
            curr_col[i] = True
            backtrack(i+1,curr_col)
            curr_col[i] = False
    
    def select_minimality(candidate_relations):
        pop_answer = []
        for case_1_index in range(len(candidate_relations)):
            for case_2_index in range(case_1_index + 1,len(candidate_relations)):
                if check_contain(candidate_relations[case_1_index],candidate_relations[case_2_index]): 
                    #만약 case_1 릴레이션이 case_2 릴레이션을 포함하면 -> case_1 > case_2
                    pop_answer.append(case_1_index)
                elif check_contain(candidate_relations[case_2_index],candidate_relations[case_1_index]): 
                    #만약 case_2 릴레이션이 case_1 릴레이션을 포함하면 -> case_2 > case_1
                    pop_answer.append(case_2_index)
        return pop_answer

    backtrack(0,curr_col)
    
    # for i in range(len(possible_relation)):
    #     print(possible_relation[i],"\n")

    pop_index = select_minimality(possible_relation)
    
    pop_index.sort()
    
    for i in range(len(pop_index)-1,-1,-1):
        possible_relation.pop(pop_index[i])
    # for i in range(len(possible_relation)):
    #     print(possible_relation[i],"\n")
    
    answer = len(possible_relation)
    
    # a = ['ryan', 'music']
    # b = ['ryan', 'music', '3']
    # test_relation = [a,b]
    # print(test_relation)
    # print(select_minimality(test_relation))
    # print(test_relation)
    
#     print(check_contain(a,b))
#     print(check_contain(b,a))
    
    return answer

corner case:

최소성 보장 문제

relation: [["a", "1"], ["a", "2"], ["b", "3"]]
return: 1

→ 코드가 relation 의 원소 포함 유무를 서로 비교해야하는데, 비교대상 페어링이 잘못되었음

    def check_contain(big_list, small_list):
        for elem in small_list:
            if elem not in big_list:
                return False
        return True

ex) ['1'] ↔ [['a', '1'], ['a', '2'], ['b', '3']]

하나는 relation 의 원소이고, 하나는 relation 자체를 비교하고 있어서 최소성 검사가 의도대로 이루어지지 않았음.

relation: [["kim", "1", "1"], ["park", "2", "2"], ["choi", "3", "3"]]
return: 3

    def select_minimality(candidate_relations):
        pop_answer = []
        for case_1_index in range(len(candidate_relations)):
            for case_2_index in range(case_1_index + 1,len(candidate_relations)):
                if check_contain(candidate_relations[case_1_index],candidate_relations[case_2_index]): 
                    #만약 case_1 릴레이션이 case_2 릴레이션을 포함하면 -> case_1 > case_2
                    pop_answer.append(case_1_index)
                elif check_contain(candidate_relations[case_2_index],candidate_relations[case_1_index]): 
                    #만약 case_2 릴레이션이 case_1 릴레이션을 포함하면 -> case_2 > case_1
                    pop_answer.append(case_2_index)
        return pop_answer

→ 서로 다른 속성에 대해서 실제 값이 같을 때, 최소성 검사에서 서로가 서로를 포함하기 때문에, 다른 속성에 대한 값임에도 불구하고 속성 조합 case를 지워버리는 문제가 발생.

→ 따라서, 값을 비교하는게 아닌, 속성의 조합을 저장한 후, 속성의 조합의 포함관계를 파악하는 식으로 코드를 바꿔야함.

두번째 시도

속성의 조합을 저장한 후, 속성의 조합의 포함관계를 파악하는 식으로 코드를 바꿔야함 → 비트곱으로 포함관계를 파악 후 최소성을 보장하지 않는 속성 조합을 pop_combo_index 에 넣어줬음

런타임 에러 발생

→ 어디선가 index 초과하는 탐색 있는듯

→ 코드 한번 정리하고 세번째 시도로 넘어감 (버전관리측면)

# 2시 시작 -> 3시12분 종료

def solution(relation):
    answer = 0
    possible_relation = []
    possible_attribute_combo = []
    
    row_size = len(relation)
    col_size = len(relation[0])

    curr_col = [False for _ in range(col_size)]
    
    def make_new_relation(col_info):
        
        new_relation = []
        
        for row in range(row_size):
            new_row = []
            for col in range(col_size):
                if col_info[col]:
                    new_row.append(relation[row][col])
            new_relation.append(new_row)
        return new_relation
        
    def check_unique_relation(relation):
        for case_1_index in range(len(relation)):
            for case_2_index in range(case_1_index + 1,len(relation)):
                if relation[case_1_index] == relation[case_2_index]:
                    return False
        return True
    
    def check_contain(big_list, small_list):
        print("==============================")
        print("big_list=",big_list)
        print("small_list=",small_list)
        for elem in small_list[0]:
            print(elem)
            if elem not in big_list[0]:
                return False
        return True
    
    def backtrack(start,col_info):
        nonlocal possible_relation
        print("============================")
        print("function call")
        print("col_info=",col_info)
        new_relation = make_new_relation(col_info)
        #print(new_relation,"\n")
        #print(check_unique_relation(new_relation),"\n")
        if check_unique_relation(new_relation):
            possible_relation.append(new_relation)
            possible_attribute_combo.append(col_info[:])
            print("유일성테스트에서 통과하여 return 합니다.")
            return
        for i in range(start,len(col_info)):
            curr_col[i] = True
            print(str(i),"번째 원소를 추가합니다.")
            backtrack(i+1,col_info)
            print(str(i),"번째 원소를 제거합니다.")
            curr_col[i] = False
    
    def select_minimality(candidate_relations):
        pop_answer = []
        for case_1_index in range(len(candidate_relations)):
            for case_2_index in range(case_1_index + 1,len(candidate_relations)):
                print("=================================")
                print("case_1_list=",candidate_relations[case_1_index])
                print("case_2_list=",candidate_relations[case_2_index])
                if check_contain(candidate_relations[case_1_index],candidate_relations[case_2_index]): 
                    #만약 case_1 릴레이션이 case_2 릴레이션을 포함하면 -> case_1 > case_2
                    pop_answer.append(case_1_index)
                elif check_contain(candidate_relations[case_2_index],candidate_relations[case_1_index]): 
                    #만약 case_2 릴레이션이 case_1 릴레이션을 포함하면 -> case_2 > case_1
                    pop_answer.append(case_2_index)
        return pop_answer
    
    def select_min_combo(possible_attribute_combo):
        pop_answer = []
        for case_1_index in range(len(possible_attribute_combo)):
            for case_2_index in range(case_1_index + 1,len(possible_attribute_combo)):
                print("=================================")
                print("case_1_list=",possible_attribute_combo[case_1_index])
                print("case_2_list=",possible_attribute_combo[case_2_index])
                bit_mult = []
                for i in range(col_size):
                    bit_mult.append(possible_attribute_combo[case_1_index][i]*possible_attribute_combo[case_2_index][i])
                if bit_mult == possible_attribute_combo[case_1_index]: 
                    #만약 비트곱이 case_1 속성조합과 같다면, case_1 이 유일성&최소성 만족
                    pop_answer.append(case_2_index)
                elif bit_mult == possible_attribute_combo[case_2_index]:
                    #만약 비트곱이 case_2 속성조합과 같다면, case_2 가 유일성&최소성 만족
                    pop_answer.append(case_1_index)
                else:
                    #만약 비트곱이 어느것과도 일치하지 않으면, 그냥 두개는 아예 서로 다른거
                    pass
        return pop_answer

    backtrack(0,curr_col) 
    #이제 possible_relation 에는 8개의 속성으로 만들 수 있는 유일성을 만족하는 모든 속성 조합이 들어가있다.
    
#     for i in range(len(possible_relation)):
#         print(possible_relation[i],"\n")
    print(possible_relation)
    print(possible_attribute_combo)
    pop_index = select_minimality(possible_relation)
    pop_combo_index = select_min_combo(possible_attribute_combo)
    #pop_index 에는 유일성은 만족하지만 최소성을 만족하지 못하는 속성조합의 possible_relation 에서의 index 가 들어있음
    print(pop_index)
    print(pop_combo_index)
    pop_index.sort()
    pop_combo_index.sort()
    
    for i in range(len(pop_index)-1,-1,-1):
        print(i)
        possible_relation.pop(pop_index[i])
        
    for i in range(len(pop_combo_index)-1,-1,-1):
        print(i)
        possible_attribute_combo.pop(pop_combo_index[i])
    
#     for i in range(len(possible_relation)):
#         print(possible_relation[i],"\n")
    
    #answer = len(possible_relation)
    
    answer = len(possible_attribute_combo)
    # a = ['ryan', 'music']
    # b = ['ryan', 'music', '3']
    # test_relation = [a,b]
    # print(test_relation)
    # print(select_minimality(test_relation))
    # print(test_relation)
    
#     print(check_contain(a,b))
#     print(check_contain(b,a))
    
    return answer

세번째 시도

아래 코드 주석 처리시 런타임에러 사라짐

아래 코드 어디선가 list index out of range 에러가 나는 것으로 추정

for i in range(len(pop_combo_index)-1,-1,-1):
        #print(i)
        possible_attribute_combo.pop(pop_combo_index[i])

오류 케이스:

[["a", "b", "1"], ["a", "b", "2"], ["b", "c", "3"]]

후보키가 될 수 없는 것들끼리 서로 포함관계를 이룰때, 이를 pop_combo_index 에 중복해서 추가하다 보니, 의도치 않은 결과를 낳음 → pop_combo_index 원소는 중복값이 있을 경우 제거하여 1개의 index 에 대해 1번만 pop 이 진행되게 만들어주어야 함.

[최종코드]

# 2시 시작 -> 3시12분 종료

def solution(relation):
    answer = 0
    possible_relation = []
    possible_attribute_combo = []
    
    row_size = len(relation)
    col_size = len(relation[0])

    curr_col = [False for _ in range(col_size)]
    
    def make_new_relation(col_info):
        
        new_relation = []
        
        for row in range(row_size):
            new_row = []
            for col in range(col_size):
                if col_info[col]:
                    new_row.append(relation[row][col])
            new_relation.append(new_row)
        return new_relation
        
    def check_unique_relation(relation):
        for case_1_index in range(len(relation)):
            for case_2_index in range(case_1_index + 1,len(relation)):
                if relation[case_1_index] == relation[case_2_index]:
                    return False
        return True
    
    def backtrack(start,col_info):
        nonlocal possible_relation
        #print("============================")
        #print("function call")
        #print("col_info=",col_info)
        new_relation = make_new_relation(col_info)
        #print(new_relation,"\n")
        #print(check_unique_relation(new_relation),"\n")
        if check_unique_relation(new_relation):
            possible_attribute_combo.append(col_info[:])
            #print("유일성테스트에서 통과하여 return 합니다.")
            return
        for i in range(start,len(col_info)):
            curr_col[i] = True
            #print(str(i),"번째 원소를 추가합니다.")
            backtrack(i+1,col_info)
            #print(str(i),"번째 원소를 제거합니다.")
            curr_col[i] = False
    
    def select_min_combo(possible_attribute_combo):
        pop_answer = []
        for case_1_index in range(len(possible_attribute_combo)):
            for case_2_index in range(case_1_index + 1,len(possible_attribute_combo)):
                #print("=================================")
                #print("case_1_list=",possible_attribute_combo[case_1_index])
                #print("case_2_list=",possible_attribute_combo[case_2_index])
                bit_mult = []
                for i in range(col_size):
                    bit_mult.append(possible_attribute_combo[case_1_index][i]*possible_attribute_combo[case_2_index][i])
                if bit_mult == possible_attribute_combo[case_1_index]: 
                    #만약 비트곱이 case_1 속성조합과 같다면, case_1 이 유일성&최소성 만족
                    pop_answer.append(case_2_index)
                elif bit_mult == possible_attribute_combo[case_2_index]:
                    #만약 비트곱이 case_2 속성조합과 같다면, case_2 가 유일성&최소성 만족
                    pop_answer.append(case_1_index)
                else:
                    #만약 비트곱이 어느것과도 일치하지 않으면, 그냥 두개는 아예 서로 다른거
                    pass
        return pop_answer

    backtrack(0,curr_col) 
    #이제 possible_attribute_combo 에는 8개의 속성으로 만들 수 있는 유일성을 만족하는 모든 속성 조합이 들어가있다.
    
#     for i in range(len(possible_relation)):
#         print(possible_relation[i],"\n")
    #print(possible_attribute_combo)
    pop_combo_index = select_min_combo(possible_attribute_combo)
    #pop_combo_index 에는 유일성은 만족하지만 최소성을 만족하지 못하는 속성조합의 possible_relation 에서의 index 가 들어있음
    #print(pop_combo_index)
    pop_combo_index.sort()
    pop_combo_index = list(set(pop_combo_index))
    print(list(set(pop_combo_index)))
    print(possible_attribute_combo)
        
    for i in range(len(pop_combo_index)-1,-1,-1):
        #print(i)
        possible_attribute_combo.pop(pop_combo_index[i])
    
#     for i in range(len(possible_relation)):
#         print(possible_relation[i],"\n")
    
    #answer = len(possible_relation)
    
    answer = len(possible_attribute_combo)
    # a = ['ryan', 'music']
    # b = ['ryan', 'music', '3']
    # test_relation = [a,b]
    # print(test_relation)
    # print(select_minimality(test_relation))
    # print(test_relation)
    
#     print(check_contain(a,b))
#     print(check_contain(b,a))
    
    return answer

업로드중..

배우게 된 점 [ 필수 X ]

Set 을 활용하면 탐색 시간 복잡도를 대폭 줄일 수 있다.

중복되지 않는 값을 저장할때 Set 을 활용하면 시간복잡도를 대폭 줄일 수 있는 이유

→ set 은 원소탐색에 O(1) 의 시간이 소요됨 → 그 이유는 set 은 원소 저장에 Hash Table 을 사용하기 때문임

해설코드의 비트 연산자와 시프트 연산자 활용법 참고하기

→ 해설코드에서는 내가 True,False 로 구현한 부분을 비트연산자와 시프트연산자로 가독성 좋은 코드로 구현했다. 또한 유일성 검사를 모든 원소들을 서로 일일히 비교하지 않고 set 을 활용해서 중복값이 있냐 없냐로 판단함으로써 시간복잡도를 획기적으로 줄였다.

다음에 풀때는 비트연산자,시프트연산자,set특징을 활용해서 풀어보도록 하자.

질문 [ 필수 X ]

Q1.

코드 가독성 똥망 이슈…

Q2.

select_minimality 함수 구현에서 실제 사용중인 list를 index 단위로 pop 해버리니 런타임 에러가 발생 → for 문 돌다가 pop 된 index 때문에 out of range 에러가 발생하는 것으로 추정

Q3.

for i in range(len(pop_index)-1,-1,-1): → pop을 index 기준 역순으로 하지 않으면 의도한대로 pop 되지 않는 이슈 발생. 그래서 역순으로 pop 해주었으나, range 함수 가독성 똥망 이슈 → 어떻게 개선하면 좋을지

Q4. → 질문 보류 (추가적인 생각 중)

따라서 완전탐색시 전체 연산 수는 2^8 20C2 8 8= 25610198*8 = 3,112,960 < 1억 ← 제 생각은 이런데

강의 타임라인 06:50 에서는 2^8 20 8 * 8 로 설명했더라구요. 모든 튜플을 각각 서로 비교해야되기 때문에 20C2 가 맞는것 같은데 제가 혹시 잘못 생각하고 있는걸까요?

Q5.

if check_unique_relation(new_relation):
            possible_relation.append(new_relation)
            return

가장 적은 col 종류로 조합을 만들어서 유일성이 확인되었을 때, 더 이어서 추가하는 경우를 배제했는데, 왜 추가적으로 최소성 테스트를 해주어야하는가?

Q6. (추가질문)

문제풀이시 코너케이스가 확인되지 않을때, 코너케이스를 생각하는 팁

제가 생각하는 팁은 제한 사항을 참고하여 케이스가 굉장히 minimum 일때와 maximum 일때를 추가하는데, 이를 통해서도 코너케이스를 찾을 수 없을때 코치님은 어떤 방식을 활용하는지 코치님만의 코너케이스 찾기 팁이 있는지 궁금합니다.

피드백 [ 코치 작성 ]

A1

코드의 가독성은 palindrome partitioning에 답변드린 간결한 코드에 관한 내용과 같이 반복적인 숙달밖에 정답이 없습니다. 가독성을 높이기 위해서는 중복되는 코드를 함수로 분리해 모듈화 하는 것과 같은 방법들이 존재하지만 코딩 테스트는 협업을 위한 코드가 아닌 개인이 작성하는 코드이기에 가독성이 떨어지더라도 본인의 논리만 확고하다면 큰 상관이 없습니다.(불편하더라도 작성을 완료하고 수정하는 것이 더 빠른 경우가 많습니다.)

A2 & A3

2번과 3번 질문은 말씀해주신 것과 같이 배열을 순회하는 중 배열이 수정되어 발생한 현상입니다. 이를 해결하기 위한 방법으로 배열을 역으로 순회하는 방법을 사용하셨는데 이 또한 좋은 방법입니다. 다른 방법으로는 배열을 순회하는 과정 중 i를 수정하는 것입니다. 이 때 주의해야 할 점으로는 파이썬의 for i in range()와 같은 문법은 range()를 통해 반환받은 iterable 객체를 순회하는 방식으로 작동하기에 for문 내부에서 i를 수정해도 영향을 미치지 못합니다. 그렇기에 while문을 사용하는 것과 같은 방법으로 작성할 수 있습니다.

또는 삭제해야 할 요소를 미리 기록하고 for문이 종료된 이후 한번에 삭제하는 방법도 가능합니다.

A4

강의에서 사용한 코드에서는 유일성을 검사하기 위해 배열에 담은 속성 튜플들과 set으로 변형한 속성 튜플들의 길이를 비교했습니다. python은 set을 사용할 때 해시를 사용하며 이는 O(1)의 탐색 속도를 보장합니다. 그렇기에 20개의 튜플을 비교하더라도 각각 2개씩 비교하는 것이 아닌 set에 20개의 튜플을 담고 길이를 비교하는 것이기에 20으로 설명드렸습니다.

A5

최소성은 한개의 속성이 한개의 후보키에 사용되었다고 해서 다른 후보키에서 사용을 못하는 것이 아닌 후보키로서 사용 가능한 최소한의 속성을 사용하는 성질입니다. 질문주신 것과 같이 한 개의 속성이 후보키에 사용되어 이후 후보키 선정에서 제외된다면 (1,2), (1,3)이 후보키로 선정될 수 있을 때 1이 배제되어 (1,3)을 후보키로 선정하지 못하는 문제가 발생합니다.

→ (1,3)이 후보키가 될 수 없는 이유는 (1) 로만 구성된 후보키가 무조건 (1,3)을 최소성 테스트에서 이기기(?) 때문 아닌가요? (즉, (1)로만으로도 유일하게 식별가능하면 굳이 (1,3) 의 속성 조합을 포함하지 않아도 된다는 의미) 제가 뭔가 말을 잘못이해하고 있는건가요?

A6

테스트 케이스는 주로 코너 케이스를 고려하게 됩니다. 하지만 문제의 코너 케이스를 즉시 알아차리는 것은 힘듭니다. 저의 경우 앞서 설명드린 디버깅을 통해 각각의 함수가 제가 원하는 기능대로 적절히 흘러가고 있는지, 각 함수가 서로 충돌하지 않는지, 각 테스트 케이스가 제가 의도한 흐름대로 돌아가고 있는지를 먼저 파악합니다. 이 과정에서 잘못 흘러가는 요소를 발견하여 테스트 케이스를 추가하는 경우가 종종 있습니다.
이후 문제를 보며 제가 문제를 제대로 파악한게 맞는지, 문제 조건 중 놓친 조건이 있는지를 다시 파악합니다. 저 또한 후보키 문제를 처음 접했을 때 후보키 조합을 인덱스가 아닌 값으로 설정하여 속성은 다르지만 값이 같은 경우를 파악하지 못했고 문제 조건을 확인하는 중 해당 가능성을 떠올리고 수정하게 되었습니다.

대부분의 코너 케이스는 문제를 제대로 파악하는 과정을 통해 찾을 수 있습니다. 하지만 문제를 푸는 과정에서 바로 코너 케이스를 찾기는 힘들며 그렇기에 다시 한 번 디버깅의 중요성을 강조드리고 싶습니다.

KissNode

어제보다 더, 내일보다 덜.

이전 포스트

[노씨데브 킬링캠프] 1주차 - 문제풀이: ★메뉴 리뉴얼★

다음 포스트

[노씨데브 킬링캠프] 1주차 - 문제풀이: ★후보키★