뉴스 클러스터링

boooookreeeed·2021년 11월 1일

코딩테스트

목록 보기

2/10

문제 링크 :
https://programmers.co.kr/learn/courses/30/lessons/17677

내가 생각한 문제 풀이

[시도1]
1) 두글자씩 나누기
2) 문자열 정비

미해당하는 문자 탈락
대문자로 만들기
3) 합집합과 교집합
set 이용 ? 중복을 허용하기 때문에 불가능
dict로 각자 조사 -> 겹치는 것

시간 / 결과 / 패착

소요 시간 : 23분
결과(정확도) : 100/100

쉬웠다.. 당연함 레벨이 낮으니까

문제 해설

맞추긴 했는데 혹시 더 효율적인 코드가 있는지 확인해보려고 한다
Counter를 사용할 수 있다.

코드

# 23분
from collections import defaultdict
import math
def cutstr(str):
    strdict = defaultdict(int)
    for i in range(len(str)-1):
        temp_a = str[i]
        temp_b = str[i+1]

        if not temp_a.isalpha() or not temp_b.isalpha(): #알파벳 아니면
            continue
        else:
            if temp_a.islower():
                temp_a = temp_a.upper()
            if temp_b.islower():
                temp_b = temp_b.upper()
        temp = temp_a+temp_b
        strdict[temp] += 1
    return strdict


def solution(str1, str2):
    dictstr1 = cutstr(str1)
    dictstr2 = cutstr(str2)

    if len(dictstr1) + len(dictstr2) == 0:
        return 1 * 65536

    intersec = set(dictstr1).intersection(set(dictstr2))

    intercount = 0
    unioncount = 0

    for i in intersec:
        intercount += min(dictstr1[i], dictstr2[i])
        unioncount += max(dictstr1[i], dictstr2[i])

    for s1 in dictstr1:
        if s1 in intersec:
            continue
        else:
            unioncount += dictstr1[s1]

    for s2 in dictstr2:
        if s2 in intersec:
            continue
        else:
            unioncount += dictstr2[s2]

    return math.floor((intercount/unioncount)*65536)

추가 공부

Counter 모듈

from collections import Counter
test = ["hi", "hello", "hello", "bye"]
Counter(test)

출력 : Counter({'hello': 2, 'hi': 1, 'bye': 1})

Counter(test).most_common() # 데이터의 개수가 많은 순으로 정렬된 배열 리턴
Counter(test).most_common(1) # 가장 개수가 많은 k개의 데이터 리턴

tip for me

이 문제에서 합집합이 0인 경우에는 자카드 유사도를 1로 반환하고, 문제 마지막에 65536을 곱하므로 return 65536을 해야한다.
테스트케이스에 있었기 때문에 이걸 고려하지 않은 걸 알수 있었지만 만약에 테스트케이스가 불친절한 코딩테스트를 보게된다면 ? ? ㅜㅜ
그러니까 이런 특이 케이스들은 보통 문제 풀면서 메모를 한다. 문제 다풀고 메모의 특이조건까지 다 만족했는지 스스로 확인할 것

boooookreeeed

you can do

이전 포스트

다단계 칫솔 판매

다음 포스트