221001_HackerRank : Tree - Huffman Decoding

Cswยท2022๋…„ 10์›” 1์ผ
0

CodingTest

๋ชฉ๋ก ๋ณด๊ธฐ
2/6

๐ŸŒž Tree: Huffman Decoding

๐ŸšŠ ๋ฌธ์ œ ์„ค๋ช…

  • Huffman coding ์€ ์ฃผํŒŒ์ˆ˜์— ๋”ฐ๋ผ ๊ฐ€๋ณ€ ๊ธธ์ด์˜ ๋ถ€ํ˜ธ์–ด๋“ค์„ ๊ณ ์ • ๊ธธ์ด ์ž…๋ ฅ ๋ฌธ์ž์— ํ• ๋‹นํ•œ๋‹ค.
  • ๋นˆ๋„๊ฐ€ ๋†’์€ ๋ฌธ์ž์ผ์ˆ˜๋ก ์งง์€ ๋ถ€ํ˜ธ์–ด๋“ค์ด ํ• ๋‹น๋˜๊ณ  ๋นˆ๋„๊ฐ€ ๋‚ฎ์€ ๋ฌธ์ž์ผ์ˆ˜๋ก ๊ธด ๋ถ€ํ˜ธ์–ด๋“ค์ด ํ• ๋‹น๋ฉ๋‹ˆ๋‹ค.
  • ๋ฌธ์ž์˜ ๊ฒฝ๋กœ๋ฅผ ๋”ฐ๋ผ ์žˆ๋Š” ๋ชจ๋“  ๊ฐ€์žฅ์ž๋ฆฌ์—๋Š” ์ฝ”๋“œ ๋ฒˆํ˜ธ๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.
  • ํŠธ๋ฆฌ์˜ ์™ผ์ชฝ์— ์žˆ์œผ๋ฉด 0์ด ๋ฉ๋‹ˆ๋‹ค. ์˜ค๋ฅธ์ชฝ์— ์žˆ์œผ๋ฉด 1์ด ๋ฉ๋‹ˆ๋‹ค.
  • ์˜ค์ง leaf node๋งŒ์ด ๊ธ€์ž์™€ ๊ทธ ๋นˆ๋„์ˆ˜๋ฅผ ํฌํ•จํ•  ๊ฒƒ์ด๋‹ค. ๋‹ค๋ฅธ ๋ชจ๋“  node๋Š” ๋ฌธ์ž ๋Œ€์‹  null์„ ํฌํ•จํ•˜๋ฉฐ, ๋ชจ๋“  ๋…ธ๋“œ์™€ ๊ทธ ํ•˜์œ„ ๋ฌธ์ž์˜ ๋นˆ๋„ ์ˆ˜๋ฅผ ํฌํ•จํ•œ๋‹ค.
  • ์˜ˆ๋ฅผ ๋“ค์–ด ๋ฌธ์ž์—ด ABRACADABRA๋ฅผ ์ƒ๊ฐํ•ด ๋ณด์ž.
  • ๋ฌธ์ž์—ด์—๋Š” ์ด 11๊ฐœ์˜ ๋ฌธ์ž๊ฐ€ ์žˆ๊ณ , ์ด ์ˆซ์ž๋Š” ์ตœ์ข…์ ์œผ๋กœ ๊ฒฐ์ •๋œ ํŠธ๋ฆฌ์˜ ๋ฃจํŠธ์— ์žˆ๋Š” ์ˆ˜์™€ ์ผ์น˜ํ•ด์•ผ ํ•œ๋‹ค.
  • ์šฐ๋ฆฌ์˜ ์ฃผํŒŒ์ˆ˜๋Š” A=5, B=2, R=2, C=1, B=1 ์ด๋‹ค.
  • ๊ฐ€์žฅ ์ž‘์€ ๋‘ ๊ฐœ์˜ ์ฃผํŒŒ์ˆ˜๋Š” C์™€ D์— ํ•ด๋‹นํ•˜๋ฏ€๋กœ ๋‘˜ ๋‹ค 1๋กœ ๊ฐ’์ด ๊ฐ™์œผ๋ฏ€๋กœ ์ด ๋‘ ์ฃผํŒŒ์ˆ˜๋กœ ํŠธ๋ฆฌ๋ฅผ ๋งŒ๋“ ๋‹ค.
  • ๋ฃจํŠธ ๋…ธ๋“œ๋Š” ํ•˜์œ„ ๋…ธ๋“œ ์นด์šดํŠธ์˜ ํ•ฉ๊ณ„(์ด ๊ฒฝ์šฐ์—๋Š” 1+1 =2)๋ฅผ ํฌํ•จํ•œ๋‹ค.
  • ์™ผ์ชฝ ๋…ธ๋“œ๋Š” ์ฒ˜์Œ ์ ‘ํ•˜๋Š” ๋ฌธ์ž C๋ฅผ, ์˜ค๋ฅธ์ชฝ ๋…ธ๋“œ๋Š” D๋ฅผ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค.
  • ๋‹ค์Œ์œผ๋กœ ์šฐ๋ฆฌ๋Š” ๋ฐฉ๊ธˆ ๋งŒ๋“  ํŠธ๋ฆฌ, ๋ฌธ์ž B, ๋ฌธ์ž R์˜ 3๊ฐ€์ง€ ํ•ญ๋ชฉ์„ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค.
  • ํŠธ๋ฆฌ๊ฐ€ ๋จผ์ € ์™”๊ธฐ ๋•Œ๋ฌธ์— ์ƒˆ ๋ฃจํŠธ ๋…ธ๋“œ์˜ ์™ผ์ชฝ์— ์žˆ๊ณ , B๋Š” ์˜ค๋ฅธ์ชฝ์œผ๋กœ ๊ฐ€๊ฒŒ ๋œ๋‹ค.
  • ํŠธ๋ฆฌ๊ฐ€ ์™„๋ฃŒ๋  ๋•Œ๊นŒ์ง€ ๋ฐ˜๋ณตํ•œ ๋‹ค์Œ, ๊ฐ€์žฅ์ž๋ฆฌ์— 1๊ณผ 0์„ ์ž…๋ ฅ.
  • ์™„์„ฑ๋œ ๊ทธ๋ž˜ํ”„๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  • ์ž…๋ ฅ๋œ ๋ฌธ์ž๋Š” ์˜ค์ง leaf node์—๋งŒ ์žˆ๋‹ค.
  • ๋‚ด๋ถ€ ๋…ธ๋“œ์˜ ๋ฌธ์ž ๊ฐ’์€ Null์„ ์˜๋ฏธํ•œ๋‹ค.
  • ์šฐ๋ฆฌ๋Š” ์บ๋ฆญํ„ฐ์— ๋Œ€ํ•œ ์šฐ๋ฆฌ์˜ ๊ฐ’์ด ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.
      A - 0
      B - 111
      C - 1100
      D - 1101
      R - 10
  • Huffman encoding๋œ ๋ฌธ์ž์—ด์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.
      A B    R  A C     A D     A B    R  A
      0 111 10 0 1100 0 1101 0 111 10 0
      or
      01111001100011010111100
  • ๋ชจํ˜ธ์„ฑ์„ ํ”ผํ•˜๊ธฐ ์œ„ํ•ด Huffman encoding์€ ์ ‘๋‘์‚ฌ ์—†๋Š” ์ธ์ฝ”๋”ฉ ๊ธฐ๋ฒ•์ด๋‹ค. ๋ถ€ํ˜ธ์–ด๋Š” ๋‹ค๋ฅธ ๋ถ€ํ˜ธ์–ด์˜ ์ ‘๋‘์‚ฌ๋กœ ๋‚˜ํƒ€๋‚˜์ง€ ์•Š๋Š”๋‹ค
  • ์ธ์ฝ”๋”ฉ๋œ ๋ฌธ์ž์—ด์„ ๋””์ฝ”๋”ฉํ•˜๋ ค๋ฉด 0๊ณผ 1์„ ๋”ฐ๋ผ leaf node๊นŒ์ง€ ์ด๋™ํ•œ ํ›„ ๋ฌธ์ž๋ฅผ ๋ฐ˜ํ™˜ํ•œ๋‹ค.
  • Huffman tree์˜ ๋ฃจํŠธ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ์™€ ๋””์ฝ”๋”ฉํ•  ์ด์ง„ ์ฝ”๋“œํ™”๋œ ๋ฌธ์ž์—ด์ด ์ œ๊ณต๋˜๋ฉฐ, ๋””์ฝ”๋”ฉ๋œ ๋ฌธ์ž์—ด์„ ์ถœ๋ ฅํ•ด์•ผ ํ•œ๋‹ค.

์ž…๋ ฅ ์˜ˆ์‹œ

s="1001011"

์ถœ๋ ฅ ์˜ˆ์‹œ

ABACA

์„ค๋ช…

S="1001011"
Processing the string from left to right.
S[0]='1' : we move to the right child of the root. We encounter a leaf node with value 'A'. We add 'A' to the decoded string.
We move back to the root.

S[1]='0' : we move to the left child. 
S[2]='0' : we move to the left child. We encounter a leaf node with value 'B'. We add 'B' to the decoded string.
We move back to the root.

S[3] = '1' : we move to the right child of the root. We encounter a leaf node with value 'A'. We add 'A' to the decoded string.
We move back to the root.

S[4]='0' : we move to the left child. 
S[5]='1' : we move to the right child. We encounter a leaf node with value C'. We add 'C' to the decoded string.
We move back to the root.

 S[6] = '1' : we move to the right child of the root. We encounter a leaf node with value 'A'. We add 'A' to the decoded string.
We move back to the root.

Decoded String = "ABACA"

๐Ÿšข ๊ธฐ๋ณธ ์ œ๊ณต Code

import queue as Queue

cntr = 0

class Node:
    def __init__(self, freq, data):
        self.freq = freq
        self.data = data
        self.left = None
        self.right = None
        global cntr
        self._count = cntr
        cntr = cntr + 1
    
    # lt (less than) : ๋‘ ๊ฐ์ฒด์˜ ๊ฐ’์˜ ํฌ๊ธฐ๋ฅผ ๋น„๊ตํ•  ๋•Œ ์‚ฌ์šฉ๋˜๋Š” ํ•จ์ˆ˜
    def __lt__(self, other):
        if self.freq != other.freq:
            return self.freq < other.freq
        return self._count < other._count

def huffman_hidden():#builds the tree and returns root
    q = Queue.PriorityQueue()

    
    for key in freq:
        q.put((freq[key], key, Node(freq[key], key) ))
    
    while q.qsize() != 1:
        a = q.get()
        b = q.get()
        obj = Node(a[0] + b[0], '\0' )
        obj.left = a[2]
        obj.right = b[2]
        q.put((obj.freq, obj.data, obj ))
        
    root = q.get()
    root = root[2]#contains root object
    return root

def dfs_hidden(obj, already):
    if(obj == None):
        return
    elif(obj.data != '\0'):
        code_hidden[obj.data] = already
        
    dfs_hidden(obj.right, already + "1")
    dfs_hidden(obj.left, already + "0")

๐Ÿšข Input & Output

ip = input()
freq = {}#maps each character to its frequency

cntr = 0

for ch in ip:
    if(freq.get(ch) == None):
        freq[ch] = 1
    else:
        freq[ch]+=1

root = huffman_hidden()#contains root of huffman tree

code_hidden = {}#contains code for each object

dfs_hidden(root, "")

if len(code_hidden) == 1:#if there is only one character in the i/p
    for key in code_hidden:
        code_hidden[key] = "0"

toBeDecoded = ""

for ch in ip:
   toBeDecoded += code_hidden[ch]

decodeHuff(root, toBeDecoded) # ์šฐ๋ฆฌ๊ฐ€ ๊ตฌํ˜„ํ•  ํ•จ์ˆ˜๋ฅผ ์—ฌ๊ธฐ์„œ ์‹คํ–‰

๐Ÿ“‘ Code

def decodeHuff(root, s):
	# ํ˜„์žฌ ๋…ธ๋“œ๋ฅผ root๋กœ ์ง€์ •
    current_node = root
    # ๋…ธ๋“œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์Œ“์„ ๋นˆ ๋ฌธ์ž์—ด์„ ์„ ์–ธ
    result = ''
    # s์˜ ์ˆซ์ž๋“ค์„ ์•ž์—์„œ๋ถ€ํ„ฐ ํ™•์ธํ•˜๋Š” ์ž‘์—…์„ ๋ฐ˜๋ณต
    for num in s:
    	# ํ•ด๋‹น ์ˆซ์ž๊ฐ’์ด 0์ด๋ผ๋ฉด
        if int(num) == 0:
        	# ํƒ์ƒ‰ํ•  ๋…ธ๋“œ์˜ ์œ„์น˜๋ฅผ ํ˜„์žฌ ๋…ธ๋“œ์˜ ์™ผ์ชฝ ๋…ธ๋“œ๋กœ ๋ณ€๊ฒฝ
            current_node = current_node.left
        # ํ•ด๋‹น ์ˆซ์ž๊ฐ’์ด 1์ด๋ผ๋ฉด
        else:
        	# ํƒ์ƒ‰ํ•  ๋…ธ๋“œ์˜ ์œ„์น˜๋ฅผ ํ˜„์žฌ ๋…ธ๋“œ์˜ ์˜ค๋ฅธ์ชฝ ๋…ธ๋“œ๋กœ ๋ณ€๊ฒฝ
            current_node = current_node.right
        # ๋งŒ์•ฝ ํ˜„์žฌ ๋…ธ๋“œ์˜ ์™ผ์ชฝ ๋…ธ์™€ ์˜ค๋ฅธ์ชฝ ๋…ธ๋“œ๊ฐ€ ๋ชจ๋‘ ์—†๋‹ค๋ฉด
        if current_node.left == None and current_node.right == None:
        	# ํ˜„์žฌ ๋…ธ๋“œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ result์— ์ถ”๊ฐ€
            result += current_node.data
            # ํ˜„์žฌ ๋…ธ๋“œ๋ฅผ ๋‹ค์‹œ root๋กœ ์ง€์ •
            # ๋ฐ˜๋ณต๋ฌธ์„ ๋Œ ๋•Œ, ํ•ญ์ƒ root ๋…ธ๋“œ๋ถ€ํ„ฐ ํƒ์ƒ‰์„ ์‹œ์ž‘ํ•˜๊ธฐ ์œ„ํ•จ.
            current = root
            
    print(result)

0๊ฐœ์˜ ๋Œ“๊ธ€