TIL103. Algorithm : Hash Tables

ID์งฑ์žฌยท2021๋…„ 12์›” 29์ผ
0

Algorithm

๋ชฉ๋ก ๋ณด๊ธฐ
19/20
post-thumbnail

๐Ÿ“Œ ์ด ํฌ์ŠคํŒ…์—์„œ๋Š” Hash Tables์— ๋Œ€ํ•ด ์ •๋ฆฌํ•˜์˜€์Šต๋‹ˆ๋‹ค.



๐ŸŒˆ Hash Tables

๐Ÿ”ฅ Hash Tables ์ด๋ž€?

๐Ÿ”ฅ Hash Tables vs Arrays

๐Ÿ”ฅ Hash Collisions



1. Hash Tables ์ด๋ž€?

๐Ÿค” ์‚ฌ์ „์ด ์•ŒํŒŒ๋ฒณ ์ˆœ์œผ๋กœ ๋˜์–ด ์žˆ์ง€ ์•Š๋‹ค๋ฉด ์–ด๋–ป๊ฒŒ๋ ๊นŒ?

โœ”๏ธ ์‚ฌ์ „์ด ์•ŒํŒŒ๋ฒณ ์ˆœ์œผ๋กœ ๋˜์–ด ์žˆ์ง€ ์•Š๋‹ค๋ฉด, ๋‚ด๊ฐ€ ์›ํ•˜๋Š” ๋‹จ์–ด์˜ ๋œป์„ ์•Œ๊ธฐ ์œ„ํ•ด ์ฒ˜์Œ๋ถ€ํ„ฐ ์ˆœ์ฐจ์ ์œผ๋กœ ํ™•์ธํ•ด์•ผ ํ•œ๋‹ค. ํ•˜๋‚˜ํ•˜๋‚˜ ๋ชจ๋‘ ํ™•์ธํ•ด์•ผํ•˜๋Š” ์ด ๊ณผ์ •์€ ๋‹จ์ˆœ ํƒ์ƒ‰๊ณผ ๊ฐ™๊ณ , ์‹œ๊ฐ„ ๋ณต์žก๋„๋Š” O(N)์ด๋‹ค.

โœ”๏ธ ๋‹คํ–‰์Šค๋Ÿฝ๊ฒŒ๋„ ์‚ฌ์ „์€ ์•ŒํŒŒ๋ฒณ์ˆœ์œผ๋กœ ๋˜์–ด์žˆ๊ธฐ ๋•Œ๋ฌธ์— ์ด์ง„ ํƒ์ƒ‰์ด ๊ฐ€๋Šฅํ•˜๋‹ค. ์ด ๊ฒฝ์šฐ์— ์‹œ๊ฐ„ ๋ณต์žก๋„๋Š” O(logN)์ด๋‹ค.

โœ”๏ธ ์ด๋ฒˆ์—๋Š” ์‚ฌ์ „์ด ์•„๋‹ˆ๋ผ, ๋งˆํŠธ์—์„œ ๊ณ„์‚ฐ์›์„ ํ•˜๊ณ  ์žˆ๋‹ค๊ณ  ์ƒ๊ฐํ•ด๋ณด์ž. ๋ฌผํ’ˆ๋ณ„๋กœ ๊ฐ€๊ฒฉ์ด ์ ํ˜€์žˆ๋Š” ์žฅ๋ถ€๊ฐ€ ์žˆ์„ ๋•Œ, ๋‹จ์ˆœ ํƒ์ƒ‰์€ O(N)์ด๊ณ , ์ด์ง„ํƒ์ƒ‰์„ํ•˜๋ฉด O(logN)์˜ ์‹œ๊ฐ„๋ณต์žก๋„๊ฐ€ ํ•„์š”ํ•˜๋‹ค.

โœ”๏ธ 1์ดˆ์— ํ’ˆ๋ชฉ 10๊ฐœ๋ฅผ ์ฝ์„ ์ˆ˜ ์žˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๋ฉด, ํŒ๋งค์ค‘์ธ ๋ฌผ๊ฑด 1๊ฐœ์˜ ๊ฐ’์„ ์•Œ์•„๋‚ด๋Š”๋ฐ ์•„๋ž˜์™€ ๊ฐ™์€ ์‹œ๊ฐ„์ด ํ•„์š”ํ•˜๋‹ค.

โœ”๏ธ ๋”ฐ๋ผ์„œ O(logN)์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„๋ฅผ ๊ฐ€์ง€๋Š” ์ด์ง„ ํƒ์ƒ‰ ๋˜ํ•œ ์ด๋Ÿฌํ•œ ์ƒํ™ฉ์—์„œ๋Š” ํšจ์œจ์ ์ธ ๋ฐฉ์‹์ด ์•„๋‹ˆ๋‹ค. 10000๊ฐœ์˜ ํ’ˆ๋ชฉ์„ ํŒ๋งค ์ค‘์ด๋ผ๋ฉด ๊ณ„์‚ฐ์›์ด 1๊ฐœ์˜ ๋ฌผํ’ˆ์˜ ๊ฐ€๊ฒฉ๋งŒ์„ ํ™•์ธํ•˜๋Š”๋ฐ์—๋„ ์ตœ๋Œ€ 2์ดˆ์˜ ์‹œ๊ฐ„์„ ์†Œ์š”๋˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.

โœ”๏ธ ์ด์— ๋” ํšจ์œจ์ ์ธ ๋ฐฉ์‹์˜ ์ž๋ฃŒ๊ตฌ์กฐ๊ฐ€ ํ•„์š”ํ•˜๋‹ค. ์ด๋ฅผ ๊ฐ€๋Šฅํ•˜๊ฒŒํ•˜๋Š” ๊ฒƒ์ด ํ•ด์‰ฌ๋‹ค. ํ•ด์‰ฌ๊ฐ€ ์–ผ๋งˆ๋‚˜ ๋น ๋ฅธ์ง€๋Š” ๋’ค์—์„œ ์‚ดํŽด๋ณด๊ฒ ๋‹ค.

๐Ÿค” Hash Tables์˜ ํŠน์ง•

โœ”๏ธ ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์€ ํ‚ค(key)์— ๊ฐ’์„(value)๋ฅผ ์ €์žฅํ•˜๋Š” ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ๋กœ ํ‚ค(key) ๊ฐ’์„ ๋งˆ์น˜ index์ฒ˜๋Ÿผ ์ƒ์šฉํ•˜์—ฌ ๊ฐ’(value)์„ ์ฝ์„ ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ๋น ๋ฅด๋‹ค.

โœ”๏ธ ์ด๋ฏธ python์—์„œ๋Š” dictionary๋ผ๋Š” ๋ฐ์ดํ„ฐ ํƒ€์ž…์ด ์ด๋ฅผ ์ง€์›ํ•˜๊ณ  ์žˆ๋‹ค. ์ฆ‰, python์—์„œ๋Š” ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์„ ๋ณ„๋„ ๊ตฌํ˜„ํ•  ํ•„์š” ์—†์ด ๋”•์…”๋„ˆ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ณ , ๋”•์…”๋„ˆ๋ฆฌ ๋‚ด๋ถ€์ ์œผ๋กœ ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์˜ ๊ตฌ์กฐ๋ฅผ ๊ฐ–๊ณ  ์žˆ๋‹ค.

โœ”๏ธ ํ•ด์‰ฌ์˜ ๋Œ€ํ‘œ์ ์ธ ์žฅ์ ์œผ๋กœ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ Create/Read ํ•˜๋Š” ์†๋„๊ฐ€ ๋น ๋ฅด๊ณ , ์ด๋ฏธ Key๊ฐ€ ์กด์žฌํ•˜๋Š”์ง€ ์ค‘๋ณต์„ ์ฒ˜๋ฆฌํ•˜๊ธฐ๊ฐ€ ์‰ฝ๋‹ค.

โœ”๏ธ ๋‹จ์ ์œผ๋กœ๋Š” ์ถฉ๋ถ„ํ•œ ์ €์žฅ ๊ณต๊ฐ„์„ ํ™•๋ณดํ•ด๋‘์–ด์•ผํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๋น„๊ต์  ์ €์žฅ ๊ณต๊ฐ„์ด ๋งŽ์ด ํ•„์š”ํ•˜๊ณ , ์—ฌ๋Ÿฌ Key์— ํ•ด๋‹นํ•˜๋Š” ์ฃผ์†Œ๊ฐ€ ๋™์ผํ•  ๊ฒฝ์šฐ ์ถฉ๋Œ์„ ๋ง‰๊ธฐ ์œ„ํ•œ ๋ฐฉํŽธ์ด ํ•„์š”ํ•˜๋‹ค. ์ด๋Š” ์ถ”ํ›„ ๋‹ค๋ค„๋ณด๊ฒ ๋‹ค.

โœ”๏ธ ์ด์— ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์˜ ์ฃผ ์šฉ๋„๋Š” CRUD๊ฐ€ ๋นˆ๋ฒˆํ•œ ๊ฒฝ์šฐ, ๊ฒ€์ƒ‰์ด ์ž์ฃผ ์ผ์–ด๋‚˜๋Š” ๊ฒฝ์šฐ, ์บ์‰ฌ๋ฅผ ๊ตฌํ˜„ํ•  ๋•Œ ํ™œ์šฉ๋œ๋‹ค.

โœ”๏ธ ์˜ˆ๋ฅผ ๋“ค์–ด, ์Šค๋งˆํŠธํฐ์˜ ์ฃผ์†Œ๋ก์€ ํ•ด์‰ฌ๋กœ ๊ฐœ๋…๊ณผ ๋™์ผํ•˜๋‹ค. ์‚ฌ๋žŒ์„ ์ฃผ์†Œ๋ก์—์„œ ๊ฒ€์ƒ‰ํ•˜๋ฉด, ์ „ํ™”๋ฒˆํ˜ธ, ์ฃผ์†Œ, email ๋“ฑ์„ ๋ชจ๋‘ ๋น ๋ฅด๊ฒŒ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.

โœ”๏ธ ๋ฟ๋งŒ์•„๋‹ˆ๋ผ ์•ž์—์„œ ์ •๋ฆฌํ•œ Domain๊ณผ IP Address์˜ ๊ด€๊ณ„๋„ ์ด๋Ÿฌํ•œ ํ•ด์‰ฌ๋กœ ์ด๋ค„์ ธ์žˆ๋‹ค.

๐Ÿค” ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ๊ด€๋ จ ์šฉ์–ด

โœ”๏ธ hash : ์ž„์˜์— ๊ฐ’์„ ๊ณ ์ • ๊ธธ์ด๋กœ ๋ณ€ํ™˜ํ•˜๋Š” ๊ฒƒ

โœ”๏ธ hash table : ํ‚ค ๊ฐ’์˜ ์—ฐ์‚ฐ์— ์˜ํ•ด ์ง์ ‘ ์ ‘๊ทผ์ด ๊ฐ€๋Šฅํ•œ ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ

โœ”๏ธ hash function : Key์— ๋Œ€ํ•ด ์‚ฐ์ˆ  ์—ฐ์‚ฐ์„ ์ ์šฉํ•ด ํ•ด์‰ฌ ๊ฐ’์œผ๋กœ ๋ณ€ํ™˜ํ•ด์ฃผ๋Š” ํ•จ์ˆ˜

โœ”๏ธ hash value : Key๋ฅผ ํ•ด์‹ฑ ํ•จ์ˆ˜์— ๋Œ€์ž…ํ•˜์—ฌ ๋ฐ˜ํ™˜๋˜๋Š” ๊ฒƒ์„ ํ•ด์‰ฌ ๊ฐ’์ด๋ผํ•˜๊ณ , ์ด๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ…Œ์ด๋ธ”์—์„œ ํ•ด๋‹น Key์— ๋Œ€ํ•œ ๋ฐ์ดํ„ฐ ์œ„์น˜๋ฅผ ํƒ์ƒ‰

โœ”๏ธ slot : ํ•œ๊ฐœ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•  ์ˆ˜ ์žˆ๋Š” ๊ณต๊ฐ„



2. Hash Tables vs Arrays

๐Ÿค” hash์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋Š” ์–ด๋–ป๊ฒŒ ๋ ๊นŒ?

โœ”๏ธ ํ•ด์‰ฌ๋Š” key๊ฐ’์„ ํ†ตํ•ด value๋ฅผ ๋ฐ”๋กœ ์ฝ์–ด์˜ฌ ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ์žฅ๋ถ€์˜ ํ’ˆ๋ชฉ์ด ๋ฌดํ•œ๋Œ€๋กœ ๋Š˜์–ด๋‚œ๋‹คํ•ด๋„ O(1)์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„๋ฅผ ๊ฐ€์ง„๋‹ค.

โœ”๏ธ ์‚ฌ์‹ค, O(1)์ด๋ผ๋Š” ์‹œ๊ฐ„๋ณต์žก๋„๋Š” Key๊ฐ’์„ ํ•ด์‰ฌ ํ•จ์ˆ˜์— ๋„ฃ์—ˆ์„ ๋•Œ, ์ถฉ๋Œ์ด ๋ฐœ์ƒ๋˜์ง€ ์•Š์„ ๋•Œ(๋ฐ˜ํ™˜๋˜๋Š” ํ•ด์‰ฌ ๊ฐ’์ด ์ค‘๋ณต์ด ์—†์„ ๋•Œ)์— ํ•ด๋‹น๋œ๋‹ค. ์ด์— ๋Œ€ํ•ด์„œ๋Š” ์•„๋ž˜์„œ ๋‹ค์‹œ ์–˜๊ธฐํ•ด๋ณด๋„๋ก ํ•˜์ž.

๐Ÿค” ๊ฐ„๋‹จ ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ๊ตฌํ˜„ํ•˜๊ธฐ

โœ”๏ธ 10์ž๋ฆฌ์˜ ๊ฐ„๋‹จํ•œ ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์€ ๋ฐฐ์—ด์„ ์‚ฌ์šฉํ•˜์—ฌ ์•„๋ž˜์ฒ˜๋Ÿผ ๋งŒ๋“ค์–ด ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

# ๋นˆ ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ๋งŒ๋“ค๊ธฐ
hash_table = list([0 for i in range(10)])
print(hash_table) # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

โœ”๏ธ ๊ฐ„๋‹จ ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ๊ตฌํ˜„์ด๊ธฐ ๋•Œ๋ฌธ์— division ๋ฐฉ์‹์œผ๋กœ ํ•ด์‰ฌ ํ•จ์ˆ˜๊ฐ€ ์•„๋ž˜์ฒ˜๋Ÿผ ์กด์žฌํ•œ๋‹ค๊ณ  ๊ฐ€์ •ํ•ด๋ณด์ž.

# ๊ฐ„๋‹จ ํ•ด์‰ฌ ํ•จ์ˆ˜
# Division ๋ฐฉ์‹ : ๋‚˜๋ˆ„๊ธฐ๋ฅผ ํ†ตํ•œ ๋‚˜๋จธ์ง€ ๊ฐ’์„ ์‚ฌ์šฉํ•˜๋Š” ๊ธฐ๋ฒ•
def hash_func(key):
    return key % 5  # ๐Ÿ‘ˆ ์–ด๋–ค key๊ฐ’์ด ๋“ค์–ด์˜ค๋ฉด, ์ด๋ฅผ 5๋กœ ๋‚˜๋ˆˆ ๋‚˜๋จธ์ง€๋ฅผ ๋ฐ˜ํ™˜ํ•ด์š”!

โœ”๏ธ ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์— ๊ฐ’์„ ์ €์žฅํ•˜๋Š” ํ•จ์ˆ˜๋Š” ์•„๋ž˜ ์ฒ˜๋Ÿผ ์ž‘์„ฑํ•  ์ˆ˜ ์žˆ๋‹ค. key๊ฐ’์ด string์ด๊ธฐ ๋•Œ๋ฌธ์— ord๋กœ ์ˆซ์ž๋กœ ๋ณ€ํ™˜ ํ›„, index๊ฐ’์„ ๊ตฌํ•˜๊ธฐ ์œ„ํ•ด ํ•ด์‰ฌ ํ•จ์ˆ˜์— ๋„ฃ๋Š”๋‹ค.

def storage_data(data, value): # ๐Ÿ‘ˆ ํ‚ค์™€ ์ €์žฅํ•  ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ›์Œ
    key = ord(data[0]) # ๐Ÿ‘ˆ ํ‚ค๋ฅผ ํ•ด์‰ฌํ•จ์ˆ˜์— ๋„ฃ์–ด ์—ฐ์‚ฐํ•˜๊ธฐ ์œ„ํ•ด ์ˆซ์ž๋กœ ๋ณ€ํ™˜
    hash_address = hash_func(key) # ๐Ÿ‘ˆ ํ•ด์‰ฌ ํ•จ์ˆ˜ ์ž‘๋™ํ•˜์—ฌ ํ•ด์‰ฌ๊ฐ’(ํ•ด์‰ฌ์ฃผ์†Œ) ์–ป์Œ
    hash_table[hash_address] = value # ๐Ÿ‘ˆ ํ•ด์‰ฌ ์ฃผ์†Œ๋ฅผ index๋กœํ•˜์—ฌ ๋ฐ์ดํ„ฐ๋ฅผ ํ…Œ์ด๋ธ”์— ์ €์žฅ
storage_data('Jaewon', '010-1234-5678')
storage_data('Haezin', '010-4444-9999')
storage_data('Ginsu', '010-8989-1212')
# ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ์ €์žฅ ํ›„ ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ์ƒํƒœ ํ™•์ธ
print(hash_table)
# [0, '010-8989-1212', '010-4444-9999', 0, '010-1234-5678', 0, 0, 0, 0, 0]

โœ”๏ธ ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์— ๊ฐ’์„ ์ €์žฅํ–ˆ์œผ๋‹ˆ, ์ด๋ฅผ ์ฝ์–ด์˜ฌ ์ˆ˜ ์žˆ๋Š” ํ•จ์ˆ˜๋„ ์•„๋ž˜ ์ฒ˜๋Ÿผ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค.

# ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์—์„œ ํŠน์ • ์ฃผ์†Œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์ ธ์˜ค๋Š” ํ•จ์ˆ˜
def get_data(data):
    key = ord(data[0]) # ๐Ÿ‘ˆ ๋ฐ์ดํ„ฐ์—์„œ Key๋ฅผ ์ถ”์ถœ
    hash_address = hash_func(key) # ๐Ÿ‘ˆ key๋ฅผ ํ•ด์‰ฌ ํ•จ์ˆ˜์— ๋„ฃ์–ด ํ•ด์‰ฌ ์ฃผ์†Œ๋ฅผ ์–ป์Œ
    return hash_table[hash_address] # ๐Ÿ‘ˆ ํ•ด์‰ฌ ์ฃผ์†Œ๋ฅผ ํ†ตํ•ด ๋ฐฐ์—ด์—์„œ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ˜ํ™˜
print(get_data('Jaewon')) # 010-1234-5678
print(get_data('Haezin')) # 010-4444-9999
print(get_data('Ginsu')) # 010-8989-1212

๐Ÿค” ํ•ด์‰ฌ, ๋ฐฐ์—ด, ์—ฐ๊ฒฐ ๋ฆฌ์ŠคํŠธ์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„๋Š” ์–ด๋–ป๊ฒŒ ๋ ๊นŒ?

โœ”๏ธ ์‚ฌ์‹ค ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋Š” ํ‰๊ท ์ ์œผ๋กœ ๊ฐ€์ง€๋Š” ํ•ด์‰ฌ์˜ ์‹œ๊ฐ„๋ณต์žก๋„์™€ ์ตœ์•…์˜ ๊ฒฝ์šฐ์˜ ์‹œ๊ฐ„ ๋ณต์žก๋„๊ฐ€ ์กด์žฌํ•œ๋‹ค.

โœ”๏ธ ๊ฐ„๋‹จ ํ•ด์‰ฌ ํ…Œ์ด๋ธ”์˜ ์˜ˆ์‹œ๋ฅผ ํ†ตํ•ด ์ถฉ๋Œ์ด ์–ด๋–ป๊ฒŒ ๋ฐœ์ƒํ•˜๋Š”์ง€ ์‚ดํŽด๋ณด๋„๋ก ํ•˜๊ฒ ๋‹ค.

โœ”๏ธ ์˜ˆ๋ฅผ ๋“ค์–ด, key๊ฐ’์ด 'Jeawon'๊ณผ 'Jisu'๊ฐ€ ์žˆ๋‹ค๋ฉด, ๋‘˜๋‹ค 'J'๋ฅผ ord ํ•จ์ˆ˜๋กœ ์ฒ˜๋ฆฌํ•˜๊ธฐ ๋•Œ๋ฌธ์— 74๋ฅผ ๋ฐ˜ํ™˜๋ฐ›๋Š”๋‹ค. ์ด๋ฅผ ํ•ด์‰ฌ ํ•จ์ˆ˜์— ๋„ฃ์–ด 5๋กœ ๋‚˜๋ˆˆ ๋ชซ์€ ์„œ๋กœ ๋™์ผํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๋ฎ์–ด์”Œ์–ด์ง€๋Š” ์ถฉ๋Œ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•œ๋‹ค.

โœ”๏ธ ์ถฉ๋Œ์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ ๋ฐฉ๋ฒ•์œผ๋กœ Open Hashing๋ฐฉ์‹์˜ Chaining ๊ธฐ๋ฒ• ๋˜๋Š” Close Hashing ๋ฐฉ์‹์˜ Linear Probing ๊ธฐ๋ฒ•์„ ์‚ฌ์šฉํ•˜๋Š”๋ฐ, ์ด๋Ÿด ๊ฒฝ์šฐ, ํ•ด๋‹นํ•˜๋Š” key๊ฐ’์„ ์ฐพ๊ธฐ ์œ„ํ•ด ๋‹ค์‹œ ๋ฐ˜๋ณต์ด ํ•„์š”ํ•˜๊ธฐ ๋•Œ๋ฌธ์— O(N)์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋ฅผ ๊ฐ€์ง„๋‹ค.

โœ”๏ธ ๋”ฐ๋ผ์„œ ํ•ด์‰ฌ, ๋ฐฐ์—ด, ์—ฐ๊ฒฐ ๋ฆฌ์ŠคํŠธ์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋Š” ์•„๋ž˜์™€ ๊ฐ™๋‹ค.



3. Hash Collisions

๐Ÿค” Close Hashing ๊ธฐ๋ฒ•

โœ”๏ธ ํ•ด์‰ฌ ์ถฉ๋Œ์„ ํ•ด๊ฒฐํ•˜๋Š” ๋ฐฉ๋ฒ•์œผ๋กœ Close Hashing ๊ธฐ๋ฒ•์œผ๋กœ ๋Œ€ํ‘œ์ ์ธ ๊ธฐ์ˆ ์€ Chaining์ด๋‹ค.

โœ”๏ธ Chaining์€ ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ๊ณต๊ฐ„ ์™ธ์˜ ์ƒˆ๋กœ์šด ๊ณต๊ฐ„์„ ๋งˆ๋ จํ•˜์—ฌ ์ถฉ๋Œ์ด ๋ฐœ์ƒ ์‹œ ์—ฐ๊ฒฐ ๋ฆฌ์ŠคํŠธ ์ž๋ฃŒ ๊ตฌ์กฐ๋ฅผ ์‚ฌ์šฉํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๊ธฐ๋ฒ•์ด๋‹ค.

โœ”๏ธ ๋˜ํ•œ Chaining ๊ธฐ๋ฒ•์€ ๋ฐ์ดํ„ฐ๋ฅผ ์‹๋ณ„ํ•  ์ˆ˜ ์žˆ๋Š” key๊ฐ’์„ ํ•จ๊ป˜ ์ €์žฅํ•˜์—ฌ ์ด๋ฅผ ์‹๋ณ„ํ•˜๋Š”๋ฐ, ๋‹จ์ ์œผ๋กœ๋Š” ๊ณต๊ฐ„์ด ๋งŽ์ด ํ•„์š”ํ•˜๋‹ค.

# ํ•ด์‰ฌ ํ…Œ์ด๋ธ”
hash_table = list([0 for i in range(8)])
# ํ•ด์‰ฌ ํ‚ค ์ƒ์„ฑ
def get_key(data):
    return hash(data)
# ํ•ด์‰ฌ ํ•จ์ˆ˜
def hash_function(key):
    return key % 8
# ํ•ด์‰ฌ ๋ฐ์ดํ„ฐ ์ €์žฅ
def save_data(data, value):
    index_key = get_key(data) # ๐Ÿ‘ˆ index_key๋ผ๋Š” ๋ณ„๋„์˜ ๋ณ€์ˆ˜์— key๊ฐ’์„ ์ €์žฅ
    hash_address = hash_function(index_key)
    if hash_table[hash_address] != 0: # ๐Ÿ‘ˆ ์ด๋ฏธ ์กด์žฌํ•˜๋Š”์ง€ ํ™•์ธ
        for index in range(len(hash_table[hash_address])):
            if hash_table[hash_address][index][0] == index_key:
                hash_table[hash_address][index][1] = value
                return
        hash_table[hash_address].append([index_key, value])
    else: # hash_table[hash_address] == 0 ๐Ÿ‘ˆ ์—†์„ ๋•Œ
        hash_table[hash_address] = [[index_key, value]]
# ํ•ด์‰ฌ ๋ฐ์ดํ„ฐ ์กฐํšŒ
def read_data(data):
    index_key = get_key(data)
    hash_address = hash_function(index_key)
    if hash_table[hash_address] != 0:
        for index in range(len(hash_table[hash_address])):
            if hash_table[hash_address][index][0] == index_key:
                return hash_table[hash_address][index][1]
        return None
    else:
        return None
# ํ•ด์‰ฌ์ฃผ์†Œ ์ค‘๋ณต ๋ฐœ์ƒ <- ํ• ๋•Œ๋งˆ๋‹ค ๋ณ€ํ•จ
print(hash('Jaewon')%8) # 3
print(hash('Haezin')%8) # 2
print(hash('AaLa')%8)  # 3
# ์ €์žฅ
save_data('Jaewon', '01011112222')
save_data('Haezin', '01088889999')
save_data('AaLa', '01033331234')
print(hash_table)
# [0, 0, [[-3231082452735185358, '01088889999'], [630994963842225842, '01033331234']], [[-6162502635708335149, '01011112222']], 0, 0, 0, 0]
# ์กฐํšŒ
print(read_data('Jaewon')) # 01011112222
print(read_data('Haezin')) # 01088889999

๐Ÿค” Open Hashing

โœ”๏ธ Linear Probing ๊ธฐ๋ฒ•์€ Open Hashing์˜ ํ•œ ๊ธฐ์ˆ ๋กœ ํ•ด์‰ฌ ํ…Œ์ด๋ธ” ์ €์žฅ ๊ณต๊ฐ„ ์•ˆ์—์„œ ์ถฉ๋Œ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๋Š” ๋Œ€ํ‘œ์ ์ธ ๋ฐฉ์‹์ด๋‹ค.

โœ”๏ธ ๋˜ํ•œ Linear Probing ๊ธฐ๋ฒ•์€ ์ถฉ๋Œ ๋ฐœ์ƒ ์‹œ, hash adress์˜ ๋‹ค์Œ adress๋ถ€ํ„ฐ ๋งจ ์ฒ˜์Œ ๋‚˜์˜ค๋Š” ๋นˆ ๊ณต๊ฐ„์— ์ €์žฅํ•˜๋Š” ๋ฐฉ์‹์ด๊ธฐ ๋•Œ๋ฌธ์— ์ €์žฅ๊ณต๊ฐ„ ํ™œ์šฉ๋„๊ฐ€ ๋†’๋‹ค.

# ํ•ด์‰ฌ ํ…Œ์ด๋ธ”
hash_table = list([0 for i in range(8)])
# ํ•ด์‰ฌ ํ‚ค ์ƒ์„ฑ
def get_key(data):
    return hash(data)
# ํ•ด์‰ฌ ํ•จ์ˆ˜
def hash_function(key):
    return key % 8
# ํ•ด์‰ฌ ๋ฐ์ดํ„ฐ ์ €์žฅ
def save_data(data, value):
    index_key = get_key(data)
    hash_address = hash_function(index_key)
    if hash_table[hash_address] != 0:
        for index in range(hash_address, len(hash_table)):
            if hash_table[index] == 0:
                hash_table[index] = [index_key, value]
                return
            elif hash_table[index][0] == index_key: # ์—…๋ฐ์ดํŠธ
                hash_table[index][1] = value
    else:
        hash_table[hash_address] = [index_key, value]
# ํ•ด์‰ฌ ๋ฐ์ดํ„ฐ ์กฐํšŒ
def read_data(data):
    index_key = get_key(data)
    hash_address = hash_function(index_key)
    if hash_table[hash_address] != 0:
        for index in range(hash_address, len(hash_table)):
            if hash_table[index] == 0:
                return None
            elif hash_table[index][0] == index_key:
                return hash_table[index][1]
    else:
        return None
# ํ•ด์‰ฌ์ฃผ์†Œ ์ค‘๋ณต ๋ฐœ์ƒ <- ํ• ๋•Œ๋งˆ๋‹ค ๋ณ€ํ•จ
print(hash('Jaewon')%8) # 0
print(hash('Haezin')%8) # 0
print(hash('AaLa')%8)  # 7
# ์ €์žฅ
save_data('Jaewon', '01011112222')
save_data('Haezin', '01088889999')
save_data('AaLa', '01033331234')
print(hash_table)
# [[3214169218863054112, '01011112222'], [-291548128366538008, '01088889999'], 0, 0, 0, 0, 0, [-143120377966334057, '01033331234']]
# ์กฐํšŒ
print(read_data('Jaewon')) # 01011112222
print(read_data('Haezin')) # 01088889999
profile
Keep Going, Keep Coding!

0๊ฐœ์˜ ๋Œ“๊ธ€