[AI] 🐸 Python Data Structure

Madeline 👩🏻‍💻 · December 6, 2022

Written based on the Naver Boostcourse lecture [인공지능 기초 다지기] (AI fundamentals).

Data Structures

Basic Python data structures

  • Stack, queue
  • Tuple, set
  • Dictionary
  • collections module

1. Stack

  • Last In First Out (LIFO)
  • Push data in (input), pop data out (output)
# Stack on a list: append() pushes, pop() removes the most recent item
a = [1, 2, 3, 4, 5]
a.append(10)
# [1, 2, 3, 4, 5, 10]
a.append(20)
# [1, 2, 3, 4, 5, 10, 20]
a.pop()
# returns 20; a is now [1, 2, 3, 4, 5, 10]

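2. Queue

  • First In First Out (FIFO)
  • The opposite concept to a stack
# Queue on a list: append() adds at the back, pop(0) removes from the front
a = [1, 2, 3, 4, 5]
a.append(10)
a.append(20)
a.pop(0)
# returns 1; a is now [2, 3, 4, 5, 10, 20]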

3. Tuple

  • A list whose values cannot be changed
  • Apart from changing values, it supports the same read operations as a list (indexing, slicing, +, *, in)
  • Used to store data that must not change while the program runs, which prevents errors caused by mistakes
  • Declared with ()
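
A minimal sketch of the points above. The variable names are only for illustration and are not from the lecture: reading works like a list, while assigning to an element raises TypeError.

```python
# Tuples are declared with parentheses (the name `t` is illustrative).
t = (1, 2, 3)

# Read-only operations behave like a list.
first = t[0]            # indexing
tail = t[1:]            # slicing returns a new tuple
joined = t + (4, 5)     # concatenation returns a new tuple
repeated = t * 2        # repetition returns a new tuple

# Assigning to an element is rejected, which guards against accidental edits.
try:
    t[0] = 10
    changed = True
except TypeError:
    changed = False

# A one-element tuple needs a trailing comma; (1) alone is just the int 1.
single = (1,)

print(first, tail, joined, repeated, changed, single)
```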

4. Set

  • Unordered, no duplicates
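
A short sketch of what "unordered, no duplicates" means in practice, plus the basic set operations. The values are arbitrary sample data, not from the lecture.

```python
# Duplicates are dropped when a set is built from a list.
s = set([1, 2, 3, 1, 2, 3])

s.add(4)        # add one element
s.discard(1)    # remove an element; no error if it is absent

s1 = {1, 2, 3, 4, 5}
s2 = {3, 4, 5, 6, 7}

union = s1 | s2          # elements in either set
intersection = s1 & s2   # elements in both sets
difference = s1 - s2     # elements only in s1

# sorted() is used for printing because a set has no defined order.
print(sorted(s), sorted(union), sorted(intersection), sorted(difference))
```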

5. Dictionary

  • Stores each piece of data together with a value that identifies it
  • identifier (key), value
  • Look up a value by its key (similar to a hash table in other languages)
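
A small sketch of key-based lookup. The dictionary contents are sample data chosen for illustration, not something from the lecture.

```python
# Keys identify values (sample data for illustration only).
country_code = {"America": 1, "Korea": 82, "China": 86, "Japan": 81}

korea = country_code["Korea"]            # look up a value by key
country_code["Germany"] = 49             # add a new key-value pair
missing = country_code.get("France")     # .get returns None instead of raising KeyError
has_korea = "Korea" in country_code      # membership test checks keys

# items() yields (key, value) pairs.
for name, code in country_code.items():
    print(name, code)
```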

6. Lab - Dict

Let's practice data analysis using the dict type.

I downloaded the csv file provided with the lecture materials and worked from that.

The tool I used was VS Code!!
I started in Jupyter Notebook, but it didn't work, so I switched to VS Code.

  • command_counter: the variable that holds the counts
import csv

command_data = []
with open('command_data.csv', 'r', encoding="utf8") as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in spamreader:
        command_data.append(row)
command_counter = {}        # create the dict
for data in command_data:   # tally the list rows into the dict
    if data[1] in command_counter:
        command_counter[data[1]] += 1
    else:
        command_counter[data[1]] = 1

Up to this point the code only counts the data, so checking the result in the terminal shows how many times each key appeared.

def getKey(item):
    return item[1]          # sort by the count (second element)

dictlist = []               # convert the dict to a list
for key, value in command_counter.items():
    temp = [key, value]
    dictlist.append(temp)
sorted_dict = sorted(dictlist, key=getKey, reverse=True)
print(sorted_dict[:100])

This part copies each key and its count from command_counter into dictlist and sorts it. The list is sorted by count in descending order (reverse = True), and the top 100 entries are printed.
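
Since the csv file is not reproduced here, the same count-then-sort steps can be sketched on a small in-memory sample. The rows below are made up, and the `[user, command]` layout is an assumption based on the code reading `data[1]`.

```python
# Hypothetical rows standing in for command_data.csv: [user, command].
rows = [["u1", "ls"], ["u1", "cd"], ["u2", "ls"], ["u3", "vi"], ["u2", "ls"], ["u3", "cd"]]

command_counter = {}
for row in rows:
    command = row[1]
    # dict.get with a default of 0 replaces the if/else branch.
    command_counter[command] = command_counter.get(command, 0) + 1

# Sort (command, count) pairs by count, largest first, and keep the top 2.
top = sorted(command_counter.items(), key=lambda item: item[1], reverse=True)[:2]
print(top)
```

Sorting `items()` directly skips the intermediate list, and a lambda plays the role of getKey.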

7. Collections

  • Python built-in data structures that extend list, tuple, and dict
  • Provide convenience and better runtime efficiency
  • Examples from the module

- from collections import deque

  • A class that supports both stack and queue usage
  • Stores data more efficiently than a list (faster appends and pops at either end)
  • (Supports linked-list features such as rotate and reverse)

from collections import deque

deque_list = deque()
for i in range(5):
    deque_list.append(i)
print(deque_list)

//deque([0, 1, 2, 3, 4])

deque_list.appendleft(10)
print(deque_list)

//deque([10, 0, 1, 2, 3, 4])

deque_list.rotate(2)
print(deque_list)

//deque([3, 4, 10, 0, 1, 2])

deque_list.rotate(2)
print(deque_list)

//deque([1, 2, 3, 4, 10, 0])

print(deque(reversed(deque_list)))

//deque([0, 10, 4, 3, 2, 1])

deque_list.extend([5,6,7])
print(deque_list)
deque_list.extendleft([5,6,7])
print(deque_list)

//deque([1, 2, 3, 4, 10, 0, 5, 6, 7])
//deque([7, 6, 5, 1, 2, 3, 4, 10, 0, 5, 6, 7])

  • The efficient memory structure improves processing speed

//
Stack 0.1875 seconds
General List 0.546875 seconds
//
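
The code behind the timings above is not shown, so here is a separate sketch of one way to compare the two, assuming queue-style use (removing from the front). The function names and the size `n` are my own choices, and the numbers will differ from machine to machine.

```python
import timeit
from collections import deque

def drain_list(n):
    """Queue behavior on a list: pop(0) shifts every remaining element."""
    items = list(range(n))
    out = []
    while items:
        out.append(items.pop(0))
    return out

def drain_deque(n):
    """Queue behavior on a deque: popleft() takes constant time."""
    items = deque(range(n))
    out = []
    while items:
        out.append(items.popleft())
    return out

n = 10_000  # arbitrary size chosen for illustration
list_time = timeit.timeit(lambda: drain_list(n), number=5)
deque_time = timeit.timeit(lambda: drain_deque(n), number=5)
print(f"list  pop(0):    {list_time:.4f} seconds")
print(f"deque popleft(): {deque_time:.4f} seconds")
```

Both functions return the same items in the same order; only the cost of removing from the front differs.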

- from collections import OrderedDict

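  • Unlike a dict in older Python versions, it returns items in the order they were inserted
  • Since Python 3.7 a plain dict is also guaranteed to keep insertion order (in CPython 3.6 this was an implementation detail), so OrderedDict is rarely needed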
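
A quick sketch of the ordering behavior, and of the two things OrderedDict still offers over a plain dict: move_to_end and order-sensitive equality. The keys and values are arbitrary.

```python
from collections import OrderedDict

# A plain dict keeps insertion order (guaranteed since Python 3.7).
d = {}
d["x"] = 100
d["y"] = 200
d["z"] = 300
plain_keys = list(d)

# OrderedDict adds order-aware helpers such as move_to_end.
od = OrderedDict(d)
od.move_to_end("x")          # move key "x" to the last position
moved_keys = list(od)

# Two OrderedDicts compare order-sensitively; plain dicts ignore order.
same_as_dict = {"a": 1, "b": 2} == {"b": 2, "a": 1}
same_as_od = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)

print(plain_keys, moved_keys, same_as_dict, same_as_od)
```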

defaultdict

  • Sets a default value for the values of a dict
  • Used when a new key needs an initial value
d = dict()
print(d["first"])
# -> KeyError

from collections import defaultdict

d = defaultdict(object)     # create a default dictionary
d = defaultdict(lambda: 0)  # default value = 0
print(d["first"])
# > 0
  • Useful when you want to count how many times each word appears in a passage
    -> Text mining approach: Vector Space Model
text = """A Press release is the quickest and easiest way to get free publicity. If well written, a press release can result in multiple published articles
about your firm and its products.""".lower().split()
print(text)

// ['a', 'press', 'release', 'is', 'the', 'quickest', 'and', 'easiest', 'way', 'to', 'get', 'free', 'publicity.', 'if', 'well', 'written,', 'a', 'press', 'release', 'can', 'result', 'in', 'multiple', 'published', 'articles', 'about', 'your', 'firm', 'and', 'its', 'products.']

Practice~
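
One possible way to do the word count with defaultdict, as a sketch rather than the lecture's exact solution. It reuses the text above; `word_count` and `ranked` are names I picked.

```python
from collections import defaultdict

text = """A Press release is the quickest and easiest way to get free publicity. If well written, a press release can result in multiple published articles
about your firm and its products.""".lower().split()

word_count = defaultdict(int)   # int() returns 0, so missing words start at 0
for word in text:
    word_count[word] += 1

# Sort (word, count) pairs by count, largest first.
ranked = sorted(word_count.items(), key=lambda item: item[1], reverse=True)
for word, count in ranked[:4]:
    print(word, count)
```

Because split() only cuts on whitespace, tokens like 'publicity.' keep their punctuation and are counted separately from 'publicity'.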

- from collections import Counter

  • Returns the counts of the elements in a sequence as a dict-like object
  • In short, it counts how many of each there are
from collections import Counter

ball_or_strike_list = ["B","S","S","B","S","B","B"]
c = Counter(ball_or_strike_list)
print(c)

c = Counter({'red':4, 'blue':2})
print(list(c.elements()))

//Counter({'B': 4, 'S': 3})
//['red', 'red', 'red', 'red', 'blue', 'blue']

  • Supports set-style operations
  • When c and d are Counter() objects:
    // subtraction (in place)
    c.subtract(d)
    // union (per-key maximum)
    c | d
    // addition
    c + d
    // intersection (per-key minimum)
    c & d
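
A small sketch of these operations on two concrete counters. The counts are arbitrary sample values.

```python
from collections import Counter

c = Counter(a=4, b=2)
d = Counter(a=1, b=3)

added = c + d        # sums the counts per key
union = c | d        # keeps the larger count per key
common = c & d       # keeps the smaller count per key

c.subtract(d)        # modifies c in place; negative counts are kept
print(added, union, common, c)
```

Note that subtract() changes c itself and can leave negative counts, while +, | and & return new Counter objects and drop counts that are not positive.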

- from collections import namedtuple

  • A way to store a data structure (struct) in tuple form
  • The names of the stored fields are specified in advance
from collections import namedtuple

Point = namedtuple('Point', ['x','y'])
p = Point(11, y = 22)
print(p[0] + p[1])
# > 33

x,y = p
print(x,y)
# > 11 22
print(p.x + p.y)
# > 33
print(Point(x=11, y=22))
# > Point(x=11, y=22)