# 📘 Handling Dates in Python, Parsing CSV Files, and Visualization, All in One Place!

Yeeun · April 25, 2025


Three topics come up again and again when learning Python: working with dates (datetime), processing CSV files, and visualizing the resulting data.
This post covers them in the order listed below, with the relevant code for each step.


🧭 Table of Contents

  1. Working with dates (datetime)
  2. Converting between strings and dates
  3. Adding to dates / generating dates in a loop
  4. Reading a CSV file (pathlib + csv)
  5. Processing CSV data (lists, int conversion)
  6. Using DictReader
  7. How pandas differs
  8. Exception handling and a practical function
  9. Wrap-up summary (📋 table)

1. 📆 Working with dates (the datetime module)

✔️ Getting today's date

import datetime as dt

print(dt.datetime.today())   # e.g. 2025-04-25 13:11:59.123456

✔️ Converting today's date to a string

dt.datetime.today().strftime('%Y%m%d')   # '20250425'
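
As a small illustration of how the format codes shape the output, the sketch below formats one fixed moment in several ways. The variable names and the fixed timestamp are assumptions chosen so the results are reproducible; they are not from the original post.

```python
import datetime as dt

# A fixed moment is used instead of today() so the output is reproducible.
moment = dt.datetime(2025, 4, 25, 13, 11, 59)

compact = moment.strftime('%Y%m%d')            # '20250425'
dashed = moment.strftime('%Y-%m-%d')           # '2025-04-25'
with_time = moment.strftime('%Y-%m-%d %H:%M')  # '2025-04-25 13:11'
iso = moment.isoformat()                       # '2025-04-25T13:11:59'

print(compact, dashed, with_time, iso)
```

The compact form is handy for file names, while the dashed form is what many CSV files use in their date column.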

2. 🔁 Converting a string to a date

date_str = '2025-04-25'
parsed = dt.datetime.strptime(date_str, '%Y-%m-%d')
print(parsed)   # 2025-04-25 00:00:00

📝 If the format string passed to strptime does not match the input, it raises a ValueError, so take care!
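
To make the format-mismatch warning concrete, here is a minimal sketch. The helper name `try_parse` is hypothetical and only wraps `strptime` so that a mismatch yields `None` instead of an exception.

```python
import datetime as dt


def try_parse(date_str, fmt):
    """Return a datetime, or None when date_str does not match fmt."""
    try:
        return dt.datetime.strptime(date_str, fmt)
    except ValueError:
        return None


ok = try_parse('2025-04-25', '%Y-%m-%d')   # format matches the dashes
bad = try_parse('2025-04-25', '%Y%m%d')    # format expects no dashes

print(ok)   # 2025-04-25 00:00:00
print(bad)  # None
```

Whether returning `None` or letting the error propagate is better depends on the situation; for a one-off script, seeing the traceback is often the quicker way to spot a wrong format.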


3. 📅 Adding to dates & generating dates in a loop

✔️ The date one day from now

dt.datetime.today() + dt.timedelta(days=1)

✔️ Printing 100 days of dates starting today

today = dt.datetime.today()  # Read the clock once so every date shares the same base
for i in range(100):
    print((today + dt.timedelta(days=i)).strftime('%Y%m%d'))
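
The same idea can be packaged as a reusable function. This is only a sketch: `date_strings` and its parameters are hypothetical names, and a fixed start date replaces `today()` so the result can be checked.

```python
import datetime as dt


def date_strings(start, days, fmt='%Y%m%d'):
    """Return `days` consecutive dates beginning at `start`, formatted with `fmt`."""
    return [(start + dt.timedelta(days=i)).strftime(fmt) for i in range(days)]


start = dt.date(2025, 4, 25)
dates = date_strings(start, 100)

print(dates[0])    # 20250425
print(dates[6])    # 20250501 (timedelta handles the month boundary)
print(dates[-1])   # 20250802
print(len(dates))  # 100
```

Because `timedelta` does the calendar arithmetic, there is no need to handle month lengths or leap years by hand.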

4. 📁 Reading a CSV file (pathlib + csv.reader)

from pathlib import Path
import csv

path = Path('../the_csv_file_format/weather_data/death_valley_2021_full.csv')
lines = path.read_text().splitlines()
reader = csv.reader(lines)

✔️ Extracting the header and checking column indices

header_row = next(reader)

for index, column_header in enumerate(header_row):
    print(index, column_header)

🔍 This makes it easy to see which kind of data each column holds.
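
One way to build on the index listing is to keep a name-to-index mapping, so later code does not rely on hard-coded numbers. The sketch below uses a made-up two-line sample in memory instead of the real weather file; the station name, column set, and values are all assumptions for illustration.

```python
import csv
import io

# Hypothetical sample data; the real file has different columns and values.
sample = io.StringIO(
    '"STATION","NAME","DATE","TMAX","TMIN"\n'
    '"STATION_X","EXAMPLE VALLEY, CA US","2021-01-01","71","51"\n'
)

reader = csv.reader(sample)
header_row = next(reader)

# Map each column name to its position in the row.
column_index = {name: i for i, name in enumerate(header_row)}

first_row = next(reader)
print(column_index['TMAX'], first_row[column_index['TMAX']])  # 3 71
```

Note that the quoted name field contains a comma, and `csv.reader` still treats it as a single column.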


5. 📊 Processing CSV data (numeric conversion)

csv.reader returns each row as a list of strings. To do arithmetic on a value, you have to convert it yourself with int().

highs = []
for row in reader:
    high = int(row[4])  # Column 4: daily high temperature
    highs.append(high)
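
A plain `int()` call fails as soon as a cell is empty. The following sketch shows one way to tolerate that; the sample rows are invented, and the column positions are assumptions that would need adjusting for a real file.

```python
import csv

# Invented rows; the second reading is deliberately missing.
lines = [
    'DATE,TMAX',
    '2021-01-01,71',
    '2021-01-02,',
    '2021-01-03,68',
]

reader = csv.reader(lines)  # csv.reader accepts any iterable of strings
next(reader)                # skip the header

highs = []
for row in reader:
    try:
        highs.append(int(row[1]))
    except ValueError:
        # int('') raises ValueError, so empty cells end up here.
        print(f'Missing data for {row[0]}')

print(highs)  # [71, 68]
```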

6. 📑 Making it easier with DictReader!

csv.DictReader returns each row as a dictionary keyed by the header names.

with open('../the_csv_file_format/weather_data/death_valley_2021_full.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['DATE'])  # Access the value directly by column name

✅ The code becomes much more readable!
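
DictReader values are still strings, so conversion is needed just as with csv.reader. Here is a minimal sketch that converts each row into typed values; the in-memory sample and the `records` structure are assumptions for illustration.

```python
import csv
import datetime as dt
import io

# Invented sample with the column names used in this post.
sample = io.StringIO(
    'DATE,TMAX,TMIN\n'
    '2021-01-01,71,51\n'
    '2021-01-02,69,48\n'
)

records = []
for row in csv.DictReader(sample):
    records.append({
        'date': dt.datetime.strptime(row['DATE'], '%Y-%m-%d').date(),
        'tmax': int(row['TMAX']),
        'tmin': int(row['TMIN']),
    })

hottest = max(records, key=lambda r: r['tmax'])
print(hottest['date'], hottest['tmax'])  # 2021-01-01 71
```

Accessing `row['TMAX']` keeps working even if the column order in the file changes, which index-based access does not.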


7. 🐼 How pandas differs

| Item | csv.reader | pandas.read_csv |
|---|---|---|
| Data types | Strings (manual conversion needed) | Numeric columns converted automatically |
| Access | By index (row[4]) | By column name (df['TMAX']) |
| Features | Basic | Statistics, missing-value handling, plotting |
| Returns | Rows as lists | DataFrame |

import pandas as pd

df = pd.read_csv('../the_csv_file_format/weather_data/death_valley_2021_full.csv')
print(df['TMAX'].values)  # Parsed as a numeric dtype automatically (int64, or float64 if values are missing)

8. ⚠️ A practical function with exception handling

def load_date(filename):
    import datetime as dt
    import csv

    with open(filename) as f:
        reader = csv.reader(f)
        next(reader)  # Skip header

        tdate, tmin, tmax = [], [], []

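        for line in reader:
            try:
                # The format must match the file's DATE column;
                # use '%Y-%m-%d' if the dates look like 2021-01-01.
                date = dt.datetime.strptime(line[2], '%Y%m%d')
                # Column indices depend on the file; check them against the header.
                t_min = int(line[4])
                t_max = int(line[5])
            except (ValueError, IndexError):
                continue  # Skip rows with missing or malformed values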
            else:
                tdate.append(date)
                tmin.append(t_min)
                tmax.append(t_max)

    return tdate, tmin, tmax
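
Silently skipping bad rows can hide a wrong date format or column index, because every row would then be dropped without any message. The variant below is a sketch of one way to make that visible: it also returns the skipped line numbers. The function name `load_rows`, its parameters, and the sample rows are all hypothetical.

```python
import csv
import datetime as dt


def load_rows(lines, date_col=0, tmin_col=1, tmax_col=2, date_fmt='%Y-%m-%d'):
    """Parse rows into (date, t_min, t_max) tuples and record which rows failed."""
    reader = csv.reader(lines)
    next(reader)  # skip the header

    good, skipped = [], []
    # start=2 because line 1 of the file is the header
    for line_no, row in enumerate(reader, start=2):
        try:
            date = dt.datetime.strptime(row[date_col], date_fmt)
            t_min = int(row[tmin_col])
            t_max = int(row[tmax_col])
        except (ValueError, IndexError) as err:
            skipped.append((line_no, str(err)))
        else:
            good.append((date, t_min, t_max))
    return good, skipped


# Invented sample: line 3 has an empty TMIN, line 4 is missing a column.
lines = ['DATE,TMIN,TMAX', '2021-01-01,51,71', '2021-01-02,,69', '2021-01-03,50']
good, skipped = load_rows(lines)

print(len(good))                 # 1
print([n for n, _ in skipped])   # [3, 4]
```

If `skipped` turns out to contain nearly every line, the format string or the column indices are the first things to check.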

9. 🧾 Wrap-up summary

| Task | How |
|---|---|
| Date → string | strftime() |
| String → date | strptime() |
| Adding to a date | + timedelta(days=1) |
| Reading CSV | csv.reader() / csv.DictReader() |
| Checking columns | enumerate(header_row) |
| String → number | int(row[4]) |
| pandas differences | Automatic type conversion, .values support |

📈 Bonus: a simple visualization example

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

plt.plot(x, y, label='sin(x)')
plt.xlabel("x (radians)")
plt.ylabel("sin(x)")
plt.grid()
plt.legend()
plt.title("Sine Curve")
plt.show()

✅ Summary

This post covered how to work with dates in Python, how to parse CSV data safely, how pandas differs from the csv module, and a simple visualization.

Once this workflow is familiar, tasks such as data analysis, weather-data visualization, and statistical processing become much quicker to pick up!

