Python - Web Crawling

갓김치 · November 26, 2020

201126

Python Class


http client
- https://docs.python.org/ko/3/library/http.client.html
requests, bs4
- https://rednooby.tistory.com/97
- https://dgkim5360.tistory.com/entry/python-requests
Search
- https://developers.kakao.com/docs/latest/ko/daum-search/dev-guide#search-image
XML string parsing
- https://m.blog.naver.com/wideeyed/221843262329
- https://wikidocs.net/21140
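
To make the XML string parsing topic above concrete, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. The XML payload, its tag names (`stocks`, `stock`, `name`, `price`), and the prices are made up for illustration; they are not taken from any real API response.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload; tag names and prices are assumptions for illustration only.
XML_TEXT = """
<stocks>
  <stock code="005930"><name>Samsung Electronics</name><price>68000</price></stock>
  <stock code="066570"><name>LG Electronics</name><price>87000</price></stock>
</stocks>
"""


def parse_stocks(xml_text):
    """Parse an XML string into a list of dicts, one per <stock> element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for stock in root.findall("stock"):
        rows.append({
            "code": stock.get("code"),               # attribute value
            "name": stock.findtext("name"),          # text of the <name> child
            "price": int(stock.findtext("price")),   # text of <price>, converted to int
        })
    return rows


if __name__ == "__main__":
    for row in parse_stocks(XML_TEXT):
        print(row)
```

`ET.fromstring` works on a string already held in memory, so the same function could be fed the body of an HTTP response once it has been decoded to text.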

# Samsung Electronics stock

Save one price record to the DB every minute, then later have an AI analyze the stored data.
https://pyther.tistory.com/11?category=638814
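
The once-a-minute collection idea could be sketched roughly as follows. This is only an illustration under several assumptions: it uses the standard library's `sqlite3` as a stand-in for the real database, `fetch_price` is a hypothetical placeholder for the actual crawling step, the column names follow the `stock` table queried later in this post, and the timestamp format is a guess chosen so that `in_time[9:]` yields the time of day.

```python
import sqlite3
import time
from datetime import datetime


def create_table(conn):
    # Column names follow the stock table queried later in this post.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stock ("
        "s_name TEXT, s_code TEXT, s_price INTEGER, in_time TEXT)"
    )


def save_price(conn, name, code, price, now=None):
    """Insert one price sample together with a timestamp."""
    now = now or datetime.now()
    conn.execute(
        "INSERT INTO stock (s_name, s_code, s_price, in_time) VALUES (?, ?, ?, ?)",
        # Assumed format, e.g. "20201126 14:30:00"
        (name, code, price, now.strftime("%Y%m%d %H:%M:%S")),
    )
    conn.commit()


def collect(conn, fetch_price, name, code, samples, interval_sec=60):
    """Call fetch_price() `samples` times, sleeping interval_sec between calls."""
    for i in range(samples):
        save_price(conn, name, code, fetch_price())
        if i < samples - 1:
            time.sleep(interval_sec)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    # fetch_price is a placeholder; a real crawler would scrape the price here.
    collect(conn, fetch_price=lambda: 0, name="LG전자", code="066570",
            samples=2, interval_sec=0)
    print(conn.execute("SELECT * FROM stock").fetchall())
```

Passing the price source in as a function keeps the storage loop independent of how the price is scraped, so the same loop can be exercised with a dummy function before the crawler exists.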

# Selenium

https://beomi.github.io/2017/02/27/HowToMakeWebCrawler-With-Selenium/

# Plot graphs

https://bcho.tistory.com/1201
http://wooorazil.blogspot.com/2015/07/pymssql-pymssqloperationalerror-20017.html

Korean text comes back garbled when fetching rows from MSSQL with SELECT

```python
import pymssql

# Connect first, then create the cursor and run the query
conn = pymssql.connect(server="SERVER", user="USER", password="PW", database="DBNAME", charset="UTF-8")
cursor = conn.cursor()
cursor.execute("SELECT s_name, s_code, s_price, in_time FROM stock WHERE s_name = 'LG전자' ORDER BY in_time;")

time = []
lg = []
ss = []

# Fetch the rows one at a time and print them
row = cursor.fetchone()
while row:
    time.append(row[3][9:])
    lg.append(row[2])
    print(row[0])

    # Decoding: turn the garbled string back into bytes, then decode it as EUC-KR
    t = row[0]
    t = t.encode('ISO-8859-1')
    t = t.decode('EUC-KR')
    print(t)
    row = cursor.fetchone()

conn.close()
```
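
The decoding step above can be reproduced without a database. The following is a minimal sketch that assumes the garbling comes from EUC-KR bytes being interpreted as ISO-8859-1, which is what the encode/decode pair above implies; `fix_mojibake` is a hypothetical helper name, not part of pymssql.

```python
def fix_mojibake(text, wrong="ISO-8859-1", actual="EUC-KR"):
    """Undo a wrong decode: turn the str back into bytes, then decode correctly."""
    return text.encode(wrong).decode(actual)


if __name__ == "__main__":
    original = "LG전자"
    # Simulate the bug: EUC-KR bytes interpreted as ISO-8859-1.
    garbled = original.encode("EUC-KR").decode("ISO-8859-1")
    print(repr(garbled))
    print(fix_mojibake(garbled))
```

The round trip is lossless because ISO-8859-1 maps every byte value to exactly one character, so re-encoding with it recovers the original bytes.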
# Saving to Excel
https://m.blog.naver.com/pmw9440/221849471131
## Options when saving to Excel
https://wikidocs.net/43282
- To keep the index out of the output, add the ``index=False`` option
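```python
import pandas as pd

# xlxs_dir is the output file path and raw_data1 is an existing DataFrame
with pd.ExcelWriter(xlxs_dir) as writer:
    raw_data1.to_excel(writer, sheet_name='raw_data1', index=False)  # save to the raw_data1 sheet
```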

https://ponyozzang.tistory.com/617

Alternatively, use reset_index():

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice','Bob','Charlie','Dave','Ellen','Frank'],
                   'age': [24,42,18,68,24,30],
                   'state': ['NY','CA','CA','TX','CA','NY'],
                   'point': [64,24,70,70,88,57]}
                  )

# Set name as the index
df.set_index('name', inplace=True)
print(df)
#          age state  point
# name
# Alice     24    NY     64
# Bob       42    CA     24
# Charlie   18    CA     70
# Dave      68    TX     70
# Ellen     24    CA     88
# Frank     30    NY     57

# Change the index to state
df_rs = df.reset_index().set_index('state')
print(df_rs)
```
