Data is fetched using valid_company_list.csv.
valid_company_list.csv stores the symbols for which a yahooquery lookup returns data.
import
import pandas as pd
import numpy as np
import yfinance as yf
import yahooquery as yq
from yahooquery import Ticker
# `import datetime` was dropped: the line below rebinds the name `datetime` to the class anyway.
from datetime import datetime, timedelta
symbol_id = pd.read_csv("valid_company_list.csv", encoding='euc-kr')
symbol_id
valid_company_list.csv file format
Information related to the company's location, operations, and officers.
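Create a dict named asset_profile and fill it with the values returned for each symbol.
Some values do not exist for every company, so check for each key before adding it.
If there is no information about the company at all, the value comes back as a string.
If only specific values are missing, the corresponding keys are simply absent.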
https://yahooquery.dpguthrie.com/guide/ticker/modules/#asset_profile
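The two cases described above (a plain string when the company has no profile, and absent keys when only some values are missing) can be handled in one place. The following is a minimal sketch rather than the notebook's own code: `extract_profile_fields` and `PROFILE_FIELDS` are hypothetical names, and the sample payloads are made up to mimic the shapes described above.

```python
PROFILE_FIELDS = ("country", "industry", "sector", "phone", "website")

def extract_profile_fields(profile, fields=PROFILE_FIELDS, default="None"):
    """Return {field: value} for a profile dict, or None when no profile exists.

    `profile` is the per-symbol value of an asset_profile response:
    a dict on success, or a str message when there is no data.
    """
    if not isinstance(profile, dict):
        return None
    # Absent keys fall back to `default`, so every row has the same columns.
    return {field: profile.get(field, default) for field in fields}

# Made-up payloads for illustration only.
partial = {"country": "South Korea", "sector": "Industrials"}
missing = "No data found"

print(extract_profile_fields(partial))
print(extract_profile_fields(missing))
```

Returning None for the string case lets the caller decide what to do with the symbol, for example appending it to non_data_symbols.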
asset_profile = {
    "country": [],
    "industry": [],
    "sector": [],
    "phone": [],
    "website": [],
}
non_data_symbols = []
count = 0  # progress counter only
for idx, row in symbol_id.iterrows():
    symbol = row["symbol"]
    ticker = yq.Ticker(symbol, backoff_factor=1)
    print(symbol)
    # Read the property once and reuse it; each access may trigger a new request.
    profile = ticker.asset_profile[symbol]
    if isinstance(profile, dict):
        # Keys that are absent for this company fall back to the string "None".
        for key in asset_profile:
            asset_profile[key].append(profile.get(key, "None"))
    else:
        # A str here means no profile was returned for this symbol.
        non_data_symbols.append(symbol)
    count += 1
non_data_symbols
# Output
['094800.KS', '446070.KS', '109070.KS', '168490.KS']
print(len(asset_profile["country"]), len(asset_profile["industry"]), len(asset_profile["sector"]), len(asset_profile["phone"]), len(asset_profile["website"]))
df_symbol = pd.DataFrame.from_dict(asset_profile)
df_symbol
# Output
      country        industry                      sector              phone           website
0     South Korea    Rental & Leasing Services     Industrials         82 2 6363 9999  https://www.ajurental.com
1     South Korea    Chemicals                     Basic Materials     82 2 768 2923   https://www.aekyunggroup.co.kr
2     South Korea    Grocery Stores                Consumer Defensive  82 1 577 8007   https://www.bgfretail.com
3     South Korea    Department Stores             Consumer Cyclical   82 1 577 3663   https://www.bgf.co.kr
4     South Korea    Banks—Regional                Financial Services  82 5 1620 3000  https://www.bnkfg.com
...   ...            ...                           ...                 ...             ...
1146  United States  Utilities—Regulated Electric  Utilities           612 330 5500    https://www.xcelenergy.com
1147  United States  Software—Application          Technology          385 203 4999    https://www.qualtrics.com
1148  United States  Communication Equipment       Technology          847 634 6700    https://www.zebra.com
1149  United States  Software—Application          Technology          888 799 9666    https://www.zoom.us
1150  United States  Software—Infrastructure       Technology          408 533 0288    https://www.zscaler.com
Data related to upgrades / downgrades by companies for a given symbol(s)
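No data is available for Korean companies.
The result is returned as a DataFrame; the per-symbol results are concatenated and saved to a CSV file.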
https://yahooquery.dpguthrie.com/guide/ticker/modules/#grading_history
grading_history = pd.concat([grading_history, ticker.grading_history])
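Since symbols without data (such as the Korean ones) do not yield a DataFrame, it can help to filter before concatenating. A minimal sketch, assuming the per-symbol results have been gathered into a list: `combine_frames` is a hypothetical helper, and the `firm` and `toGrade` columns in the sample are illustrative only.

```python
import pandas as pd

def combine_frames(results):
    """Concatenate only the DataFrame results; skip str/dict error payloads."""
    frames = [r for r in results if isinstance(r, pd.DataFrame) and not r.empty]
    if not frames:
        return pd.DataFrame()
    # One concat at the end avoids re-copying the accumulated frame on every iteration.
    return pd.concat(frames)

# Made-up results: two symbols with data, one error message.
results = [
    pd.DataFrame({"firm": ["A"], "toGrade": ["Buy"]}),
    "No data found",
    pd.DataFrame({"firm": ["B", "C"], "toGrade": ["Hold", "Sell"]}),
]
combined = combine_frames(results)
print(combined)
```

The combined frame can then be written out with `combined.to_csv(...)` as in the rest of the notebook.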
Trend data related to the given symbol(s) index, specifically PE and PEG ratios
Used to obtain PE and PEG, but skipped here because the same values can be obtained from get_financial_data.
Obtain specific data from either cash flow, income statement, balance sheet, or valuation measures.
Calling get_financial_data raises an error.
Fetching valuation_measures, balance_sheet, income_statement, and cash_flow separately sometimes returns no data.
Instead, use all_financial_data to fetch everything, then classify the data afterwards.
all_data = ticker.all_financial_data()  # call once and reuse the result
if isinstance(all_data, pd.DataFrame):
    financial_data = pd.concat([financial_data, all_data])
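The classification step mentioned above could look roughly like the sketch below. It is only an illustration: `split_by_statement`, `STATEMENT_COLUMNS`, and `KEY_COLUMNS` are hypothetical names, and the column-to-statement mapping is an assumption that would need to be filled in from the columns actually present in the fetched data.

```python
import pandas as pd

# Assumed, incomplete mapping from statement name to column names.
STATEMENT_COLUMNS = {
    "income_statement": ["TotalRevenue", "NetIncome"],
    "balance_sheet": ["TotalAssets"],
    "cash_flow": ["FreeCashFlow"],
}
# Assumed identifying columns kept in every split.
KEY_COLUMNS = ["asOfDate", "periodType"]

def split_by_statement(financial_data, mapping=STATEMENT_COLUMNS, keys=KEY_COLUMNS):
    """Split one wide DataFrame into one DataFrame per statement."""
    parts = {}
    for name, columns in mapping.items():
        # Keep only the columns that exist, since not every company reports every item.
        present = [c for c in keys + columns if c in financial_data.columns]
        parts[name] = financial_data[present]
    return parts

# Made-up sample row; FreeCashFlow is deliberately absent.
sample = pd.DataFrame({
    "asOfDate": ["2022-12-31"],
    "periodType": ["12M"],
    "TotalRevenue": [100.0],
    "NetIncome": [10.0],
    "TotalAssets": [500.0],
})
parts = split_by_statement(sample)
print(parts["income_statement"])
```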
https://yahooquery.dpguthrie.com/guide/ticker/modules/#index_trend
Significant events related to a given symbol(s)
No data is available for Korean companies.
The data is provided as a DataFrame.
corporate_events = pd.concat([corporate_events, ticker.corporate_events])
aapl = Ticker('aapl')
aapl.news(5)
recommendations = {
"symbol_id" : list(),
"simillar_symbol_id" : list(),
"score" : list()
}
recommendation_list = list()
tickers = Ticker("001360.KS")  # assumption: `tickers` is not defined earlier in this section
for recommend in tickers.recommendations["001360.KS"]["recommendedSymbols"]:
    recommendations["symbol_id"].append("001360.KS")
    recommendations["simillar_symbol_id"].append(recommend["symbol"])
    recommendations["score"].append(recommend["score"])
recommendations
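The same flattening can be written as a small function that returns one row per recommended symbol. This is a sketch under the assumption that the per-symbol payload is a dict with a `recommendedSymbols` list, as the loop above implies; `flatten_recommendations` is a hypothetical name and the payload is made up.

```python
def flatten_recommendations(symbol, payload):
    """Turn one symbol's recommendations payload into a list of row dicts."""
    if not isinstance(payload, dict):
        # Error responses arrive as strings; they produce no rows.
        return []
    return [
        {
            "symbol_id": symbol,
            "simillar_symbol_id": rec["symbol"],  # column name kept as in the notebook
            "score": rec["score"],
        }
        for rec in payload.get("recommendedSymbols", [])
    ]

# Made-up payload for illustration only.
payload = {
    "symbol": "001360.KS",
    "recommendedSymbols": [
        {"symbol": "AAA.KS", "score": 0.12},
        {"symbol": "BBB.KS", "score": 0.08},
    ],
}
rows = flatten_recommendations("001360.KS", payload)
print(rows)
```

A list of row dicts like this can be passed directly to `pd.DataFrame(rows)`.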
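Most of the information above is fetched through Ticker.
To avoid duplicate calls, the code was reworked to process everything in a single pass.
When an error occurs, the value arrives as a string, so a check for str was added.
The count variable has no functional meaning (it is used only to check progress and to cancel midway).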
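The two ideas above (fetch once per symbol, and treat a string response as an error) can be sketched independently of the network. `collect_once` and `fake_fetch` are hypothetical names introduced for illustration; in the notebook the fetch would be a Ticker property access.

```python
def collect_once(symbols, fetch):
    """Call fetch(symbol) exactly once per symbol and separate good from bad results."""
    collected, failed = {}, []
    for symbol in symbols:
        response = fetch(symbol)       # single call; reuse `response` afterwards
        if isinstance(response, str):  # error messages arrive as strings
            failed.append(symbol)
            continue
        collected[symbol] = response
    return collected, failed

# Stand-in for a remote call that records how often it is invoked.
calls = []

def fake_fetch(symbol):
    calls.append(symbol)
    return "No data found" if symbol == "BAD" else {"country": "South Korea"}

collected, failed = collect_once(["AAA", "BAD", "BBB"], fake_fetch)
print(collected, failed, calls)
```

Storing the response in a local variable is what prevents the repeated lookups; the rest of the loop only reads from that variable.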
Ticker data collection
asset_profile = {
"symbol_id" : list(),
"country" : list(),
"industry" : list(),
"sector" : list(),
"phone" : list(),
"website" : list(),
}
long_business_summary = {
"symbol_id" : list(),
"summary" : list(),
}
recommendations = {
"symbol_id" : list(),
"simillar_symbol_id" : list(),
"score" : list()
}
grading_history = pd.DataFrame()
corporate_events = pd.DataFrame()
financial_data = pd.DataFrame()
non_data_symbols = list()
count = 0;
for idx, row in symbol_id.iterrows():
    symbol = row["symbol"]
    ticker = yq.Ticker(symbol, backoff_factor=1)
    print(symbol)
    #### Ticker - Asset Profile
    profile = ticker.asset_profile[symbol]  # read once and reuse
    if isinstance(profile, dict):
        asset_profile["symbol_id"].append(symbol)
        for key in ("country", "industry", "sector", "phone", "website"):
            asset_profile[key].append(profile.get(key, "None"))
        #### Ticker - Long Business Summary
        if "longBusinessSummary" in profile:
            # Store the symbol itself here, not the whole profile dict.
            long_business_summary["symbol_id"].append(symbol)
            long_business_summary["summary"].append(profile["longBusinessSummary"])
    else:
        non_data_symbols.append(symbol)
    #### Ticker - Grading History
    grading = ticker.grading_history
    if isinstance(grading, pd.DataFrame):
        grading_history = pd.concat([grading_history, grading])
    ### Ticker - Corporate Events
    events = ticker.corporate_events
    if isinstance(events, pd.DataFrame):
        corporate_events = pd.concat([corporate_events, events])
    ### Ticker - recommendations
    recommended = ticker.recommendations[symbol]
    if isinstance(recommended, dict):
        for recommend in recommended["recommendedSymbols"]:
            recommendations["symbol_id"].append(symbol)
            recommendations["simillar_symbol_id"].append(recommend["symbol"])
            recommendations["score"].append(recommend["score"])
    # Ticker - Financial Data
    all_data = ticker.all_financial_data()  # call once and reuse the result
    if isinstance(all_data, pd.DataFrame):
        financial_data = pd.concat([financial_data, all_data])
    count += 1
df_asset_profile = pd.DataFrame.from_dict(asset_profile)
df_long_business_summary = pd.DataFrame.from_dict(long_business_summary)
df_recommendations = pd.DataFrame.from_dict(recommendations)
df_asset_profile.to_csv("C:/Users/SSAFY/Desktop/Project2/API_SCRIPT/asset_profile.csv")
df_long_business_summary.to_csv("C:/Users/SSAFY/Desktop/Project2/API_SCRIPT/long_business_summary.csv")
df_recommendations.to_csv("C:/Users/SSAFY/Desktop/Project2/API_SCRIPT/recommendations.csv")
grading_history.to_csv("C:/Users/SSAFY/Desktop/Project2/API_SCRIPT/grading_history.csv")
corporate_events.to_csv("C:/Users/SSAFY/Desktop/Project2/API_SCRIPT/corporate.csv")
financial_data.to_csv("C:/Users/SSAFY/Desktop/Project2/API_SCRIPT/financial_data.csv")