9.2. Visualization exercise using Yahoo! FINANCE data

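  • From here on, we visualize correlations using stock price data from the US Yahoo! FINANCE site.

  • We use stock price data for the tech_stock companies and for Japanese companies over the most recent year up to today.

  • We use yfinance, a Python package for collecting data from Yahoo! FINANCE.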

Warning

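The yfinance package used here is an open-source tool that relies on APIs published by Yahoo and is intended for research and educational use. For details on your rights to use the data you actually download, refer to Yahoo's terms of use (Yahoo Developer API Terms of Use; Yahoo Terms of Service; Yahoo Terms).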

  • To try companies other than those introduced here, look up the company's ticker symbol on the Yahoo! FINANCE page and substitute it.

Warning

Install the required packages. If you have already installed them using requirements.txt, proceed directly to the next cell.

If not, install the packages by following this page. Alternatively, try the steps below.

  1. On Windows, launch Anaconda Prompt from the Windows menu; on a Mac, launch Terminal.

  2. Type conda install git and press Enter. After a moment, Proceed ([y]/n)? is displayed; type y and press Enter to continue.

  3. Type pip install git+https://github.com/pydata/pandas-datareader and press Enter to install pandas-datareader.

  4. In the terminal, type pip install yfinance --upgrade --no-cache-dir and press Enter to install yfinance.

from datetime import datetime
import os
import pandas as pd
from pandas_datareader import data as pdr
import matplotlib.pyplot as plt
import numpy as np
import yfinance as yf
end = datetime.now()
start =  datetime(end.year-1, end.month, end.day)

yf.pdr_override()

tech_stock = ['GOOG', 'AAPL', 'META', 'AMZN', 'NFLX', 'TSLA'] 

for company in tech_stock:
    globals()[company] = pdr.get_data_yahoo(tickers=company, start=start, end=end)
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed

[*********************100%%**********************]  1 of 1 completed

[*********************100%%**********************]  1 of 1 completed

[*********************100%%**********************]  1 of 1 completed

[*********************100%%**********************]  1 of 1 completed
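The start date above is built with datetime(end.year-1, end.month, end.day), which raises ValueError if the cell is run on February 29, because the previous year has no such day. The following is a minimal sketch of a leap-day-safe alternative. The helper name one_year_before is hypothetical and not part of the original notebook, and unlike the original expression it keeps the time of day.

```python
from datetime import datetime


def one_year_before(end):
    """Return the datetime one calendar year before `end` (hypothetical helper)."""
    try:
        return end.replace(year=end.year - 1)
    except ValueError:
        # In practice this happens only for Feb 29: fall back to Feb 28.
        return end.replace(year=end.year - 1, day=28)


end = datetime.now()
start = one_year_before(end)
print(start, end)
```

The resulting start and end can be passed to the download call in the same way as before.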

Open: opening price
High: daily high
Low: daily low
Close: closing price
Volume: trading volume (number of shares traded during the day)
Adj Close: adjusted closing price

GOOG.describe()
Open High Low Close Adj Close Volume
count 250.000000 250.000000 250.000000 250.000000 250.000000 2.500000e+02
mean 164.983688 166.702312 163.429972 165.065400 164.667500 1.975175e+07
std 15.350093 15.567643 15.179647 15.360177 15.432170 8.323973e+06
min 132.740005 134.020004 131.550003 132.559998 132.085388 6.809800e+06
25% 152.992496 154.889996 151.707497 153.197502 152.649017 1.456728e+07
50% 166.119995 167.770004 164.775002 166.139999 165.792549 1.752165e+07
75% 175.824005 178.022495 174.869995 176.419998 175.998856 2.166620e+07
max 198.529999 202.880005 196.690002 198.160004 198.160004 5.972800e+07
GOOG.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 250 entries, 2024-01-02 to 2024-12-27
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Open       250 non-null    float64
 1   High       250 non-null    float64
 2   Low        250 non-null    float64
 3   Close      250 non-null    float64
 4   Adj Close  250 non-null    float64
 5   Volume     250 non-null    int64  
dtypes: float64(5), int64(1)
memory usage: 13.7 KB
# Plot Google's adjusted closing price
GOOG['Adj Close'].plot(legend=True, figsize=(10,4))
plt.title("Google Adjusted Closing Price", fontsize=15)
plt.ylabel('price (USD)')
plt.grid()
plt.show()
../../_images/e9840ced40de792239338faf23cbf21c8482a0bae83a1a63595740c28f22b4c7.png

9.2.1. Moving Average

A moving average is the mean of a time series over a fixed-length window, recomputed as the window slides forward one step at a time.

ma_day = [10, 20, 30]  # create new columns (MA_10, MA_20, MA_30) holding the 10-, 20-, and 30-day moving averages
for ma in ma_day:
    for company in [GOOG, AAPL, META, NFLX, AMZN]:
        company['MA_{}'.format(ma)] = company['Adj Close'].rolling(ma).mean()  # rolling(n).mean() gives the n-day moving average
AAPL.head(3)
Open High Low Close Adj Close Volume MA_10 MA_20 MA_30
Date
2024-01-02 187.149994 188.440002 183.889999 185.639999 184.734985 82488700 NaN NaN NaN
2024-01-03 184.220001 185.880005 183.429993 184.250000 183.351761 58414500 NaN NaN NaN
2024-01-04 182.149994 183.089996 180.880005 181.910004 181.023178 71983600 NaN NaN NaN
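The NaN values in the first rows of MA_10, MA_20, and MA_30 are expected: rolling(n).mean() returns a value only once n observations are available. The small sketch below uses made-up prices (illustrative numbers, not market data) and a 3-day window to show this. The min_periods=1 variant is an optional pandas parameter that the original code does not use; it is shown only for comparison.

```python
import pandas as pd

# Made-up prices for illustration only.
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# 3-day moving average: the first two entries are NaN because
# fewer than 3 observations are available at those positions.
ma3 = prices.rolling(3).mean()
print(ma3.tolist())          # [nan, nan, 11.0, 12.0, 13.0]

# With min_periods=1 the average uses whatever data is available so far.
ma3_partial = prices.rolling(3, min_periods=1).mean()
print(ma3_partial.tolist())  # [10.0, 10.5, 11.0, 12.0, 13.0]
```

With the default setting, an n-day moving average over the one-year data starts n-1 trading days after the first date.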
AAPL[['Adj Close','MA_10', 'MA_20','MA_30']].plot(subplots=False, figsize=(10,5))
plt.title('Moving Average (10 days, 20 days, 30 days windows)')
plt.ylabel('price (USD)')
plt.grid()
plt.show()
../../_images/1c7e4cba824fb0a449e632d3e291aba0ddcd3a29ad30cc284d45a04c82bb5af2.png

9.2.2. Reference: percentage change in stock price from the previous day

for company in [GOOG, AAPL, META, AMZN, NFLX,TSLA]:
    company['returns'] = company['Adj Close'].pct_change()
colors = ['orange','black','blue','red','yellow','green']
i=0
plt.figure(figsize=(8,5))
for company in [GOOG, AAPL, META, AMZN, NFLX, TSLA]:
    plt.hist(company['returns'].dropna(),bins=100,color=colors[i],alpha = 0.2, label=tech_stock[i])
    i += 1
plt.legend()
plt.xlabel('Percentage change')
plt.ylabel('Frequency')
plt.grid(axis='x')
plt.show()
../../_images/9bd71bfc5ea24c181970253927f6e4c09884890eb7649362cf72a5f7bb8ac598.png

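Plotting the six companies' daily changes as histograms, as above, shows that for all six the change from the previous day stays within ??% on most days.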
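One way to put a number on "most days" is to compute the share of days whose absolute change stays below a chosen threshold. The sketch below does this on made-up prices. The threshold value and the variable names are assumptions for illustration, not values from the original analysis. Note that pct_change() returns fractions, so 0.02 means 2%.

```python
import pandas as pd

# Made-up prices for illustration only.
prices = pd.Series([100.0, 102.0, 99.96, 104.958])

# Fractional change from the previous row; the first entry is NaN and is dropped.
returns = prices.pct_change().dropna()

# Hypothetical threshold: 3% expressed as a fraction.
threshold = 0.03

# Share of days whose absolute change is within the threshold.
share_within = (returns.abs() <= threshold).mean()
print(returns.round(4).tolist())
print(share_within)
```

Applied to a column such as GOOG['returns'], the same expression gives the fraction of trading days within the chosen band.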

Create a DataFrame holding the adjusted closing prices of the six companies.

tech_stock_close = pd.DataFrame({'GOOG':GOOG['Adj Close'],
                           'AAPL':AAPL['Adj Close'],
                           'META': META['Adj Close'],
                           'AMZN': AMZN['Adj Close'],
                           'NFLX': NFLX['Adj Close'],
                           'TSLA': TSLA['Adj Close']})
tech_stock_close.describe()
GOOG AAPL META AMZN NFLX TSLA
count 250.000000 250.000000 250.000000 250.000000 250.000000 250.000000
mean 164.667500 206.413754 507.638002 184.342960 669.705240 229.174880
std 15.432170 25.505410 62.309650 17.199507 108.823840 69.406924
min 132.085388 164.405121 343.159119 144.570007 468.500000 142.050003
25% 152.649017 183.385338 474.307808 175.360004 608.745010 179.995003
50% 165.792549 213.687302 503.563019 183.260002 647.629974 210.630005
75% 175.998856 227.032825 562.201508 189.395000 706.632492 248.372498
max 198.160004 259.019989 632.170044 232.929993 936.559998 479.859985
tech_stock_close.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 250 entries, 2024-01-02 to 2024-12-27
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   GOOG    250 non-null    float64
 1   AAPL    250 non-null    float64
 2   META    250 non-null    float64
 3   AMZN    250 non-null    float64
 4   NFLX    250 non-null    float64
 5   TSLA    250 non-null    float64
dtypes: float64(6)
memory usage: 13.7 KB

9.2.2.1. Google and Apple

Compute the correlation coefficient between Google's and Apple's adjusted closing prices over the past year.

tech_stock_close['GOOG'].corr(tech_stock_close['AAPL'])
0.6755733160023757
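Series.corr() computes the Pearson correlation coefficient by default: the covariance of the two series divided by the product of their standard deviations. As a rough sketch, the same value can be reproduced by hand with NumPy on made-up data. The function name pearson and the sample values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
y = pd.Series([2.0, 4.0, 5.0, 4.0, 5.0])


def pearson(a, b):
    """Pearson correlation computed from deviations around the means."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da = a - a.mean()
    db = b - b.mean()
    return float((da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum()))


r_manual = pearson(x, y)
r_pandas = x.corr(y)  # pandas default method is 'pearson'
print(r_manual, r_pandas)
```

The two values agree up to floating-point rounding. The coefficient always lies between -1 and 1.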

Show the correlation matrix of Google's and Apple's adjusted closing prices over the past year.

tech_stock_close[['GOOG','AAPL']].corr()
GOOG AAPL
GOOG 1.000000 0.675573
AAPL 0.675573 1.000000

Show a scatter plot of Google's and Apple's adjusted closing prices over the past year.

plt.figure(figsize=(5,5))
plt.scatter(tech_stock_close['GOOG'],tech_stock_close['AAPL'],color ='y',alpha=0.5)
plt.xlabel('Google')
plt.ylabel('Apple')
plt.title("Closing prices of Google and Apple")
plt.show()
../../_images/c773596588f674a32af0c3099f45b44b1feb1ea036a2fce0952c950d2be43c37.png

9.2.2.2. Showing histograms and a scatter plot in a single figure

import seaborn as sns

sns.jointplot(data=tech_stock_close, x='GOOG', y='TSLA')
plt.show()
../../_images/36540cb17b68a55b5c35d769be8f7532ba82b151320d8f2b90744ff0317da197.png
sns.pairplot(tech_stock_close, plot_kws=dict(color = 'b', edgecolor='b', alpha = 0.2))
plt.show()
../../_images/38649b524be7f0c18ed1b083a1dfe6fdb5f1a1c845af16b166ff829ad8243fda.png

Show the correlation matrix for the six companies.

tech_stock_close.corr()
GOOG AAPL META AMZN NFLX TSLA
GOOG 1.000000 0.675573 0.545528 0.762449 0.661126 0.542634
AAPL 0.675573 1.000000 0.699146 0.642521 0.795830 0.765408
META 0.545528 0.699146 1.000000 0.823320 0.888165 0.606519
AMZN 0.762449 0.642521 0.823320 1.000000 0.897526 0.750186
NFLX 0.661126 0.795830 0.888165 0.897526 1.000000 0.825915
TSLA 0.542634 0.765408 0.606519 0.750186 0.825915 1.000000

9.2.2.2.1. Reference

Next, we also collect the past year's stock prices for the Japanese automakers Toyota Motor Corporation and Honda Motor Co., Ltd.

end = datetime.now()
start =  datetime(end.year-1, end.month, end.day)

yf.pdr_override()

vehicles = ['TM', 'HMC'] # TM : Toyota Motor Corporation, HMC: Honda Motor Co., Ltd.
for company in vehicles:
    globals()[company] = pdr.get_data_yahoo(tickers=company, start=start, end=end) 
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed

print(TM.info(), HMC.info())
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 250 entries, 2024-01-02 to 2024-12-27
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Open       250 non-null    float64
 1   High       250 non-null    float64
 2   Low        250 non-null    float64
 3   Close      250 non-null    float64
 4   Adj Close  250 non-null    float64
 5   Volume     250 non-null    int64  
dtypes: float64(5), int64(1)
memory usage: 13.7 KB
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 250 entries, 2024-01-02 to 2024-12-27
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Open       250 non-null    float64
 1   High       250 non-null    float64
 2   Low        250 non-null    float64
 3   Close      250 non-null    float64
 4   Adj Close  250 non-null    float64
 5   Volume     250 non-null    int64  
dtypes: float64(5), int64(1)
memory usage: 13.7 KB
None None
tm_hmc = pd.DataFrame({'TM':TM['Adj Close'],'HMC':HMC['Adj Close']})
tech_stock_th = pd.concat([tech_stock_close, tm_hmc], axis=1)
tech_stock_th.sample(4)
GOOG AAPL META AMZN NFLX TSLA TM HMC
Date
2024-04-22 157.384491 165.242081 480.405975 177.229996 554.599976 142.050003 230.300003 34.549999
2024-09-09 149.370529 220.667221 503.902435 175.399994 675.419983 216.270004 176.080002 31.530001
2024-02-26 138.253235 180.506866 480.415955 174.729996 587.650024 199.399994 238.130005 35.660000
2024-05-13 170.288132 185.860138 466.723724 186.570007 616.590027 171.889999 215.639999 33.790001
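pd.concat(..., axis=1) aligns the frames on their Date index. Here the info() outputs above show the same 250 trading days for every ticker, so no gaps appear. The sketch below uses made-up values to show what would happen if the calendars differed, for example with tickers from another exchange (an assumption that goes beyond the original data): dates present in only one frame get NaN in the other frame's columns.

```python
import pandas as pd

# Two made-up frames whose date indexes only partly overlap.
a = pd.DataFrame({'A': [1.0, 2.0]},
                 index=pd.to_datetime(['2024-01-02', '2024-01-03']))
b = pd.DataFrame({'B': [10.0, 20.0]},
                 index=pd.to_datetime(['2024-01-03', '2024-01-04']))

# Column-wise concatenation keeps the union of dates (outer join by default).
joined = pd.concat([a, b], axis=1)
print(joined)

# Keeping only dates present in both frames.
common = pd.concat([a, b], axis=1, join='inner')
print(common)
```

DataFrame.corr() excludes missing values pair by pair, so such gaps would not raise an error, but each coefficient would then be based only on the dates shared by that pair.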
# Save the data
os.makedirs('./data', exist_ok=True)
tech_stock_close.to_csv('./data/tech_stock_close.csv')
tech_stock_close.to_pickle('./data/tech_stock_close.pkl')

Show the correlation matrix for the six tech_stock companies and the two automakers.

tech_stock_th.corr()
GOOG AAPL META AMZN NFLX TSLA TM HMC
GOOG 1.000000 0.675573 0.545528 0.762449 0.661126 0.542634 -0.413725 -0.583584
AAPL 0.675573 1.000000 0.699146 0.642521 0.795830 0.765408 -0.838156 -0.801696
META 0.545528 0.699146 1.000000 0.823320 0.888165 0.606519 -0.429586 -0.514580
AMZN 0.762449 0.642521 0.823320 1.000000 0.897526 0.750186 -0.293303 -0.603807
NFLX 0.661126 0.795830 0.888165 0.897526 1.000000 0.825915 -0.550100 -0.759890
TSLA 0.542634 0.765408 0.606519 0.750186 0.825915 1.000000 -0.593509 -0.846852
TM -0.413725 -0.838156 -0.429586 -0.293303 -0.550100 -0.593509 1.000000 0.827525
HMC -0.583584 -0.801696 -0.514580 -0.603807 -0.759890 -0.846852 0.827525 1.000000

9.2.2.3. Reference: showing correlations with a heatmap

When there are many variables, a heatmap makes the correlations easier to grasp visually.

def CorrMtx(df, dropDuplicates = True):

    if dropDuplicates:    
        mask = np.zeros_like(df, dtype=bool)
        mask[np.triu_indices_from(mask,1)] = True

    sns.set_style(style = 'white')

    fig, ax = plt.subplots(figsize=(7, 7))

    cmap = sns.diverging_palette(250, 10, as_cmap=True)

    if dropDuplicates:
        sns.heatmap(df, mask=mask, vmin=-1, vmax=1,annot=True,cmap=cmap)
    else:
        sns.heatmap(df, vmin=-1, vmax=1,annot=True,cmap=cmap)


CorrMtx(tech_stock_th.corr(), dropDuplicates = True)
../../_images/e415dbc5b999c695a1cdbb7655e60d702f5150968212f2a3ae373b727c06ab75.png
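A correlation matrix is symmetric, so the CorrMtx function above hides the upper triangle by passing a boolean mask to sns.heatmap: cells where the mask is True are not drawn. The following sketch builds the same kind of mask for a small made-up 3×3 matrix, so the effect of np.triu_indices_from(mask, 1) can be seen directly.

```python
import numpy as np

# Made-up symmetric correlation matrix for illustration only.
corr = np.array([[1.0, 0.5, -0.2],
                 [0.5, 1.0, 0.3],
                 [-0.2, 0.3, 1.0]])

# Start with an all-False mask of the same shape.
mask = np.zeros_like(corr, dtype=bool)

# Mark the cells strictly above the diagonal (offset k=1) as True.
mask[np.triu_indices_from(mask, 1)] = True
print(mask)
# [[False  True  True]
#  [False False  True]
#  [False False False]]
```

The diagonal and the lower triangle stay visible, so each pair of variables appears exactly once in the plot.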

Show a scatter plot of Apple and Toyota.

sns.jointplot(x='AAPL', y='TM', data=tech_stock_th, color="purple", alpha = 0.5)
plt.show()
../../_images/86479f76df256e69c282c5449ec69e9ab91feacb4abd6a20916282153275ebc9.png

Show a scatter plot of Honda and Toyota.

sns.jointplot(x='HMC', y='TM', data=tech_stock_th, color="orange", alpha = 0.5)
plt.show()
../../_images/e56bd425ce90ddb4b7e3d45b1e4250ac4d6b1fe8b24fcae0bf4d070527dabdff.png
sns.pairplot(tech_stock_th, plot_kws=dict(color = 'blue', edgecolor='b', alpha = 0.2))
plt.show()
../../_images/583990e00dff12f986f04a1f2b9a26f015158f37c10d8f1fb1728c78ffe59435.png

9.2.2.4. Reference: drawing a candlestick chart

Note

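The cells below use the cufflinks package for visualization. If you have already installed the required packages using requirements.txt, proceed to the next cell. For installation instructions, see the page on required packages.

If not, run pip install cufflinks in a terminal to install cufflinks, then run the next cell.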

import cufflinks as cf
cf.set_config_file(offline=True)

qf = cf.QuantFig(AAPL, legend='top', title = 'Apple Candle Chart')
qf.iplot()
qf = cf.QuantFig(AAPL, legend='top', title = 'Apple Candle Chart')
qf.add_volume()  # also plot the trading volume
qf.add_sma([10,50],width=2, color=['red', 'green'])  # also plot the moving-average lines
qf.iplot()