{ "cells": [ { "cell_type": "markdown", "metadata": { "pycharm": {} }, "source": [ "# Obtaining Historical Stock Data\n", "\n", "Keywords: stock price data\n", "\n", "The purpose of this notebook is to download historical trading data for a selected group of the stocks from Alpha Vantage for use with other notebooks. Use of this notebook requires you so enter your personal Alpha Vantage api key into a file `data/api_key.txt`. The trading data is stored as individual `.csv` files in a designated directory. Subsequent notebooks read and consolidate that data into a singe file. \n", "\n", "You only need to run this notebook if you wish to analyze a different set of stocks, if you wish to update data for the existing set." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import pandas as pd\n", "import requests\n", "import time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Select Stocks to Download" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "pycharm": {} }, "outputs": [], "source": [ "djia = ['AXP','BA','CAT','CSCO','CVX','DD','DIS','GE', \\\n", " 'GS','HD','IBM','INTC','JNJ','JPM','KO','MCD', \\\n", " 'MMM','MRK','MSFT','NKE','PFE','PG','T','TRV', \\\n", " 'UNH','UTX','V','VZ','WMT','XOM']\n", "\n", "favs = ['AAPL']\n", "\n", "stocks = favs + djia\n", "\n", "data_dir = os.path.join('data', 'stocks')\n", "os.makedirs(data_dir, exist_ok=True)" ] }, { "cell_type": "markdown", "metadata": { "pycharm": {} }, "source": [ "## Alpha Vantage\n", "\n", "The following cells retrieve a history of daily trading data for a specified set of stock ticker symbols. These functions use the free [Alpha Vantage](https://www.alphavantage.co/) data service. The free service tier provides up to 5 queries per minute.\n", "\n", "The service requires an personal api key which can be claimed [here](https://www.alphavantage.co/support/#api-key) in just a few seconds. Place the key as a string in a file `data/api_key.txt` in the data directory as this notebook (note: api_key.txt is not distributed with the github repository). The function `api_key()` returns the key stored in `api_key.txt`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "pycharm": {} }, "outputs": [], "source": [ "def api_key():\n", " \"Read api_key.txt and return api_key\"\n", " try:\n", " with open('data/api_key.txt') as fp:\n", " line = fp.readline()\n", " except:\n", " raise RuntimeError('Error while attempting to read data/api_key.txt')\n", " return line.strip()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": {} }, "source": [ "The function `alphavantage(s)` returns a pandas dataframe holding historical trading data for a stocker ticker symbol specified by `s`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "pycharm": {} }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
openhighlowcloseadjusted closevolumedividend amountsplit coefficient
2022-11-07137.110139.145135.671138.92138.920000833746280.001.0
2022-11-04142.090142.670134.380138.38138.3800001408147960.231.0
2022-11-03142.060142.800138.750138.88138.649552979185160.001.0
2022-11-02148.945152.170145.000145.03144.789347936046230.001.0
2022-11-01155.080155.450149.130150.65150.400022803793450.001.0
\n", "
" ], "text/plain": [ " open high low close adjusted close volume \\\n", "2022-11-07 137.110 139.145 135.671 138.92 138.920000 83374628 \n", "2022-11-04 142.090 142.670 134.380 138.38 138.380000 140814796 \n", "2022-11-03 142.060 142.800 138.750 138.88 138.649552 97918516 \n", "2022-11-02 148.945 152.170 145.000 145.03 144.789347 93604623 \n", "2022-11-01 155.080 155.450 149.130 150.65 150.400022 80379345 \n", "\n", " dividend amount split coefficient \n", "2022-11-07 0.00 1.0 \n", "2022-11-04 0.23 1.0 \n", "2022-11-03 0.00 1.0 \n", "2022-11-02 0.00 1.0 \n", "2022-11-01 0.00 1.0 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import os\n", "import requests\n", "import pandas as pd\n", "\n", "def alphavantage(symbol=None):\n", " if symbol is None: \n", " raise ValueError(\"No symbol has been provided\")\n", " payload = {\n", " \"function\": \"TIME_SERIES_DAILY_ADJUSTED\",\n", " \"symbol\": symbol,\n", " \"outputsize\": \"full\",\n", " \"datatype\": \"json\",\n", " \"apikey\": api_key(), \n", " }\n", " api_url = \"https://www.alphavantage.co/query\"\n", " try:\n", " response = requests.get(api_url, params=payload)\n", " except:\n", " raise ValueError(\"No response using api key: \" + api_key)\n", " data = response.json()\n", " k = list(data.keys())\n", " metadata = data[k[0]]\n", " timeseries = data[k[1]]\n", " S = pd.DataFrame.from_dict(timeseries).T\n", " S = S.apply(pd.to_numeric)\n", " S.columns = [h.lstrip('12345678. ') for h in S.columns]\n", " return S\n", "\n", "alphavantage('AAPL').head()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": {} }, "source": [ "`get_stock_data(symbols)` retrieves trading data for a list of symbols and stores each in seperate file in the data directory. The file name is the ticker symbol with a `.csv` suffix." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "pycharm": {} }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "downloading AAPL success\n" ] } ], "source": [ "def get_stock_data(symbols, service=alphavantage):\n", " if isinstance(symbols, str):\n", " symbols = [symbols]\n", " assert all(isinstance(s, str) for s in symbols)\n", " for s in symbols:\n", " print('downloading', s, end='')\n", " k = 5\n", " while k > 0:\n", " try:\n", " k -= 1\n", " S = service(s)\n", " S.to_csv(os.path.join(data_dir, s + '.csv'))\n", " print(' success')\n", " break\n", " except:\n", " print('.', end='')\n", " time.sleep(12)\n", " if k < 0: print('fail')\n", " \n", "get_stock_data(['AAPL'])" ] }, { "cell_type": "markdown", "metadata": { "pycharm": {} }, "source": [ "## Download selected ticker Symbols" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "pycharm": {} }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "downloading AAPL success\n", "downloading AXP success\n", "downloading BA success\n", "downloading CAT.. success\n", "downloading CSCO success\n", "downloading CVX success\n", "downloading DD success\n", "downloading DIS success\n", "downloading GE.... success\n", "downloading GS success\n", "downloading HD success\n", "downloading IBM success\n", "downloading INTC success\n", "downloading JNJ.....downloading JPM success\n", "downloading KO success\n", "downloading MCD success\n", "downloading MMM success\n", "downloading MRK success\n", "downloading MSFT.... success\n", "downloading NKE success\n", "downloading PFE success\n", "downloading PG success\n", "downloading T success\n", "downloading TRV.....downloading UNH success\n", "downloading UTX success\n", "downloading V success\n", "downloading VZ success\n", "downloading WMT success\n", "downloading XOM.... success\n" ] } ], "source": [ "get_stock_data(stocks)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": {} }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 4 }