Predicting Stock Prices and Corporate Financials Ratios with Facebook Prophet Using Python

I started off wanting to analyze the sustainability of mortgage buydowns by home builders. But as it happens, my ADHD had other plans. Three weeks later, here we are with code that looks for trends and anomalies in corporate financials. The code generates detailed HTML profiles for in-depth financial analysis of stocks.

In a series of previous posts, I’ve explained the extraction of corporate financial data and stock price data. Now, let’s go a step further by utilizing Plotly and Facebook’s Prophet library to predict these financial metrics and stock prices. It’s crucial to note that the goal here isn’t to necessarily predict future values but to model the data in such a way that we can quickly spot any anomalies. In this tutorial, we’ll be extracting data from HDF5 files, which were created in previous posts, so make sure to check those out first.

Generated HTML Output: A Preview of What the Code Produces

The code generates an in-depth profile of any stock specified. In this example, we take a look at Google. The output is a structured HTML file that includes a variety of information sections. First, it provides basic details such as the company’s name, ticker symbol, and the date of the most recent update. Next, it lists essential contact information, including the company’s web URL, phone number, and physical address. The sector and industry in which the company operates, as well as the number of full-time employees, are also displayed. The output then delves into valuation metrics like P/E ratios, enterprise value, and other financial ratios. Finally, it highlights key financial data points, such as market capitalization, EBITDA, and various earnings and revenue estimates. Overall, the code produces a multi-faceted profile that serves as a valuable resource for anyone looking to understand Alphabet Inc’s business and financial standing in great detail.

Forecasting Future Adjusted Close Prices with Facebook Prophet: A One-Year Outlook

This plot visualizes the historical and predicted adjusted close prices for a particular asset, utilizing the Facebook Prophet algorithm for the forecast. The x-axis represents the timeline, extending from the earliest available historical data to one year into the future. The y-axis shows the adjusted close prices. Historical prices are plotted as solid purple lines. The predicted value is a solid green line with blue dotted lines as the high and low predictions. This predictive model leverages the power of Facebook Prophet to analyze seasonal and trend components in the historical data, providing a one-year outlook on potential price movements.

Forecasting Future Financial Ratios with Facebook Prophet: A One-Year Outlook

Financial ratios are key indicators of a company’s financial health and performance. They offer crucial insights into various aspects like profitability, liquidity, and valuation. In this section, we’ll leverage the predictive power of Facebook Prophet to forecast these ratios over the next year.

Forecasting Future Earnings and Profitability Metrics from the Income Statement with Facebook Prophet: A One-Year Outlook

These plots present a comprehensive visualization of key earnings and profitability metrics extracted from the income statement, including Net Income, Operating Income, Total Revenue, Gross Profit, and Free Cash Flow. The financial numbers are then forward projected using Facebook Prophet. These predictions not only display the expected trends for the next year but also encompass upper and lower bounds, providing a range of possible outcomes. This integration of historical data with advanced forecasting techniques offers a nuanced understanding of the company’s financial trajectory.

Forecasting Future Balance Sheet Metrics with Facebook Prophet: A One-Year Outlook

These plots offer an in-depth analysis of crucial balance sheet metrics such as Total Assets, Total Stockholder Equity, Retained Earnings, Long-Term Debt, and Total Liabilities. Leveraging the predictive power of Facebook Prophet, these key financial indicators are forecasted into the future. The projections not only illustrate the expected financial posture for the forthcoming year but also include upper and lower prediction intervals, giving a full spectrum of potential financial scenarios. By melding past balance sheet data with sophisticated predictive modeling, the plots provide a multi-dimensional view of the company’s expected financial stability and risk profile.

Forecasting Future Cash Flow Metrics with Facebook Prophet: A One-Year Outlook

These plots dive into essential cash flow metrics, specifically Total Cash from Operating Activities, Capital Expenditures, and Dividends Paid. These key figures are also extended into the future using Facebook Prophet’s forecasting capabilities. The resulting predictions not only map out the anticipated cash flow movements for the next year but are also bracketed by upper and lower confidence intervals, presenting a comprehensive range of financial possibilities. These plots offer a well-rounded perspective on the company’s future liquidity and capital allocation strategies.

Other Financial Metrics

In addition to the key financial metrics, the analysis delves into a diverse set of other metrics, plotting each meticulously. These metrics are categorized as follows:

Efficiency and Activity Metrics from Income Statement:

  • Research Development
  • Selling General Administrative


Stockholder’s Equity and Capital Structure:

  • Common Stock Shares Outstanding


Additional Important Metrics:

  • Net Debt
  • Net Receivables
  • Inventory
  • Accounts Payable
  • Total Current Assets
  • Total Current Liabilities

Less Critical Metrics:

  • Income Before Tax
  • Cost of Revenue
  • Intangible Assets
  • Earning Assets
  • Other Current Assets
  • Deferred Long-Term Liabilities
  • Other Current Liabilities
  • Common Stock
  • Capital Stock
  • Other Liabilities
  • Goodwill
  • Other Assets
  • Cash
  • Cash and Equivalents
  • Current Deferred Revenue
  • Short-Term Debt
  • Short/Long-Term Debt
  • Short/Long-Term Debt Total
  • Other Stockholder Equity
  • Property Plant Equipment
  • Long-Term Investments
  • Net Tangible Assets
  • Short-Term Investments

Other Metrics:

  • Effect of Accounting Charges
  • Income Tax Expense
  • Non-Operating Income Net Other
  • Selling and Marketing Expenses
  • Common Stock Total Equity
  • Preferred Stock Total Equity
  • Retained Earnings Total Equity
  • Treasury Stock
  • Accumulated Amortization
  • Non-Current Assets Other
  • Deferred Long-Term Asset Charges
  • Non-Current Assets Total
  • Capital Lease Obligations
  • Long-Term Debt Total
  • Non-Current Liabilities Other
  • Non-Current Liabilities Total
  • Negative Goodwill
  • Warrants
  • Preferred Stock Redeemable
  • Capital Surpluse
  • Liabilities And Stockholders Equity
  • Cash And Short-Term Investments
  • Property Plant And Equipment Gross
  • Property Plant And Equipment Net
  • Accumulated Depreciation
  • Total Cash Flows From Investing Activities
  • Total Cash From Financing Activities
  • Net Borrowings
  • Issuance Of Capital Stock
  • Investments
  • Change To Liabilities
  • Change To Operating Activities
  • Change In Cash
  • Begin Period Cash Flow
  • End Period Cash Flow
  • Depreciation
  • Other Cash Flows From Investing Activities
  • Change To Inventory
  • Change To Account Receivables
  • Sale Purchase Of Stock
  • Other Cash Flows From Financing Activities
  • Change To Net Income
  • Change Receivables
  • Cash Flows Other Operating
  • Exchange Rate Changes
  • Cash And Cash Equivalents Changes
  • Change In Working Capital
  • Stock Based Compensation
  • Other Non-Cash Items

Code Overview

Prerequisites

  • Python 3.x
  • Pandas
  • Plotly
  • Prophet
  • Logging
  • Pathlib
  • os
  • Collections
  • bs4 (BeautifulSoup)
  • time
  • scikit-learn

Helper Functions

The code includes several helper functions, such as:

  • calculate_mae: Calculates the Mean Absolute Error between the actual and predicted data.
  • access_hdf5_with_retries: Attempts to read an HDF5 file, retrying up to a specified number of times.
  • read_general_info: Reads general company information from an HDF5 file.
  • infer_dtype: Infers the data type of a Pandas Series.
  • fetch_data_for_symbol_from_multiple_h5: Fetches data for a specific symbol from multiple HDF5 files.
  • get_company_name: Gets the company name for a given symbol from an HDF5 file.
  • add_content_before_plot: Adds additional content before the plot in an HTML file.

Forecasting Function

The core of this code is the forecast_with_multiple_metrics function, which:

  1. Accepts a DataFrame of financial metrics.
  2. Performs time-series forecasting on each metric using Facebook’s Prophet.
  3. Plots the actual and forecasted metrics using Plotly.
def forecast_with_multiple_metrics(df: pd.DataFrame, periods: int = 4, save_dir: str = "plots/", use_all_data: bool = True, eod_price_data: str = 'eod_price_data.h5'):
    ...

Code

import pandas as pd
from pathlib import Path
import logging
import os
from plotly.subplots import make_subplots
from prophet import Prophet
import plotly.graph_objects as go
from collections import Counter
from pandas.api.types import is_numeric_dtype
from bs4 import BeautifulSoup
import time
from sklearn.metrics import mean_absolute_error
from math import log10


from dictionaries_and_lists import metric_definitions, reordered_columns, homebuilders, sp500, companies_with_treasuries, largest_banks, percentage_metrics

data_dir = '/home/shared/algos/data/'
plots_dir = '/home/shared/algos/eodhd_data/plots/'

logging.basicConfig(level=logging.DEBUG)
logging.basicConfig(level=logging.INFO)

# Configure Pandas to display all columns
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)

symbol_exchange_map = {}


def wrap_text(text, max_length):
    """
    Wraps text to a new line at the nearest whitespace of the max_length.
    """
    wrapped_lines = []
    while len(text) > max_length:
        # Find nearest whitespace of the max_length
        split_index = text.rfind(' ', 0, max_length + 1)
        if split_index == -1:  # No whitespace found, force split
            split_index = max_length
        wrapped_lines.append(text[:split_index])
        text = text[split_index:].lstrip()
    wrapped_lines.append(text)
    return '<br>'.join(wrapped_lines)

def format_large_number(num):
    print('Attempting to plot large number')
    if num < 1_000:
        return str(num)
    magnitude = int(log10(num) // 3)
    value = num / (10 ** (3 * magnitude))
    return f"{value:.2f}{' KMBT'[magnitude]}"

def format_percentage(data):
    try:
        # Remove any non-numeric characters like commas and percentage signs
        if isinstance(data, str):
            data = data.replace(',', '').replace('%', '').strip()
        # Convert to float and format as a percentage
        return "{:.2%}".format(float(data))
    except ValueError as e:
        print(f"ValueError: Could not convert {data} to a percentage.")
        return data  # Return the original data if it cannot be converted


def calculate_mae(actual, predicted):
    return mean_absolute_error(actual, predicted)

def access_hdf5_with_retries(hdf5_file_path, mode, max_retries=3, sleep_duration=5):
    retries = 0
    while retries < max_retries:
        try:
            with pd.HDFStore(hdf5_file_path, mode) as store:
                return store.keys()  # You can customize this part to return what you need
            break  # If successful, exit the while loop
        except Exception as e:  # Replace Exception with a more specific exception if possible
            if retries < max_retries - 1:
                logging.info(f"An exception occurred while reading {hdf5_file_path}: {e}")
                logging.info(f"Retrying in {sleep_duration} seconds...")
                time.sleep(sleep_duration)
                retries += 1
            else:
                logging.error(f"Max retries reached. Could not read {hdf5_file_path}. Exiting...")
                raise

def read_general_info(symbol, h5_general_path, max_retries=3, sleep_duration=5):
    retries = 0
    while retries < max_retries:
        try:
            with pd.HDFStore(h5_general_path, 'r') as store:
                key = f'/{symbol}'
                if key in store.keys():
                    general_info = store.get(key)
                    info_dict = dict(zip(general_info['SubCategory'], general_info['Data']))
                    sector = info_dict.get('Sector', 'N/A')
                    industry = info_dict.get('Industry', 'N/A')
                    description = info_dict.get('Description', 'N/A')
                    full_time_employees = info_dict.get('FullTimeEmployees', 'N/A')
                    updated_at = info_dict.get('UpdatedAt', 'N/A')
                    web_url = info_dict.get('WebURL', 'N/A')
                    phone = info_dict.get('Phone', 'N/A')
                    address = info_dict.get('Address', 'N/A')
                    name = info_dict.get('Name', 'N/A')
                    exchange = info_dict.get('Exchange', 'N/A')
                    return sector, industry, description, full_time_employees, updated_at, web_url, phone, address, name, exchange
                else:
                    return ['N/A']*10
            break
        except Exception as e:
            if retries < max_retries - 1:
                print(f"An exception occurred while reading {h5_general_path}: {e}")
                print(f"Retrying in {sleep_duration} seconds...")
                time.sleep(sleep_duration)
                retries += 1
            else:
                print(f"Max retries reached. Could not read {h5_general_path}. Exiting...")
                raise



def infer_dtype(series):
    sample = series.dropna().head(100)  # Sample the first 100 non-null rows

    # Debugging line
    print(f"Sample for {series.name}: {sample}")

    if sample.empty:
        return None

    if all(sample.apply(lambda x: isinstance(x, str))):
        return 'object'

    if is_numeric_dtype(sample) and all(sample.apply(lambda x: x == int(x))):
        return 'int64'

    if is_numeric_dtype(sample):
        return 'float64'
    return None


def fetch_data_for_symbol_from_multiple_h5(h5_filepaths, symbol, max_retries=3, sleep_duration=5):
    combined_data = None
    common_keys = ['Symbol', 'date', 'filing_date', 'currency_symbol']

    for h5_filepath in h5_filepaths:
        h5_path = Path(h5_filepath)
        if not h5_path.exists():
            logging.info(f"The file {h5_filepath} does not exist.")
            continue

        retries = 0
        while retries < max_retries:
            try:
                with pd.HDFStore(h5_filepath, 'r') as store:
                    if f"/{symbol}" in store.keys():
                        logging.info(f"Symbol {symbol} found in {h5_filepath}.")
                        symbol_data = store.select(symbol)

                        if 'date' not in symbol_data.columns and 'Date' not in symbol_data.columns:
                            if isinstance(symbol_data.index, pd.DatetimeIndex):
                                symbol_data.reset_index(inplace=True)
                                symbol_data.rename(columns={'index': 'date'}, inplace=True)
                                logging.info(
                                    f"'date' and 'Date' columns not found, but datetime index exists in {h5_filepath} for symbol {symbol}.")
                                symbol_data = symbol_data.reset_index().rename(columns={'index': 'date'})
                            else:
                                logging.warning(
                                    f"'date' and 'Date' columns and datetime index not found in {h5_filepath} for symbol {symbol}.")
                                break
                        else:
                            # Ensure the date column is standardized to 'date'
                            if 'Date' in symbol_data.columns:
                                symbol_data.rename(columns={'Date': 'date'}, inplace=True)


                        # Rename 'netIncome' if it's present in the dataframe
                        if 'netIncome' in symbol_data.columns:
                            # Extract the type of financial statement from the filename
                            parts = h5_path.stem.split('_')
                            statement_type = parts[1]
                            metric_name = parts[2]

                            statement_readable = {
                                "Cash": "Cash Flow from Operating Activities",
                                "Income": "netIncome",
                            }.get(statement_type, statement_type)

                            metric_title = f"{metric_name} ({statement_readable})"

                            # Ensure statement_type is one of the reports where 'netIncome' appears before renaming
                            if statement_type in ["Income", "Cash"]:
                                symbol_data.rename(columns={'netIncome': metric_title}, inplace=True)

                        if combined_data is None:
                            combined_data = symbol_data
                        else:
                            # Dynamically adjust common keys based on available columns
                            keys_for_merge = [key for key in common_keys if
                                              key in symbol_data.columns and key in combined_data.columns]
                            combined_data = pd.merge(combined_data, symbol_data, on=keys_for_merge, how='outer')
                        break

                    else:
                        logging.info(f"Symbol {symbol} not found in {h5_filepath}.")
                        break

            except Exception as e:
                if retries < max_retries - 1:
                    logging.info(f"An exception occurred while reading {h5_filepath}: {e}")
                    logging.info(f"Retrying in {sleep_duration} seconds...")
                    time.sleep(sleep_duration)
                    retries += 1
                else:
                    logging.info(f"Max retries reached. Could not read {h5_filepath}. Exiting...")
                    raise

    if combined_data is None:
        logging.info(f"Symbol {symbol} not found in any of the HDF5 files.")
        return None
    else:
        for col in combined_data.select_dtypes(include=['object']).columns:
            combined_data[col] = pd.to_numeric(combined_data[col], errors='ignore')

        if 'date' in combined_data.columns:
            combined_data.sort_values(by='date', inplace=True)
        else:
            logging.info(f"'date' column not found in combined data for symbol {symbol}. Sorting by index instead.")
            combined_data.sort_index(inplace=True)
        logging.info(f"{symbol}: successfully combined all HDF5 files.")
        return combined_data



def get_company_name(symbol, h5_file_path, country_code='US'):
    try:
        # Read the DataFrame for the entire country from the HDF5 file
        symbols_df = pd.read_hdf(h5_file_path, key=f'/{country_code}')

        # Filter by the specific symbol to get the company name
        company_name_row = symbols_df[symbols_df['Code'] == symbol]
        if not company_name_row.empty:
            return company_name_row['Name'].iloc[0]
        else:
            logging.error(f"No data for symbol {symbol} in the DataFrame.")
            return "Unknown"
    except KeyError:
        logging.error(f"No object named {country_code} in the file {h5_file_path}")
        return "Unknown"


def add_content_before_plot(symbol, max_retries=3, sleep_duration=5):
    target_filename = f"{symbol}_quarterly.html"
    html_file_path = os.path.join(plots_dir, target_filename)

    if not os.path.exists(html_file_path):
        print(f"HTML file for symbol {symbol} not found.")
        return

    # Fetch additional info from General.h5
    exchange = symbol_exchange_map.get(symbol, 'Other')  # Default to 'Other' if exchange is not found
    h5_general_path = os.path.join(data_dir, f"{exchange}_General.h5")

    sector, industry, description, full_time_employees, updated_at, web_url, phone, address, name, exchange = read_general_info(symbol, h5_general_path, max_retries, sleep_duration)

    # Prepare new HTML content
    new_html_content = f"""
    <h1>{symbol} - {name} - {exchange}</h1>
    <p>Updated At: {updated_at}</p>
    <p><strong style='font-size: larger;'>WebURL:</strong> <a href='{web_url}' target='_blank'>{web_url}</a>
    <strong style='font-size: larger;'>Phone:</strong> {phone}
    <strong style='font-size: larger;'>Address:</strong> {address}<br>
    <strong style='font-size: larger;'>Sector:</strong> {sector}
    <strong style='font-size: larger;'>Industry:</strong> {industry}<br>
    <strong style='font-size: larger;'>Full Time Employees:</strong> {full_time_employees}</p><br>
    <strong style='font-size: larger;'>Description:</strong> {description}<br><br>
    """

    # Read valuation.h5 data for the symbol
    h5_valuation_path = os.path.join(data_dir, f"{exchange}_Valuation.h5")

    retries = 0
    valuation_data = None
    while retries < max_retries:
        try:
            with pd.HDFStore(h5_valuation_path, 'r') as store:
                if symbol in store:
                    valuation_data = store.get(symbol)
                else:
                    valuation_data = pd.DataFrame(columns=['SubCategory', 'Data'])
                    print(f"The symbol {symbol} valuation data does not exist in {h5_valuation_path}")
            break
        except Exception as e:
            if retries < max_retries - 1:
                print(f"An exception occurred while reading {h5_valuation_path}: {e}")
                print(f"Retrying in {sleep_duration} seconds...")
                time.sleep(sleep_duration)
                retries += 1
            else:
                print(f"Max retries reached. Could not read {h5_valuation_path}. Exiting...")
                raise

    # Add valuation_data to new_html_content
    valuation_html = "<h2>Valuation</h2><div style='display: flex; flex-wrap: wrap;'>"
    if valuation_data is not None:
        print(f"valuation_data: {valuation_data}")
        valuation_data['Data'] = pd.to_numeric(valuation_data['Data'], errors='coerce')

        for index, row in valuation_data.iterrows():
            data = row['Data']
            if pd.notnull(data):  # Check if 'Data' is not NaN
                if index in percentage_metrics:
                    formatted_data = "{:.2%}".format(float(data))
                elif isinstance(data, (int, float)) and abs(data) >= 1_000:  # Large numbers
                    formatted_data = format_large_number(data)
                elif isinstance(data, (int, float)):
                    print('formatting numbers with commas in Valuation')
                    print(f"data: {data}, type: {type(data)}")
                    formatted_data = f"{data:,.2f}"  # For other numbers, just format with commas
                else:
                    formatted_data = data  # For non-numeric data, leave as is
            else:
                # If data is NaN or non-numeric, leave as is
                formatted_data = row['Data']

            valuation_html += f"<div style='flex: 0 0 calc(33.333% - 10px); margin-right: 10px; margin-bottom: 10px;'>"
            valuation_html += f"<strong>{row['SubCategory']}</strong>: {formatted_data}</div>"
    valuation_html += "</div>"

    new_html_content += valuation_html  # Append valuation data to new_html_content



    # Read highlights.h5 data for the symbol
    h5_highlights_path = os.path.join(data_dir, f"{exchange}_Highlights.h5")
    retries = 0
    while retries < max_retries:
        try:
            with pd.HDFStore(h5_highlights_path, 'r') as store:
                if symbol in store:
                    highlights_data = store.get(symbol)
                else:
                    highlights_data = pd.DataFrame(columns=['SubCategory', 'Data'])
                    print(f"The symbol {symbol} highlights data does not exist in {h5_highlights_path}")
            break
        except Exception as e:
            if retries < max_retries - 1:
                print(f"An exception occurred while reading {h5_highlights_path}: {e}")
                print(f"Retrying in {sleep_duration} seconds...")
                time.sleep(sleep_duration)
                retries += 1
            else:
                print(f"Max retries reached. Could not read {h5_highlights_path}. Exiting...")
                raise

    # Add title before highlights
    new_html_content += "<h2>Highlights</h2>"
    # Add highlights_data to new_html_content
    highlights_html = "<div style='display: flex; flex-wrap: wrap;'>"

    highlights_data['Data'] = highlights_data['Data'].apply(pd.to_numeric, errors='ignore')

    for index, row in highlights_data.iterrows():
        subcategory = row['SubCategory']
        data = row['Data']

        if subcategory in percentage_metrics:
            formatted_data = format_percentage(data)
        elif pd.notnull(data) and isinstance(data, (int, float)):
            if abs(data) >= 1_000:  # Large numbers
                print(f"{symbol}: {subcategory} is a large number. Formatting {data} with commas...")
                formatted_data = format_large_number(data)
            else:  # Other numeric data that is not a large number
                print(f"{symbol}: {subcategory} is a numeric value. Formatting {data} with commas...")
                formatted_data = f"{data:,.2f}"
        else:
            print(f'{symbol}: {subcategory} is a non-numeric value. Data: {data}')
            formatted_data = data  # For non-numeric data, leave as is

        highlights_html += f"""<div style='flex: 0 0 calc(33.333% - 10px); margin-right: 10px; margin-bottom: 10px;'>
                                <strong>{subcategory}</strong>: {formatted_data}
                               </div>"""
    highlights_html += "</div>"
    new_html_content += highlights_html  # App

    # Add the disclaimer to the end of new_html_content
    disclaimer = "<br><br><p><strong>Disclaimer:</strong> The last data point has been excluded from the Prophet prediction.</p>"
    new_html_content += disclaimer


    with open(html_file_path, 'r', encoding='utf-8') as f:
        html_content = f.read()

    # Parse the HTML content with BeautifulSoup
    soup = BeautifulSoup(html_content, 'html.parser')

    # Find the div that contains the Plotly plot
    plot_div = soup.find('div', {'class': 'plotly-graph-div'})

    if plot_div:
        # Create a BeautifulSoup object from the new_content string
        new_content_soup = BeautifulSoup(new_html_content, 'html.parser')

        # Insert the new content before the Plotly plot
        plot_div.insert_before(new_content_soup)

        # Save the modified HTML back to disk
        with open(html_file_path, 'w', encoding='utf-8') as f:
            f.write(str(soup))
    else:
        print(f"Plotly plot not found in the HTML file {html_file_path}.")


def forecast_with_multiple_metrics(df: pd.DataFrame, periods: int = 4, save_dir: str = "plots/", use_all_data: bool = True, eod_price_data: str = 'eod_price_data.h5'):
    def plot_adjusted_close_from_h5(fig, symbol, hdf5_file_path, max_retries=3, sleep_duration=5):
        hdf5_file_path = data_dir + hdf5_file_path
        keys = access_hdf5_with_retries(hdf5_file_path, 'r', max_retries, sleep_duration)

        if f'/{symbol}' in keys:
            with pd.HDFStore(hdf5_file_path, 'r') as store:
                stock_data = store.get(symbol)

                # Add actual adjusted_close to the plot
                fig.add_trace(
                    go.Scatter(x=stock_data.index, y=stock_data['Adjusted_close'],
                               mode='lines+markers',
                               name='Actual Adjusted_close',
                               legendgroup='Actual',
                               line=dict(color='purple'),
                               showlegend=True),
                    row=1, col=1
                )

                prophet_df = stock_data.reset_index()[['Date', 'Adjusted_close']].rename(
                    columns={'Date': 'ds', 'Adjusted_close': 'y'})

                best_mae = float('inf')
                best_forecast = None
                best_mode = None

                for mode in ['additive', 'multiplicative']:
                    model = Prophet(
                        seasonality_mode=mode,
                        yearly_seasonality=True,
                        weekly_seasonality=False,
                        daily_seasonality=False)
                    model.fit(prophet_df)
                    future = model.make_future_dataframe(periods=365)  # 1-year prediction
                    forecast = model.predict(future)

                    mae = calculate_mae(prophet_df['y'], forecast.loc[:len(prophet_df) - 1, 'yhat'])

                    if mae < best_mae:
                        best_mae = mae
                        best_forecast = forecast
                        best_mode = mode

                # Add the best forecast to the plot
                fig.add_trace(
                    go.Scatter(x=best_forecast['ds'], y=best_forecast['yhat'],
                               mode='lines',
                               name='Adjusted_close Forecast',
                               legendgroup='Forecast',
                               line=dict(color='green'),
                               showlegend=True),
                    row=1, col=1
                )

                # Add yhat_upper and yhat_lower to the plot
                fig.add_trace(
                    go.Scatter(x=best_forecast['ds'], y=best_forecast['yhat_upper'],
                               mode='lines',
                               name='Upper Forecast',
                               legendgroup='Upper Forecast',
                               line=dict(color='blue', dash='dash'),
                               showlegend=True),
                    row=1, col=1
                )

                fig.add_trace(
                    go.Scatter(x=best_forecast['ds'], y=best_forecast['yhat_lower'],
                               mode='lines',
                               name='Lower Forecast',
                               legendgroup='Lower Forecast',
                               line=dict(color='blue', dash='dash'),
                               showlegend=True),
                    row=1, col=1
                )

        else:
            logging.warning(f"No Adjusted_close data found for symbol: {symbol}")

    pd.set_option('display.max_columns', None)
    logging.debug(df.columns.tolist())

    # Get the columns that are both in df.columns and reordered_columns
    common_columns = [col for col in reordered_columns if col in df.columns]

    # Check if common_columns is not empty
    if common_columns:
        # Reorder the DataFrame using the common columns
        df = df[common_columns + [col for col in df.columns if col not in common_columns]]
    else:
        logging.warning("No common columns between df and reordered_columns.")

    metrics = [col for col in df.columns if df[col].dtype in ['int64', 'float64']]

    logging.info(metrics)

    subplot_titles = []
    for metric in metrics:
        definition = metric_definitions.get(metric, 'No definition available')
        wrapped_definition = wrap_text(definition, 160)  # Adjust the max_length as needed
        title = f"<b>{metric}</b><br><span style='font-size: smaller;'>{wrapped_definition}</span>"
        subplot_titles.append(title)

    fig = make_subplots(rows=len(subplot_titles) + 1, cols=1, subplot_titles=['Adjusted_close'] + subplot_titles)

    symbol = df['Symbol'].iloc[0]

    plot_adjusted_close_from_h5(fig, symbol, eod_price_data)




    # Fetch the company_name
    h5_symbols = os.path.join(data_dir, "symbols.h5")
    company_name = get_company_name(symbol, h5_symbols)

    if not os.path.exists(save_dir):
        os.makedirs(save_dir)

    if not use_all_data:
        disclaimer = "Disclaimer: The last data point has been excluded from the Prophet prediction. \n "
        disclaimer += "There may not be enough data points for Prophet to make accurate predictions. \n"
        fig.add_annotation(
            dict(
                x=0,
                y=1.1,
                xref="paper",
                yref="paper",
                text=disclaimer,
                showarrow=False,
                font=dict(size=16)
            )
        )


    for i, metric in enumerate(metrics):
        row = i + 2

        plotting_df = df[['date', metric]].copy().dropna()
        prophet_df = plotting_df.copy()

        # Convert the specified metrics to percentages by multiplying by 100
        if metric in ['ROE', 'Earnings_Yield', 'Dividend_Yield']:
            prophet_df[metric] = prophet_df[metric] * 100  # Convert to percentage
            plotting_df[metric] = plotting_df[metric] * 100  # Convert to percentage


        if not use_all_data:
            prophet_df = prophet_df.iloc[:-1, :]

        prophet_df.rename(columns={'date': 'ds', metric: 'y'}, inplace=True)

        if prophet_df.shape[0] < 2:
            logging.warning(f"Skipping {metric} because it has less than 2 non-NaN rows.")
            continue

        best_mae = float('inf')
        best_forecast = None
        best_mode = None

        for mode in ['additive', 'multiplicative']:
            model = Prophet(
                seasonality_mode=mode,
                yearly_seasonality=True,
                weekly_seasonality=False,
                daily_seasonality=False)
            model.fit(prophet_df)
            future = model.make_future_dataframe(periods=periods, freq='Q')
            forecast = model.predict(future)

            mae = calculate_mae(prophet_df['y'], forecast.loc[:len(prophet_df) - 1, 'yhat'])

            if mae < best_mae:
                best_mae = mae
                best_forecast = forecast
                best_mode = mode

        logging.info(f"Best seasonality mode for {metric} is {best_mode} with MAE {best_mae}")


        hover_format = '%{x}: %{y:.2f}%' if metric in ['ROE', 'Earnings_Yield', 'Dividend_Yield'] else '%{x}: %{y:,}'

        fig.add_trace(
            go.Scatter(x=plotting_df['date'], y=plotting_df[metric],
                       mode='lines+markers',
                       name=f'Actual {metric}',
                       legendgroup='Actual',
                       line=dict(color='purple'),
                       showlegend=(i == 0),
                       hovertemplate=hover_format),
            row=row, col=1
        )

        # Plot the forecasted metric data
        fig.add_trace(
            go.Scatter(x=best_forecast['ds'],
                       y=best_forecast['yhat'],
                       mode='lines+markers',
                       name=f'Forecasted {metric}',
                       legendgroup='Forecast',
                       line=dict(color='green'),
                       showlegend=(i == 0),
                       hovertemplate=hover_format),
            row=row, col=1
        )

        # Plot the upper forecast interval
        fig.add_trace(
            go.Scatter(x=best_forecast['ds'], y=best_forecast['yhat_upper'],
                       mode='lines',
                       name=f'Upper Bound {metric}',
                       legendgroup='Upper Forecast',
                       line=dict(color='blue', dash='dash'),
                       showlegend=(i == 0),
                       hovertemplate=hover_format),
            row=row, col=1
        )

        # Plot the lower forecast interval
        fig.add_trace(
            go.Scatter(x=best_forecast['ds'], y=best_forecast['yhat_lower'],
                       mode='lines',
                       name=f'Lower Bound {metric}',
                       legendgroup='Lower Forecast',
                       line=dict(color='blue', dash='dash'),
                       showlegend=(i == 0),
                       hovertemplate=hover_format),
            row=row, col=1
        )


    fig.update_layout(
        height=350 * len(metrics),
        width=1800,
        title_font_size=16  # You can change this value as needed
    )

    plot_file_path = os.path.join(save_dir, f"{symbol}_quarterly.html")

    fig.write_html(plot_file_path)
    logging.info(f"Saved Prophet plot for {company_name} to {plot_file_path}")

    add_content_before_plot(symbol)

def get_symbols(h5_file_path, key='US'):
    """
    Open an HDF5 file and populate the global dictionary symbol_exchange_map
    where the symbol is the key and the exchange is the value.

    Parameters:
        h5_file_path (str): The path to the HDF5 file.
        key (str): The key to use when reading the HDF5 file. Default is 'US'.

    Returns:
        None
    """

    h5_file_path = Path(h5_file_path)

    # Check if the file exists
    if not h5_file_path.exists():
        logging.info(f"The file {h5_file_path} does not exist.")
        return

    try:
        # Read the DataFrame from the HDF5 file
        df = pd.read_hdf(h5_file_path, key=key)

        # Check if 'Code' and 'Exchange' columns exist
        if 'Code' not in df.columns or 'Exchange' not in df.columns:
            logging.info(f"The 'Code' or 'Exchange' column does not exist in the DataFrame.")
            return

        # Populate the global symbol_exchange_map
        global symbol_exchange_map
        symbol_exchange_map = dict(zip(df['Code'], df['Exchange']))
        return list(symbol_exchange_map.keys())
    except Exception as e:
        logging.error(f"An error occurred: {e}")
        return

symbols = get_symbols(data_dir + 'symbols.h5', key='US')

symbols = ['UBER', 'LYFT', 'WE', 'IEP', 'AAPL'] + sp500 + largest_banks + companies_with_treasuries + homebuilders
symbols = ['GDHG', 'IEP']
symbols = ['GOOG']



h5_files = set()  # Use a set to automatically handle duplicates

for symbol in symbols:
    exchange = symbol_exchange_map.get(symbol, 'Other')  # Default to 'Other' if not found
    current_h5_files = [
        f"{data_dir}/{exchange}_Cash_Flow_quarterly.h5",
        f"{data_dir}/{exchange}_Balance_Sheet_quarterly.h5",
        f"{data_dir}/{exchange}_Income_Statement_quarterly.h5",
        f"{data_dir}/{exchange}_Ratios.h5"
    ]
    h5_files.update(current_h5_files)  # Update the set with the new file paths

    logging.info(f'Processing {symbol}')
    try:
        # Pass only the HDF5 files corresponding to the current symbol's exchange
        df = fetch_data_for_symbol_from_multiple_h5(list(current_h5_files), symbol, max_retries=3, sleep_duration=5)


        if df is None:
            logging.warning(f"No data found for symbol {symbol}. Skipping to the next symbol.")
            continue  # Skip to the next iteration of the loop    forecast_with_multiple_metrics(df, periods=4, save_dir="plots/", use_all_data=False)

        logging.info(df.head(20))
        forecast_with_multiple_metrics(df)
    except Exception as e:
        logging.error(f"An unexpected error occurred while processing symbol {symbol}: {e}")

What is an HDF5 File? A Deep Dive into Hierarchical Data Format

Introduction

If you’ve spent any time working with large datasets, especially in scientific computing, machine learning, or finance, you’ve probably come across HDF5 files. But what are they, and why are they so popular for storing complex data structures? In this blog post, we’ll explore the HDF5 file format, its architecture, and its various use-cases.

What is HDF5?

HDF5 stands for Hierarchical Data Format version 5. It is a file format that allows for a flexible and efficient way of storing complex data relationships, including metadata. Developed by the HDF Group, it is designed to store and organize large amounts of data, and to facilitate the sharing of that data with others.

Key Features

Hierarchical Structure

The “Hierarchical” in HDF5 refers to its tree-like structure, much like a file system. This allows for a more organized way to store datasets and metadata. In an HDF5 file, you can have groups that contain other groups or datasets, akin to folders and files in a computer’s file system.

High Performance

HDF5 is designed with performance in mind, allowing for fast read and write operations. This is crucial when working with large datasets that need to be accessed or modified frequently.

Extensible

You can add new data to an existing HDF5 file without disturbing its structure, making it a highly extensible format. This is particularly useful in scientific research and other evolving projects.

Portability

HDF5 files are easily shareable and are compatible across various platforms and languages. Libraries for interacting with HDF5 files are available in languages like Python, C, C++, and Java, among others.

Compression

HDF5 files support on-the-fly compression, saving valuable disk space. This is particularly useful when dealing with very large datasets.

Use Cases

Scientific Computing

In fields like physics, astronomy, and bioinformatics, HDF5 is often the go-to solution for handling complex data relationships and large datasets.

Financial Data

In finance, HDF5 files are commonly used to store time-series data, like stock prices, and complex financial models, making it easier for data analysts and algorithmic traders to backtest strategies.

Machine Learning

When dealing with large and complex datasets for training machine learning models, the efficiency and flexibility of HDF5 make it an ideal choice.

How to Interact with HDF5 Files

Python

In Python, one of the most commonly used libraries for interacting with HDF5 files is h5py. Here’s a quick example of how to create an HDF5 file and add a dataset:

import h5py

with h5py.File('example.h5', 'w') as f:
    dset = f.create_dataset("my_dataset", (100,), dtype='i')

Conclusion

HDF5 files offer a robust, efficient, and flexible way to store complex data structures. Whether you are a researcher, a data scientist, or a financial analyst, understanding how to use HDF5 files can be a significant advantage in your work.

So the next time you need to store large amounts of structured data, consider using HDF5. It’s a powerful tool that can simplify your data management tasks and improve the efficiency of your data operations.

Using Python to save corporate financial data locally from EODHD

In a previous post, I showed how to store symbol data from EODHD. The purpose of this code is to now iterate through all those symbols and grab the corporate financial data from EODHD using their API. If you’re interested in downloading Open, High, Low, Close, Adjusted Cose, and Volume data you can find that in this blog post.

Output

Script Overview

The Python script is primarily designed to perform the following tasks:

  1. Fetch financial fundamentals for a list of stock symbols.
  2. Store this information in an HDF5 file for optimized data storage.
  3. Handle various categories of financial data including ‘General’, ‘Earnings’, ‘Financials’, and more.
  4. Log activities for better debugging and monitoring.

The script relies on the EOD Historical Data API and uses Python libraries like Pandas, Requests, and h5py.

Utility Functions

  • save_dataframe_to_h5(): Saves a Pandas DataFrame to an HDF5 file.
  • key_exists_in_h5(): Checks if a key exists in an HDF5 file.
  • print_all_keys_in_h5(): Prints all keys in an HDF5 file.
  • symbol_exists_in_h5(): Checks if a symbol exists in an HDF5 file.
  • convert_columns_to_numeric(): Converts DataFrame columns to numeric types, if applicable.
  • update_dataframe(): Updates the DataFrame with new rows of data.

Main Function: fetch_and_store_fundamentals()

This function performs the core operation of fetching and storing data. It takes an API token, a list of stock symbols, a data directory, and some optional parameters as arguments.

The function goes through the following steps for each symbol:

  1. Check If Data Exists: If the skip_existing flag is true, it checks whether the data already exists in the HDF5 storage.
  2. Fetch Data: Downloads JSON data for the stock symbol from the EOD Historical Data API.
  3. Data Processing: Processes different categories of data (General, Financials, Earnings, etc.) and stores them in separate HDF5 files.
  4. Log Update: Updates a log file with the timestamp of the last fetch for each symbol.

Helper Function: get_symbols()

This function populates a global dictionary, symbol_exchange_map, mapping stock symbols to their respective exchanges. It reads this information from an existing HDF5 file.

Code Execution

Finally, the script fetches a list of stock symbols using get_symbols() and then calls the fetch_and_store_fundamentals() function to perform the data fetching and storing.

Conclusion

Automating the process of financial data collection and storage is a crucial step in building robust trading algorithms or investment strategies. This script serves as a foundational block for such endeavors, allowing you to focus more on data analysis rather than data collection.

Code

from keys import api_token
import time
import h5py


data_dir = '/home/shared/algos/data/'
logs_dir = '/home/shared/algos/eodhd_data/logs/'

import logging
import pandas as pd
# Configure Pandas to display all columns
pd.set_option('display.max_columns', None)
import requests
import io
from pathlib import Path
from tqdm import tqdm
import json
import os
from datetime import datetime, timedelta

from dictionaries_and_lists import homebuilders, sp500, companies_with_treasuries, largest_banks


logging.basicConfig(level=logging.INFO)
pd.set_option('display.expand_frame_repr', False)


symbol_exchange_map = {}

def save_dataframe_to_h5(df, h5_path, key, drop_columns=None):
    convert_columns_to_numeric(df)  # Assuming this function converts numeric columns

    column_type_mapping = {
        'Symbol': 'object',
        'Country': 'object',
        'reportDate': 'datetime',
        'filing_date': 'datetime',
        'date': 'datetime',
        'Date': 'datetime',
        'currency_symbol': 'object',
        'currency': 'object',
        'beforeAfterMarket': 'object'
    }

    for column, dtype in column_type_mapping.items():
        if column in df.columns:
            if dtype == 'datetime':
                df[column] = pd.to_datetime(df[column], errors='coerce')
            else:
                df[column] = df[column].astype(dtype)

    if drop_columns:
        df.drop(columns=drop_columns, errors='ignore', inplace=True)

    try:
        df.to_hdf(h5_path, key=key, mode='a')
    except Exception as e:
        logging.error(f"Failed to save DataFrame to HDF5 file {h5_path} with key {key}. Error: {e}")

def key_exists_in_h5(h5_filename, key):
    with pd.HDFStore(h5_filename, 'r') as store:
        return key in store

def print_all_keys_in_h5(h5_filename):
    with pd.HDFStore(h5_filename, 'r') as store:
        print("Keys in HDF5 file:")
        for key in store.keys():
            print(key)

def symbol_exists_in_h5(h5_filepath, symbol):
    try:
        with pd.HDFStore(h5_filepath, 'r') as store:
            return f'/{symbol}' in store.keys()
    except HDF5Error:
        return False

def convert_columns_to_numeric(df):
    for col in df.columns:
        df[col] = pd.to_numeric(df[col], errors='ignore')


def update_dataframe(df, symbol, sub_category, sub_category_data, time_frame=None):
    row_data = {
        'Symbol': symbol,
        'SubCategory': sub_category,
    }

    if time_frame is not None:
        row_data['TimeFrame'] = time_frame

    if isinstance(sub_category_data, dict):
        row_data.update(sub_category_data)
    else:
        row_data['Data'] = str(sub_category_data)

    new_row = pd.DataFrame([row_data])
    df = pd.concat([df, new_row], ignore_index=True)
    return df


def fetch_and_store_fundamentals(api_token, symbols, data_dir, skip_existing=False, hours_to_skip=72):
    log_file = logs_dir + 'symbol_time_log.json'
    symbol_time_dict = {}

    try:
        with open(log_file, 'r') as f:
            content = f.read()
        symbol_time_dict = json.loads(content)
    except json.JSONDecodeError as e:
        logging.error(
            f"JSON Decode Error at line {e.lineno}, column {e.colno}. Content around error position: {content[e.pos - 10:e.pos + 10]}")
    except FileNotFoundError:
        logging.info(f"File {log_file} not found. An empty dictionary will be used.")
    except Exception as e:
        logging.error(f"An unexpected error occurred: {e}")


    for symbol in tqdm(symbols, desc='Fetching and storing fundamentals'):
        last_downloaded_time = symbol_time_dict.get(symbol, None)
        if last_downloaded_time:
            last_downloaded_time = datetime.fromisoformat(last_downloaded_time)
            time_since_last_download = datetime.now() - last_downloaded_time
            if time_since_last_download < timedelta(hours=hours_to_skip):
                logging.info(f"Data for symbol {symbol} was downloaded recently. Skipping...")
                continue

        if skip_existing:
            # Assuming the symbol is known at this point in your code
            exchange = symbol_exchange_map.get(symbol, '')  # Get exchange name from global dict

            # Create the new h5 path with the exchange name prepended
            h5_path_check = Path(data_dir, f"{exchange}_General.h5") if exchange else Path(data_dir, "General.h5")

            if h5_path_check.exists() and key_exists_in_h5(h5_path_check, f'/{symbol}'):
                logging.info(f"Data for symbol {symbol} already exists. Skipping...")
                continue

        logging.info(f"\n{symbol}: Downloading from EODHD...")

        try:
            url = f"https://eodhd.com/api/fundamentals/{symbol}.US?api_token={api_token}"
            response = requests.get(url)
        except ConnectionError:
            logging.error(f"ConnectionError occurred while fetching data for symbol {symbol}. Retrying in 60 seconds.")
            time.sleep(60)
            continue
        except ConnectionRefusedError:
            logging.error(
                f"ConnectionRefusedError occurred while fetching data for symbol {symbol}. Retrying in 60 seconds.")
            time.sleep(60)
            continue
        except Exception as e:
            logging.error(f"An unexpected error occurred: {e}")
            continue

        if response.status_code != 200:
            logging.error(f"Failed to fetch data for symbol {symbol}. HTTP Status Code: {response.status_code} \n Sleeping for 60 seconds")
            time.sleep(60)  # Sleep for 60 seconds
            continue  # Continue to next iteration

        json_data = response.json()

        logging.info(f"\n{symbol}: Finished downloading from EODHD...")

        # Check if the logging level is set to DEBUG
        if logging.getLogger().getEffectiveLevel() == logging.DEBUG:
            json_dump_file = logs_dir + 'api_response_output.txt'

            with open(json_dump_file, 'w') as f:
                f.write("JSON Data:\n")
                f.write(json.dumps(json_data, indent=4))

        for category, category_data in json_data.items():
            if category_data is None:
                logging.warning(f"Data for category {category} is None.")
                continue

            exchange = symbol_exchange_map.get(symbol, 'Other')
            h5_path = Path(data_dir, f"{exchange}_{category}.h5")
            df = pd.DataFrame()

            if category == 'ESGScores':
                continue

            elif category == 'Holders':
                for holder_type, holder_data in category_data.items():
                    h5_path = Path(data_dir, f"{exchange}_Holders_{holder_type}.h5")

                    for holder_id, holder_info in holder_data.items():
                        df = update_dataframe(df, symbol, holder_type, holder_info)
                    save_dataframe_to_h5(df, h5_path, key=symbol, drop_columns=['SubCategory'])
                    logging.info(f"{symbol} finished processing category Holders")

            elif category == 'SplitsDividends':
                logging.debug(f"Processing 'SplitsDividends' category. Data: {category_data}")

                # h5_path = Path(data_dir, f"{category}.h5")
                for sub_key, sub_data in category_data.items():
                    logging.debug(f"Processing key: {sub_key}")
                    logging.debug(f"Data for key {sub_key}: {sub_data}")

                    if sub_key == 'NumberDividendsByYear':
                        nested_h5_path = Path(data_dir, f"{exchange}_SplitsDividends_{sub_key}.h5")

                        nested_df = pd.DataFrame()

                        for item_key, item_data in sub_data.items():
                            logging.debug(f"Item data: {item_key}")
                            nested_df = update_dataframe(nested_df, symbol, sub_key, item_data)

                        # Sort and remove duplicates based on the Year column
                        # Before sorting, check if 'Year' exists
                        if 'Year' in nested_df.columns:
                            nested_df = nested_df.sort_values(by=['Year'])
                            nested_df.drop_duplicates(subset=['Year'], keep='last', inplace=True)

                        save_dataframe_to_h5(nested_df, nested_h5_path, key=symbol, drop_columns=['SubCategory'])
                    else:
                        df = update_dataframe(df, symbol, sub_key, sub_data)

                save_dataframe_to_h5(df, h5_path, key=symbol)
                logging.info(f"{symbol} finished processing category SplitsDividends")


            elif category == 'General':
                logging.debug(f"Processing 'General' category. Data: {category_data}")
                for sub_key, sub_data in category_data.items():
                    logging.debug(f"Processing key: {sub_key}")
                    logging.debug(f"Data for key {sub_key}: {sub_data}")

                    if sub_key in ['Listings', 'Officers']:
                        continue  # Skip 'Listings' and 'Officers'

                    df = update_dataframe(df, symbol, sub_key, sub_data)


                save_dataframe_to_h5(df, h5_path, key=symbol)
                logging.info(f"{symbol} finished processing category General")


            elif category == 'Financials':
                # Iterate through report types like 'Balance Sheet', 'Cash Flow', 'Income Statements'
                for report_type, report_data in category_data.items():
                    # Iterate through time frames like 'annual' or 'quarterly'
                    for time_frame, time_frame_data in report_data.items():
                        if time_frame == 'currency_symbol':
                            continue  # Skip the 'currency_symbol'

                        # Create a specific .h5 path for each combination of report_type and time_frame
                        h5_path = Path(data_dir, f"{exchange}_{report_type}_{time_frame}.h5")

                        df = pd.DataFrame()

                        # Update the DataFrame with financial data
                        for sub_category, sub_category_data in time_frame_data.items():
                            df = update_dataframe(df, symbol, sub_category, sub_category_data)

                        # Save this specific DataFrame to its respective .h5 file
                        save_dataframe_to_h5(df, h5_path, key=symbol, drop_columns=['SubCategory'])
                        logging.info(f"{symbol} finished processing category Financials")


            elif category == 'Earnings':
                for sub_category, sub_category_data in category_data.items():
                    # Create a specific .h5 path for each sub-category
                    h5_path = Path(data_dir, f"{exchange}_{category}_{sub_category}.h5")

                    df = pd.DataFrame()


                    for date_entry, date_entry_data in sub_category_data.items():  # Iterate through each date entry in the subcategory
                        if date_entry == 'currency_symbol':
                            continue

                        # Adding a date field to each entry
                        date_entry_data['Date'] = date_entry

                        # Update the dataframe with new row
                        df = update_dataframe(df, symbol, sub_category, date_entry_data)

                    # Save this specific DataFrame to its respective .h5 file
                    if 'SubCategory' in df.columns:
                        save_dataframe_to_h5(df, h5_path, key=symbol, drop_columns=['SubCategory'])  # Drop the 'SubCategory' column
                    else:
                        save_dataframe_to_h5(df, h5_path, key=symbol)
                    logging.info(f"{symbol} finished processing category Earnings")


            elif category == 'outstandingShares':
                for time_frame, time_frame_data in category_data.items():  # time_frame can be 'annual' or 'quarterly'
                    # Create a specific .h5 path for each time frame
                    h5_path = Path(data_dir, f"{exchange}_{category}_{time_frame}.h5")

                    df = pd.DataFrame()


                    for entry_id, entry_data in time_frame_data.items():
                        df = update_dataframe(df, symbol, time_frame, entry_data)

                    # Save this specific DataFrame to its respective .h5 file
                    save_dataframe_to_h5(df, h5_path, key=symbol, drop_columns=['SubCategory'])
                    logging.info(f"{symbol} finished processing category outstandingShares")


            elif category == 'ETF_Data':
                # DataFrame for the top-level ETF_Data, excluding subcategories that will be handled separately
                top_level_df = pd.DataFrame()

                # Dictionary to hold top-level data for this stock symbol
                top_level_data = {'Symbol': symbol}

                for sub_category, sub_category_data in category_data.items():
                    if sub_category in ['Asset_Allocation', 'World_Regions', 'Sector_Weights', 'Fixed_Income', 'Top_10_Holdings', 'Holdings', 'Valuations_Growth', 'Market_Capitalisation', 'MorningStar', 'Performance', ]:
                        # Create a specific .h5 path for each sub-category
                        h5_path = Path(data_dir, f"{exchange}_ETF_{sub_category}.h5")

                        df = pd.DataFrame()

                        # Update the DataFrame with sub-category data
                        for item_key, item_data in sub_category_data.items():
                            df = update_dataframe(df, symbol, item_key, item_data)

                        # Save this specific DataFrame to its respective .h5 file
                        save_dataframe_to_h5(df, h5_path, key=symbol)

                    else:
                        # Populate the top-level data dictionary
                        top_level_data[sub_category] = sub_category_data

                # Convert the top-level data dictionary to a DataFrame and append it to top_level_df
                new_row = pd.DataFrame([top_level_data])
                top_level_df = pd.concat([top_level_df, new_row], ignore_index=True)

                # Save the top-level ETF_Data DataFrame to its respective .h5 file
                top_level_h5_path = Path(data_dir, "ETF_Data.h5")
                save_dataframe_to_h5(top_level_df, top_level_h5_path, key=symbol)
                logging.info(f"{symbol} finished processing category ETF_Data")

            else:
                logging.debug(f'Processing other category {category}')

                if h5_path.exists() and symbol_exists_in_h5(h5_path, symbol):
                    df = pd.read_hdf(h5_path, key=symbol)
                else:
                    df = pd.DataFrame()

                if isinstance(category_data, dict):
                    for sub_category, sub_category_data in category_data.items():
                        df = update_dataframe(df, symbol, sub_category, sub_category_data)
                else:
                    df = update_dataframe(df, symbol, category, category_data)

                save_dataframe_to_h5(df, h5_path, key=symbol)
                logging.info(f"{symbol} finished processing category {category}")

        logging.info(f"{symbol} updating symbol status log file.")
        symbol_time_dict[symbol] = datetime.now().isoformat()
        with open(log_file, 'w') as f:
            json.dump(symbol_time_dict, f)

        logging.info(f"{symbol} finished processing")


def get_symbols(h5_file_path, key='US'):
    """
    Open an HDF5 file and populate the global dictionary symbol_exchange_map
    where the symbol is the key and the exchange is the value.

    Parameters:
        h5_file_path (str): The path to the HDF5 file.
        key (str): The key to use when reading the HDF5 file. Default is 'US'.

    Returns:
        None
    """

    h5_file_path = Path(h5_file_path)

    # Check if the file exists
    if not h5_file_path.exists():
        logging.info(f"The file {h5_file_path} does not exist.")
        return

    try:
        # Read the DataFrame from the HDF5 file
        df = pd.read_hdf(h5_file_path, key=key)

        # Check if 'Code' and 'Exchange' columns exist
        if 'Code' not in df.columns or 'Exchange' not in df.columns:
            logging.info(f"The 'Code' or 'Exchange' column does not exist in the DataFrame.")
            return

        # Populate the global symbol_exchange_map
        global symbol_exchange_map
        symbol_exchange_map = dict(zip(df['Code'], df['Exchange']))
        return list(symbol_exchange_map.keys())
    except Exception as e:
        logging.error(f"An error occurred: {e}")
        return


symbols = get_symbols(data_dir + 'symbols.h5', key='US')

fetch_and_store_fundamentals(api_token, symbols, data_dir, skip_existing=True, hours_to_skip=72)

How to Store Stock Market Data Locally with Python: A Step-by-Step Guide to Saving OHLCV Data from EODHD

In a previous post, I showed how to store symbol data from EODHD. The purpose of this code is to now iterate through all those symbols and grab the Open, High, Low, Close, Adjustable Close, and Volume data from EODHD using their API.

Output

This code will contain data for each stock symbol similar to this example for TSLA below for all 48,292 stocks.

INFO:root:Processing 48930 symbols
INFO:root:                Open      High       Low     Close  Adjusted_close     Volume
Date                                                                         
2010-06-29   18.9999   24.9999   17.5401   23.8899          1.5927  281749173
2010-06-30   25.7901   30.4191   23.3001   23.8299          1.5887  257915884
2010-07-01   24.9999   25.9200   20.2701   21.9600          1.4640  123447940
2010-07-02   23.0001   23.1000   18.7101   19.2000          1.2800   77127102
2010-07-06   20.0001   20.0001   15.8301   16.1100          1.0740  103189435
...              ...       ...       ...       ...             ...        ...
2023-10-17  250.1000  257.1830  247.0800  254.8500        254.8500   93562914
2023-10-18  252.7000  254.6300  242.0800  242.6800        242.6800  125147797
2023-10-19  225.9500  230.6100  216.7800  220.1100        220.1100  170772703
2023-10-20  217.0100  218.8606  210.4200  211.9900        211.9900  138010094
2023-10-23  210.0000  216.9800  202.5100  212.0800        212.0800  150215161

Overview of the Script


The script performs the following tasks:

  • Reads a list of stock symbols from an HDF5 file that we created previously which contains all the US stock symbols.
  • Fetches the EOD stock price data for each symbol via an API call.
  • Stores the fetched data in an HDF5 file.
  • Keeps track of the last downloaded time for each symbol in a JSON log file.
  • Allows for skipping previously downloaded data.
  • Plots a simple candlestick chart of the data.

Deep Dive into the Code

Let’s dissect the code step by step, explaining each function’s purpose and the underlying logic.

  1. Initial Setup

Before diving into the functions, you’ve set up some important libraries and configurations:

  • Logging: Helps in tracking the flow and potential issues in the code.
  • Data Directory: A designated path where you store or read data.
  • API Key: Securely imported from a separate keys.py file to maintain security and modularity.
  1. Function: get_symbols

This function is designed to fetch stock symbols from an existing HDF5 file.

It first checks if the file exists. If not, it logs an informative message.
Then, it attempts to read the HDF5 file into a DataFrame.
The function ensures that the ‘Code’ column exists before extracting the list of symbols.

  1. Function: fetch_eod_price_data

This is the core function of your script. Its main responsibilities include:

  • Fetching the EOD stock price data for each symbol.
  • Skipping data that has recently been downloaded, using logs.
  • Storing the fetched data in an HDF5 file.
  • The function maintains a log of the last downloaded time for each symbol. This helps in efficiently updating your data without redundant API calls.
  1. Main Execution

After defining the utility functions, the main execution of the script begins:

  • A list of stock symbols is fetched using the get_symbols function.
  • The fetch_eod_price_data function is called to update the EOD price data.
  1. Function: plot_candlestick_with_volume

This function is designed to visually represent the stock data:

  • It fetches the data for a particular stock from the HDF5 file.
  • Filters the data based on the provided start and end dates.
  • Plots a candlestick chart representing the stock’s price movements.
  • On the same chart, it overlays a volume bar chart to indicate trading volumes on respective dates.
  1. Visualization

Finally, a candlestick chart with volume bars for the stock ‘TSLA’ is plotted for a specific date range.

Conclusion

This script provides a streamlined approach to fetch, store, and visualize EOD stock price data. The modular design makes it easy to adapt and extend the functionalities as needed. Whether you’re a seasoned trader or just someone intrigued by stock market data, this script offers a solid foundation to build upon.

In future posts, we’ll use Facebook’s Prophet to forward forecast this data into the future.

Code

import pandas as pd
import requests
from io import StringIO
from pathlib import Path
import logging
import json
from datetime import datetime, timedelta
from tqdm import tqdm
import matplotlib.pyplot as plt
from mplfinance.original_flavor import candlestick_ohlc
import matplotlib.dates as mdates




logging.basicConfig(level=logging.DEBUG)

data_dir = '/home/shared/algos/data/'
log_file_path = "symbol_price_data_time_log.json"

from keys import api_token

def get_symbols(h5_file_path, key='US'):
    """
    Open an HDF5 file and return a list of symbols from the 'Code' column.

    Parameters:
        h5_file_path (str): The path to the HDF5 file.
        key (str): The key to use when reading the HDF5 file. Default is 'exchanges'.

    Returns:
        list: A list of symbols if successful; otherwise, an empty list.
    """

    h5_file_path = Path(h5_file_path)

    # Check if the file exists
    if not h5_file_path.exists():
        logging.info(f"The file {h5_file_path} does not exist.")
        return []

    try:
        # Read the DataFrame from the HDF5 file
        df = pd.read_hdf(h5_file_path, key=key)

        # Check if 'Code' column exists
        if 'Code' not in df.columns:
            logging.info(f"The 'Code' column does not exist in the DataFrame.")
            return []

        # Get the list of symbols from the 'Code' column
        symbols = df['Code'].tolist()

        return symbols

    except Exception as e:
        logging.error(f"An error occurred: {e}")
        return []


def fetch_eod_price_data(stock_symbols, data_dir, log_file, skip_existing=True, hours_to_skip=24):
    # Load existing logs if available
    try:
        with open(log_file, 'r') as f:
            symbol_time_dict = json.load(f)
    except FileNotFoundError:
        symbol_time_dict = {}

    all_data = {}
    count = 0

    # Create the directory if it doesn't exist
    Path(data_dir).mkdir(parents=True, exist_ok=True)
    hdf5_file_path = f"{data_dir}/eod_price_data.h5"

    existing_symbols = set()

    if skip_existing:
        with pd.HDFStore(hdf5_file_path, 'r') as store:
            existing_symbols = {key[1:] for key in store.keys()}


    for symbol in tqdm(stock_symbols, desc='Fetching and storing EOD price data'):
        last_downloaded_time = symbol_time_dict.get(symbol, None)

        if skip_existing and symbol in existing_symbols:
            continue

        if last_downloaded_time:
            last_downloaded_time = datetime.fromisoformat(last_downloaded_time)
            time_since_last_download = datetime.now() - last_downloaded_time
            if time_since_last_download < timedelta(hours=hours_to_skip):
                logging.info(f"Data for symbol {symbol} was downloaded recently. Skipping...")
                continue

        logging.info(f"{symbol}: Downloading from EODHD...")
        url = f'https://eodhd.com/api/eod/{symbol}.US?api_token={api_token}&fmt=csv'
        response = requests.get(url)
        if response.status_code != 200:
            logging.error(f"Failed to fetch data for symbol {symbol}. HTTP Status Code: {response.status_code}")
            continue

        csv_data = StringIO(response.text)
        df = pd.read_csv(csv_data)


        if 'Date' not in df.columns:
            logging.error(f"No 'Date' column in the data for symbol {symbol}. Skipping...")
            logging.error(f"Data received: {response.text}")
            continue

        # Set 'Date' as DateTime index
        df['Date'] = pd.to_datetime(df['Date'])
        df.set_index('Date', inplace=True)

        # Add the DataFrame to the all_data dictionary
        all_data[symbol] = df

        # Update log
        symbol_time_dict[symbol] = datetime.now().isoformat()
        with open(log_file, 'w') as f:
            json.dump(symbol_time_dict, f)

        count += 1

        # Update the HDF5 file every 100 symbols
        if count % 100 == 0:
            with pd.HDFStore(hdf5_file_path, mode='a') as store:
                for symbol, data in all_data.items():
                    store.put(symbol, data)
            logging.info(f"Saved {list(all_data.keys())} to HDF5 file.")
            all_data = {}

    # Save remaining data to HDF5
    if all_data:
        with pd.HDFStore(hdf5_file_path, mode='a') as store:
            for symbol, data in all_data.items():
                store.put(symbol, data)
        logging.info(f"Saved remaining {list(all_data.keys())} to HDF5 file.")

    logging.info("Finished updating HDF5 file.")



symbols = get_symbols(data_dir + 'symbols.h5', key='US')
logging.info(f'Processing {len(symbols)} symbols')


fetch_eod_price_data(symbols, data_dir, log_file_path, skip_existing=True, hours_to_skip=72)


symbol = 'TSLA'
hdf5_file_path = f"{data_dir}/eod_price_data.h5"


def plot_candlestick_with_volume(symbol, hdf5_file_path, start_date, end_date):
    with pd.HDFStore(hdf5_file_path, 'r') as store:
        if f'/{symbol}' in store.keys():
            df = store.get(symbol)
            logging.info("Data fetched successfully.")

            # Filter the DataFrame based on the start and end dates
            df = df[start_date:end_date]

            # Ensure the DataFrame is sorted by date
            df = df.sort_index()

            # Convert date to a format that can be used by candlestick_ohlc
            df['Date'] = df.index.map(mdates.date2num)

            # Create subplots: one for candlestick, one for volume
            fig, ax1 = plt.subplots(figsize=(10, 6))
            ax2 = ax1.twinx()

            # Plot candlestick chart
            ohlc = df[['Date', 'Open', 'High', 'Low', 'Adjusted_close']].copy()
            candlestick_ohlc(ax1, ohlc.values, width=0.6, colorup='green', colordown='red')
            ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
            ax1.set_xlabel('Date')
            ax1.set_ylabel('Price')
            ax1.set_title(f'Candlestick chart for {symbol} ({start_date} to {end_date})')
            ax1.grid(True)

            # Plot volume
            ax2.bar(df.index, df['Volume'], color='gray', alpha=0.3)
            ax2.set_ylabel('Volume')

            plt.show()

        else:
            logging.info(f"No data found for symbol: {symbol}")


symbol = 'TSLA'
hdf5_file_path = f"{data_dir}/eod_price_data.h5"
start_date = '2023-01-01'
end_date = '2023-10-23'
plot_candlestick_with_volume(symbol, hdf5_file_path, start_date, end_date)

Storing Stock Market Indices and Symbols with EODHD

If you’re looking to perform complex analyses, backtests, or even develop your own trading algorithms you’re going to need access to good data. What I’m going to show you today is how to use EODHD to get a list of all the exchanges in their database. Then we will use all of the symbols from the exchange to download information about that stock symbol. In a later post, we will then use these symbols to retrieve all the corporate fundamental data for that stock.

This code will generate a dataframe of Exchanges you can see that EODHD offers data on 77 different exchanges.

Exchanges

                             Name    Code OperatingMIC         Country Currency CountryISO2 CountryISO3
0                      USA Stocks      US   XNAS, XNYS             USA      USD          US         USA
1                 London Exchange     LSE         XLON              UK      GBP          GB         GBR
2                    NEO Exchange     NEO         NEOE          Canada      CAD          CA         CAN
3            TSX Venture Exchange       V         XTSX          Canada      CAD          CA         CAN
4                Toronto Exchange      TO         XTSE          Canada      CAD          CA         CAN
5                 Berlin Exchange      BE         XBER         Germany      EUR          DE         DEU
6                Hamburg Exchange      HM         XHAM         Germany      EUR          DE         DEU
7                  XETRA Exchange   XETRA         XETR         Germany      EUR          DE         DEU
8             Dusseldorf Exchange      DU         XDUS         Germany      EUR          DE         DEU
9                 Munich Exchange      MU         XMUN         Germany      EUR          DE         DEU
10               Hanover Exchange      HA         XHAN         Germany      EUR          DE         DEU
11             Frankfurt Exchange       F         XFRA         Germany      EUR          DE         DEU
12             Stuttgart Exchange     STU         XSTU         Germany      EUR          DE         DEU
13                Vienna Exchange      VI         XWBO         Austria      EUR          AT         AUT
14      Luxembourg Stock Exchange      LU         XLUX      Luxembourg      EUR          LU         LUX
15                 Euronext Paris      PA         XPAR          France      EUR          FR         FRA
16              Euronext Brussels      BR         XBRU         Belgium      EUR          BE         BEL
17                Madrid Exchange      MC         BMEX           Spain      EUR          ES         ESP
18             SIX Swiss Exchange      SW         XSWX     Switzerland      CHF          CH         CHE
19                Euronext Lisbon      LS         XLIS        Portugal      EUR          PT         PRT
20             Euronext Amsterdam      AS         XAMS     Netherlands      EUR          NL         NLD
21             Stockholm Exchange      ST         XSTO          Sweden      SEK          SE         SWE
22                 Irish Exchange      IR         XDUB         Ireland      EUR          IE         IRL
23            Copenhagen Exchange      CO         XCSE         Denmark      DKK          DK         DNK
24            Oslo Stock Exchange      OL         XOSL          Norway      NOK          NO         NOR
25               Iceland Exchange      IC         XICE         Iceland      ISK          IS         ISL
26              Helsinki Exchange      HE         XHEL         Finland      EUR          FI         FIN
27          Malawi Stock Exchange     MSE         XMSW          Malawi      MWK          MW         MWI
28              Egyptian Exchange     EGX         XCAI           Egypt      EGP          EG         EGY
29       Botswana Stock Exchange     XBOT         XBOT        Botswana      BWP          BW         BWA
30           Ghana Stock Exchange     GSE         XGHA           Ghana      GHS          GH         GHA
31   Regional Securities Exchange    BRVM         XBRV     Ivory Coast      XOF          CI         CIV
32         Prague Stock Exchange       PR         XPRA  Czech Republic      CZK          CZ         CZE
33    Nairobi Securities Exchange    XNAI         XNAI           Kenya      KES          KE         KEN
34      Casablanca Stock Exchange      BC         XCAS         Morocco      MAD          MA         MAR
35    Stock Exchange of Mauritius     SEM         XMAU       Mauritius      MUR          MU         MUS
36        Nigerian Stock Exchange    XNSA         XNSA         Nigeria      NGN          NG         NGA
37          Rwanda Stock Exchange     RSE         RSEX          Rwanda     RWF           RW         RWA
38   Dar es Salaam Stock Exchange     DSE         XDAR        Tanzania      TZS          TZ         TZA
39     Uganda Securities Exchange     USE         XUGA          Uganda      UGX          UG         UGA
40          Lusaka Stock Exchange    LUSE         XLUS         Zambia       ZMW          ZM         ZMB
41              Tel Aviv Exchange      TA         XTAE          Israel      ILS          IL         ISR
42        Zimbabwe Stock Exchange    XZIM         XZIM       Zimbabwe       ZWL          ZW         ZWE
43  Victoria Falls Stock Exchange    VFEX         VFEX       Zimbabwe       ZWL          ZW         ZWE
44                         KOSDAQ      KQ         XKOS           Korea      KRW          KR         KOR
45           Korea Stock Exchange      KO         XKRX           Korea      KRW          KR         KOR
46        Budapest Stock Exchange     BUD         XBUD         Hungary      HUF          HU         HUN
47          Warsaw Stock Exchange     WAR         XWAR          Poland      PLN          PL         POL
48      Philippine Stock Exchange     PSE         XPHS     Philippines      PHP          PH         PHL
49         Chilean Stock Exchange      SN         XSGO           Chile      CLP          CL         CHL
50          Johannesburg Exchange     JSE         XJSE    South Africa      ZAR          ZA         ZAF
51               Jakarta Exchange      JK         XIDX       Indonesia      IDR          ID         IDN
52              Thailand Exchange      BK         XBKK        Thailand      THB          TH         THA
53              Shanghai Exchange     SHG         XSHG           China      CNY          CN         CHN
54                    NSE (India)     NSE         XNSE           India      INR          IN         IND
55                Athens Exchange      AT         ASEX          Greece      EUR          GR         GRC
56          Saudi Arabia Exchange      SR         XSAU    Saudi Arabia      SAR          SA         SAU
57              Shenzhen Exchange     SHE         XSHE           China      CNY          CN         CHN
58         Karachi Stock Exchange     KAR         XKAR        Pakistan      PKR          PK         PAK
59             Australia Exchange      AU         XASX       Australia      AUD          AU         AUS
60         Colombo Stock Exchange      CM         XCOL       Sri Lanka      LKR          LK         LKA
61                 Vietnam Stocks      VN         HSTC         Vietnam      VND          VN         VNM
62          Kuala Lumpur Exchange    KLSE         XKLS        Malaysia      MYR          MY         MYS
63          Buenos Aires Exchange      BA         XBUE       Argentina      ARS          AR         ARG
64       Bucharest Stock Exchange      RO         XBSE         Romania      RON          RO         ROU
65             Sao Paolo Exchange      SA         BVMF          Brazil      BRL          BR         BRA
66               Mexican Exchange      MX         XMEX          Mexico      MXN          MX         MEX
67                      London IL      IL         XLON              UK      USD          GB         GBR
68          Zagreb Stock Exchange     ZSE         XZAG         Croatia      EUR          HR         HRV
69            Taiwan OTC Exchange     TWO         ROCO          Taiwan      TWD          TW         TWN
70            MICEX Moscow Russia     MCX         None          Russia      RUB          RU         RUS
71                Taiwan Exchange      TW         XTAI          Taiwan      TWD          TW         TWN
72       Bolsa de Valores de Lima     LIM         XLIM            Peru      PEN          PE         PER
73  Money Market Virtual Exchange   MONEY         None         Unknown  Unknown                        
74   Europe Fund Virtual Exchange  EUFUND         None         Unknown      EUR                        
75        Istanbul Stock Exchange      IS         XIST          Turkey      TRY          TR         TUR
76                          FOREX   FOREX         CDSL         Unknown  Unknown                        
77               Cryptocurrencies      CC         CRYP         Unknown  Unknown                        

We can now download all the symbols for the US-based stocks. Below is a sample of the S&P500 stocks but it downloads 48,292 symbols total.

Stocks

Code                                         Name Country Exchange Currency          Type          Isin
195      AAP                       Advance Auto Parts Inc     USA     NYSE      USD  Common Stock  US00751Y1064
199     AAPL                                    Apple Inc     USA   NASDAQ      USD  Common Stock  US0378331005
245     ABBV                                   AbbVie Inc     USA     NYSE      USD  Common Stock  US00287Y1091
366      ABT                          Abbott Laboratories     USA     NYSE      USD  Common Stock  US0028241000
575      ACN                                Accenture plc     USA     NYSE      USD  Common Stock  IE00B4BNMY34
697     ADBE                   Adobe Systems Incorporated     USA   NASDAQ      USD  Common Stock  US00724F1012
746      ADI                           Analog Devices Inc     USA   NASDAQ      USD  Common Stock  US0326541051
759      ADM               Archer-Daniels-Midland Company     USA     NYSE      USD  Common Stock  US0394831020
787      ADP                Automatic Data Processing Inc     USA   NASDAQ      USD  Common Stock  US0530151036
805     ADSK                                 Autodesk Inc     USA   NASDAQ      USD  Common Stock  US0527691069
880      AEE                                  Ameren Corp     USA     NYSE      USD  Common Stock  US0236081024
924      AEP          American Electric Power Company Inc     USA   NASDAQ      USD  Common Stock  US0255371017
935      AES                          The AES Corporation     USA     NYSE      USD  Common Stock  US00130H1059
1032     AFL                           Aflac Incorporated     USA     NYSE      USD  Common Stock  US0010551028
1423     AIG             American International Group Inc     USA     NYSE      USD  Common Stock  US0268747849
1539     AIZ                                 Assurant Inc     USA     NYSE      USD  Common Stock  US04621X1081
1541     AJG                      Arthur J Gallagher & Co     USA     NYSE      USD  Common Stock  US3635761097
1556    AKAM                      Akamai Technologies Inc     USA   NASDAQ      USD  Common Stock  US00971T1016
1598     ALB                               Albemarle Corp     USA     NYSE      USD  Common Stock  US0126531013
1660    ALGN                         Align Technology Inc     USA   NASDAQ      USD  Common Stock  US0162551016
1682     ALK                         Alaska Air Group Inc     USA     NYSE      USD  Common Stock  US0116591092
1689     ALL                     The Allstate Corporation     USA     NYSE      USD  Common Stock  US0200021014
1694    ALLE                                 Allegion PLC     USA     NYSE      USD  Common Stock  IE00BFRT3W74
1863     AMD                   Advanced Micro Devices Inc     USA   NASDAQ      USD  Common Stock  US0079031078
1871     AME                                   Ametek Inc     USA     NYSE      USD  Common Stock  US0311001004
1893    AMGN                                    Amgen Inc     USA   NASDAQ      USD  Common Stock  US0311621009
1959     AMP                     Ameriprise Financial Inc     USA     NYSE      USD  Common Stock  US03076C1062
1996     AMT                          American Tower Corp     USA     NYSE      USD  Common Stock  US03027X1000
2029    AMZN                               Amazon.com Inc     USA   NASDAQ      USD  Common Stock  US0231351067
2132    ANSS                                    ANSYS Inc     USA   NASDAQ      USD  Common Stock  US03662Q1058
2196     AON                                      Aon PLC     USA     NYSE      USD  Common Stock  IE00BLP1HW54
2232     APA                              APA Corporation     USA   NASDAQ      USD  Common Stock  US03743Q1085
2252     APD               Air Products and Chemicals Inc     USA     NYSE      USD  Common Stock  US0091581068
2302     APH                         Amphenol Corporation     USA     NYSE      USD  Common Stock  US0320951017
2417    APTV                                    Aptiv PLC     USA     NYSE      USD  Common Stock  JE00B783TY65
2535     ARE          Alexandria Real Estate Equities Inc     USA     NYSE      USD  Common Stock  US0152711091
3146     ATO                     Atmos Energy Corporation     USA     NYSE      USD  Common Stock  US0495601058
3191    ATVI                      Activision Blizzard Inc     USA   NASDAQ      USD  Common Stock  US00507V1098
3320     AVB                    AvalonBay Communities Inc     USA     NYSE      USD  Common Stock  US0534841012
3363    AVGO                                 Broadcom Inc     USA   NASDAQ      USD  Common Stock  US11135F1012
3480     AWK                         American Water Works     USA     NYSE      USD  Common Stock  US0304201033
3545     AXP                     American Express Company     USA     NYSE      USD  Common Stock  US0258161092
3624     AZO                                 AutoZone Inc     USA     NYSE      USD  Common Stock  US0533321024
3647      BA                           The Boeing Company     USA     NYSE      USD  Common Stock  US0970231058
3659     BAC                         Bank of America Corp     USA     NYSE      USD  Common Stock  US0605051046
3911    BBWI                       Bath & Body Works Inc.     USA     NYSE      USD  Common Stock  US0708301041
4131     BDX                 Becton Dickinson and Company     USA     NYSE      USD  Common Stock  US0758871091
4177     BEN                       Franklin Resources Inc     USA     NYSE      USD  Common Stock  US3546131018
4550    BIIB                                   Biogen Inc     USA   NASDAQ      USD  Common Stock  US09062X1037
4584     BIO                     Bio-Rad Laboratories Inc     USA     NYSE      USD  Common Stock  US0905722072
4741    BKNG                         Booking Holdings Inc     USA   NASDAQ      USD  Common Stock  US09857L1089
4749     BKR                              Baker Hughes Co     USA   NASDAQ      USD  Common Stock  US05722G1004
4843     BLK                                BlackRock Inc     USA     NYSE      USD  Common Stock  US09247X1019
5020     BMY                 Bristol-Myers Squibb Company     USA     NYSE      USD  Common Stock  US1101221083
5268      BR           Broadridge Financial Solutions Inc     USA     NYSE      USD  Common Stock  US11133T1034
5345   BRK-B                       Berkshire Hathaway Inc     USA     NYSE      USD  Common Stock  US0846707026
5622     BSX                       Boston Scientific Corp     USA     NYSE      USD  Common Stock  US1011371077
5847     BWA                               BorgWarner Inc     USA     NYSE      USD  Common Stock  US0997241064
5912     BXP                        Boston Properties Inc     USA     NYSE      USD  Common Stock  US1011211018
6095     CAG                            ConAgra Foods Inc     USA     NYSE      USD  Common Stock  US2058871029
6217    CARR                          Carrier Global Corp     USA     NYSE      USD  Common Stock  US14448C1045
6237     CAT                              Caterpillar Inc     USA     NYSE      USD  Common Stock  US1491231015
6282      CB                                    Chubb Ltd     USA     NYSE      USD  Common Stock  CH0044328745
6399    CBOE                      Cboe Global Markets Inc     USA     BATS      USD  Common Stock  US12503M1080
6411    CBRE                       CBRE Group Inc Class A     USA     NYSE      USD  Common Stock  US12504L1098
6539     CCI                                 Crown Castle     USA     NYSE      USD  Common Stock  US22822V1017
6552     CCL                         Carnival Corporation     USA     NYSE      USD  Common Stock  PA1436583006
6752     CDW                                     CDW Corp     USA   NASDAQ      USD  Common Stock  US12514G1085
6765      CE                         Celanese Corporation     USA     NYSE      USD  Common Stock  US1508701034
6929     CFG                 Citizens Financial Group Inc     USA     NYSE      USD  Common Stock  US1746101054
7343    CHRW                    CH Robinson Worldwide Inc     USA   NASDAQ      USD  Common Stock  US12541W2098
7362    CHTR                   Charter Communications Inc     USA   NASDAQ      USD  Common Stock  US16119P1084
7394      CI                                   Cigna Corp     USA     NYSE      USD  Common Stock  US1255231003
7488    CINF             Cincinnati Financial Corporation     USA   NASDAQ      USD  Common Stock  US1720621010
7607      CL                    Colgate-Palmolive Company     USA     NYSE      USD  Common Stock  US1941621039
7811     CLX                           The Clorox Company     USA     NYSE      USD  Common Stock  US1890541097
7864   CMCSA                                 Comcast Corp     USA   NASDAQ      USD  Common Stock  US20030N1019
7883     CME                                CME Group Inc     USA   NASDAQ      USD  Common Stock  US12572Q1058
7901     CMG                   Chipotle Mexican Grill Inc     USA     NYSE      USD  Common Stock  US1696561059
7921     CMI                                  Cummins Inc     USA     NYSE      USD  Common Stock  US2310211063
8002     CMS                       CMS Energy Corporation     USA     NYSE      USD  Common Stock  US1258961002
8071     CNC                                 Centene Corp     USA     NYSE      USD  Common Stock  US15135B1017
8159     CNP                       CenterPoint Energy Inc     USA     NYSE      USD  Common Stock  US15189T1079
8285     COF            Capital One Financial Corporation     USA     NYSE      USD  Common Stock  US14040H1059
8361     COO      The Cooper Companies, Inc. Common Stock     USA   NASDAQ      USD  Common Stock  US2166484020
8368     COP                               ConocoPhillips     USA     NYSE      USD  Common Stock  US20825C1045
8402    COST                        Costco Wholesale Corp     USA   NASDAQ      USD  Common Stock  US22160K1051
8411    COTY                                     Coty Inc     USA     NYSE      USD  Common Stock  US2220702037
8441     CPB                        Campbell Soup Company     USA     NYSE      USD  Common Stock  US1344291091
8527    CPRI                           Capri Holdings Ltd     USA     NYSE      USD  Common Stock  VGG1890L1076
8531    CPRT                                   Copart Inc     USA   NASDAQ      USD  Common Stock  US2172041061
8705     CRM                           Salesforce.com Inc     USA     NYSE      USD  Common Stock  US79466L3024
8831    CSCO                            Cisco Systems Inc     USA   NASDAQ      USD  Common Stock  US17275R1023
9005     CSX                              CSX Corporation     USA   NASDAQ      USD  Common Stock  US1264081035
9025    CTAS                           Cintas Corporation     USA   NASDAQ      USD  Common Stock  US1729081059
9084    CTLT                                 Catalent Inc     USA     NYSE      USD  Common Stock  US1488061029
9127    CTSH  Cognizant Technology Solutions Corp Class A     USA   NASDAQ      USD  Common Stock  US1924461023
9144    CTVA                                  Corteva Inc     USA     NYSE      USD  Common Stock  US22052L1044
9291     CVS                              CVS Health Corp     USA     NYSE      USD  Common Stock  US1266501006
9315     CVX                                 Chevron Corp     USA     NYSE      USD  Common Stock  US1667641005
9491       D                          Dominion Energy Inc     USA     NYSE      USD  Common Stock  US25746U1097
9734      DD                        Dupont De Nemours Inc     USA     NYSE      USD  Common Stock  US26614N1028
9782      DE                              Deere & Company     USA     NYSE      USD  Common Stock  US2441991054
9985     DFS                  Discover Financial Services     USA     NYSE      USD  Common Stock  US2547091080
10019     DG                   Dollar General Corporation     USA     NYSE      USD  Common Stock  US2566771059
10111    DGX               Quest Diagnostics Incorporated     USA     NYSE      USD  Common Stock  US74834L1008
10138    DHI                                DR Horton Inc     USA     NYSE      USD  Common Stock  US23331A1097
10168    DHR                          Danaher Corporation     USA     NYSE      USD  Common Stock  US2358511028
10255    DIS                          Walt Disney Company     USA     NYSE      USD  Common Stock  US2546871060
10261   DISH                     DISH Network Corporation     USA   NASDAQ      USD  Common Stock  US25470M1099
10376    DLR                     Digital Realty Trust Inc     USA     NYSE      USD  Common Stock  US2538681030
10392   DLTR                              Dollar Tree Inc     USA   NASDAQ      USD  Common Stock  US2567461080
10558   DOCU                                 DocuSign Inc     USA   NASDAQ      USD  Common Stock  US2561631068
10598    DOV                            Dover Corporation     USA     NYSE      USD  Common Stock  US2600031080
10704    DRI                       Darden Restaurants Inc     USA     NYSE      USD  Common Stock  US2371941053
10882    DTE                           DTE Energy Company     USA     NYSE      USD  Common Stock  US2333311072
10961    DUK                      Duke Energy Corporation     USA     NYSE      USD  Common Stock  US26441C2044
11108    DXC                            DXC Technology Co     USA     NYSE      USD  Common Stock  US23355L1061
11110   DXCM                                   DexCom Inc     USA   NASDAQ      USD  Common Stock  US2521311074
11154     EA                          Electronic Arts Inc     USA   NASDAQ      USD  Common Stock  US2855121099
11244   EBAY                                     eBay Inc     USA   NASDAQ      USD  Common Stock  US2786421030
11342    ECL                                   Ecolab Inc     USA     NYSE      USD  Common Stock  US2788651006
11403     ED                      Consolidated Edison Inc     USA     NYSE      USD  Common Stock  US2091151041
11552    EFX                                  Equifax Inc     USA     NYSE      USD  Common Stock  US2944291051
11734    EIX                         Edison International     USA     NYSE      USD  Common Stock  US2810201077
11791     EL                   Estee Lauder Companies Inc     USA     NYSE      USD  Common Stock  US5184391044
11981    EMN                     Eastman Chemical Company     USA     NYSE      USD  Common Stock  US2774321002
12003    EMR                     Emerson Electric Company     USA     NYSE      USD  Common Stock  US2910111044
12142    EOG                            EOG Resources Inc     USA     NYSE      USD  Common Stock  US26875P1012
12243   EQIX                                  Equinix Inc     USA   NASDAQ      USD  Common Stock  US29444U7000
12262    EQR                           Equity Residential     USA     NYSE      USD  Common Stock  US29476L1070
12340     ES                            Eversource Energy     USA     NYSE      USD  Common Stock  US30040W1080
12458    ESS                     Essex Property Trust Inc     USA     NYSE      USD  Common Stock  US2971781057
12549    ETN                        Eaton Corporation PLC     USA     NYSE      USD  Common Stock  IE00B8KQN827
12568    ETR                          Entergy Corporation     USA     NYSE      USD  Common Stock  US29364G1031
12726     EW                    Edwards Lifesciences Corp     USA     NYSE      USD  Common Stock  US28176E1082
12774    EXC                           Exelon Corporation     USA   NASDAQ      USD  Common Stock  US30161N1019
12812   EXPD   Expeditors International of Washington Inc     USA   NASDAQ      USD  Common Stock  US3021301094
12813   EXPE                           Expedia Group Inc.     USA   NASDAQ      USD  Common Stock  US30212P3038
13042   FAST                             Fastenal Company     USA   NASDAQ      USD  Common Stock  US3119001044
13456    FCX           Freeport-McMoran Copper & Gold Inc     USA     NYSE      USD  Common Stock  US35671D8570
13625     FE                      FirstEnergy Corporation     USA     NYSE      USD  Common Stock  US3379321074
14285    FHN           First Horizon National Corporation     USA     NYSE      USD  Common Stock  US3205171057
14612    FIS   Fidelity National Information Services Inc     USA     NYSE      USD  Common Stock  US31620M1062
14629   FITB                          Fifth Third Bancorp     USA   NASDAQ      USD  Common Stock  US3167731005
14944    FLR                            Fluor Corporation     USA     NYSE      USD  Common Stock  US3434121022
14972    FLS                        Flowserve Corporation     USA     NYSE      USD  Common Stock  US34354P1057
14990    FLT                    Fleetcor Technologies Inc     USA     NYSE      USD  Common Stock  US3390411052
15068    FMC                              FMC Corporation     USA     NYSE      USD  Common Stock  US3024913036
15539   FOSL                             Fossil Group Inc     USA   NASDAQ      USD  Common Stock  US34988V1061
15555    FOX                             Fox Corp Class B     USA   NASDAQ      USD  Common Stock  US35137L2043
15556   FOXA                             Fox Corp Class A     USA   NASDAQ      USD  Common Stock  US35137L1052
15886    FRT              Federal Realty Investment Trust     USA     NYSE      USD  Common Stock  US3137451015
16421    FTI                               TechnipFMC PLC     USA     NYSE      USD  Common Stock  GB00BDSFG982
16517    FTV                                 Fortive Corp     USA     NYSE      USD  Common Stock  US34959J1088
17233     GD                 General Dynamics Corporation     USA     NYSE      USD  Common Stock  US3695501086
17312     GE                     General Electric Company     USA     NYSE      USD  Common Stock  US3696043013
17731   GILD                          Gilead Sciences Inc     USA   NASDAQ      USD  Common Stock  US3755581036
17830     GL                               Globe Life Inc     USA     NYSE      USD  Common Stock  US37959E1029
17990    GLW                         Corning Incorporated     USA     NYSE      USD  Common Stock  US2193501051
18306   GOOG                         Alphabet Inc Class C     USA   NASDAQ      USD  Common Stock  US02079K1079
18307  GOOGL                         Alphabet Inc Class A     USA   NASDAQ      USD  Common Stock  US02079K3059
18352    GPC                             Genuine Parts Co     USA     NYSE      USD  Common Stock  US3724601055
18425    GPS                                      Gap Inc     USA     NYSE      USD  Common Stock  US3647601083
18545   GRMN                                   Garmin Ltd     USA     NYSE      USD  Common Stock  CH0114405324
18618     GS                      Goldman Sachs Group Inc     USA     NYSE      USD  Common Stock  US38141G1040
19147    GWW                              WW Grainger Inc     USA     NYSE      USD  Common Stock  US3848021040
19284    HAS                                   Hasbro Inc     USA   NASDAQ      USD  Common Stock  US4180561072
19315   HBAN           Huntington Bancshares Incorporated     USA   NASDAQ      USD  Common Stock  US4461501045
19333    HBI                              Hanesbrands Inc     USA     NYSE      USD  Common Stock  US4103451021
19362    HCA                             HCA Holdings Inc     USA     NYSE      USD  Common Stock  US40412C1018
19458     HD                               Home Depot Inc     USA     NYSE      USD  Common Stock  US4370761029
19602    HES                             Hess Corporation     USA     NYSE      USD  Common Stock  US42809H1077
19827    HIG            Hartford Financial Services Group     USA     NYSE      USD  Common Stock  US4165151048
19839    HII            Huntington Ingalls Industries Inc     USA     NYSE      USD  Common Stock  US4464131063
20022    HLT                Hilton Worldwide Holdings Inc     USA     NYSE      USD  Common Stock  US43300A2033
20184    HON                  Honeywell International Inc     USA   NASDAQ      USD  Common Stock  US4385161066
20275    HRB                                H&R Block Inc     USA     NYSE      USD  Common Stock  US0936711052
20304    HRL                     Hormel Foods Corporation     USA     NYSE      USD  Common Stock  US4404521001
20445    HST                    Host Hotels & Resorts Inc     USA   NASDAQ      USD  Common Stock  US44107P1049
20472    HSY                                   Hershey Co     USA     NYSE      USD  Common Stock  US4278661081
20572    HUM                                   Humana Inc     USA     NYSE      USD  Common Stock  US4448591028
20659    HWM                         Howmet Aerospace Inc     USA     NYSE      USD  Common Stock  US4432011082
20908    IBM              International Business Machines     USA     NYSE      USD  Common Stock  US4592001014
20984    ICE                Intercontinental Exchange Inc     USA     NYSE      USD  Common Stock  US45866F1049
21160   IDXX                       IDEXX Laboratories Inc     USA   NASDAQ      USD  Common Stock  US45168D1046
21225    IEX                             IDEX Corporation     USA     NYSE      USD  Common Stock  US45167R1041
21242    IFF       International Flavors & Fragrances Inc     USA     NYSE      USD  Common Stock  US4595061015
21580   ILMN                                 Illumina Inc     USA   NASDAQ      USD  Common Stock  US4523271090
21744   INCY                           Incyte Corporation     USA   NASDAQ      USD  Common Stock  US45337C1027
21861   INTC                            Intel Corporation     USA   NASDAQ      USD  Common Stock  US4581401001
21875   INTU                                   Intuit Inc     USA   NASDAQ      USD  Common Stock  US4612021034
21959     IP                          International Paper     USA     NYSE      USD  Common Stock  US4601461035
21986    IPG           Interpublic Group of Companies Inc     USA     NYSE      USD  Common Stock  US4606901001
22076    IQV                           IQVIA Holdings Inc     USA     NYSE      USD  Common Stock  US46266C1053
22077     IR                           Ingersoll Rand Inc     USA     NYSE      USD  Common Stock  US45687V1061
22145    IRM                   Iron Mountain Incorporated     USA     NYSE      USD  Common Stock  US46284V1017
22356   ISRG                       Intuitive Surgical Inc     USA   NASDAQ      USD  Common Stock  US46120E6023
22426     IT                                  Gartner Inc     USA     NYSE      USD  Common Stock  US3666511072
22525    ITW                      Illinois Tool Works Inc     USA     NYSE      USD  Common Stock  US4523081093
22670    IVZ                                  Invesco Plc     USA     NYSE      USD  Common Stock  BMG491BT1088
22785      J                        Jacobs Solutions Inc.     USA     NYSE      USD  Common Stock  US46982L1089
23656   JKHY                  Jack Henry & Associates Inc     USA   NASDAQ      USD  Common Stock  US4262811015
23861    JNJ                            Johnson & Johnson     USA     NYSE      USD  Common Stock  US4781601046
23870   JNPR                         Juniper Networks Inc     USA     NYSE      USD  Common Stock  US48203R1041
23985    JPM                          JPMorgan Chase & Co     USA     NYSE      USD  Common Stock  US46625H1005
24450    JWN                                Nordstrom Inc     USA     NYSE      USD  Common Stock  US6556641008
24482      K                                    Kellanova     USA     NYSE      USD  Common Stock  US4878361082
24640   KEYS                    Keysight Technologies Inc     USA     NYSE      USD  Common Stock  US49338L1035
24689    KHC                               Kraft Heinz Co     USA   NASDAQ      USD  Common Stock  US5007541064
24716    KIM                     Kimco Realty Corporation     USA     NYSE      USD  Common Stock  US49446R1095
24748   KLAC                       KLA-Tencor Corporation     USA   NASDAQ      USD  Common Stock  US4824801009
24790    KMB                   Kimberly-Clark Corporation     USA     NYSE      USD  Common Stock  US4943681035
24805    KMI                            Kinder Morgan Inc     USA     NYSE      USD  Common Stock  US49456B1017
24876     KO                        The Coca-Cola Company     USA     NYSE      USD  Common Stock  US1912161007
24999    KSS                                   Kohls Corp     USA     NYSE      USD  Common Stock  US5002551043
25507    LEG                 Leggett & Platt Incorporated     USA     NYSE      USD  Common Stock  US5246601075
25543    LEN                           Lennar Corporation     USA     NYSE      USD  Common Stock  US5260571048
25780     LH   Laboratory Corporation of America Holdings     USA     NYSE      USD  Common Stock  US50540R4092
25800    LHX                    L3Harris Technologies Inc     USA     NYSE      USD  Common Stock  US5024311095
25904    LIN                                    Linde PLC     USA     NYSE      USD  Common Stock  IE000S9YS762
26011    LKQ                              LKQ Corporation     USA   NASDAQ      USD  Common Stock  US5018892084
26044    LLY                        Eli Lilly and Company     USA     NYSE      USD  Common Stock  US5324571083
26152    LMT                  Lockheed Martin Corporation     USA     NYSE      USD  Common Stock  US5398301094
26200    LNT                          Alliant Energy Corp     USA   NASDAQ      USD  Common Stock  US0188021085
26280    LOW                         Lowe's Companies Inc     USA     NYSE      USD  Common Stock  US5486611073
26399   LRCX                            Lam Research Corp     USA   NASDAQ      USD  Common Stock  US5128071082
26714   LUMN                       Lumen Technologies Inc     USA     NYSE      USD  Common Stock  US5502411037
26780    LVS                         Las Vegas Sands Corp     USA     NYSE      USD  Common Stock  US5178341070
26789     LW                     Lamb Weston Holdings Inc     USA     NYSE      USD  Common Stock  US5132721045
26823    LYB                 LyondellBasell Industries NV     USA     NYSE      USD  Common Stock  NL0009434992
26877      M                                   Macy’s Inc     USA     NYSE      USD  Common Stock  US55616P1049
26878     MA                               Mastercard Inc     USA     NYSE      USD  Common Stock  US57636Q1040
27044    MAR                   Marriott International Inc     USA   NASDAQ      USD  Common Stock  US5719032022
27064    MAS                            Masco Corporation     USA     NYSE      USD  Common Stock  US5745991068
27272    MCD                       McDonald’s Corporation     USA     NYSE      USD  Common Stock  US5801351017
27314   MCHP                     Microchip Technology Inc     USA   NASDAQ      USD  Common Stock  US5950171042
27356    MCO                           Moodys Corporation     USA     NYSE      USD  Common Stock  US6153691059
27515   MDLZ                   Mondelez International Inc     USA   NASDAQ      USD  Common Stock  US6092071058
27565    MDT                                Medtronic PLC     USA     NYSE      USD  Common Stock  IE00BTN1Y115
27722    MET                                  MetLife Inc     USA     NYSE      USD  Common Stock  US59156R1086
28102    MHK                        Mohawk Industries Inc     USA     NYSE      USD  Common Stock  US6081901042
28365    MKC             McCormick & Company Incorporated     USA     NYSE      USD  Common Stock  US5797802064
28409   MKTX                     MarketAxess Holdings Inc     USA   NASDAQ      USD  Common Stock  US57060D1081
28476    MLM                Martin Marietta Materials Inc     USA     NYSE      USD  Common Stock  US5732841060
28594    MMC               Marsh & McLennan Companies Inc     USA     NYSE      USD  Common Stock  US5717481023
28693    MMM                                   3M Company     USA     NYSE      USD  Common Stock  US88579Y1010
28937   MNST                        Monster Beverage Corp     USA   NASDAQ      USD  Common Stock  US61174X1090
28971     MO                                 Altria Group     USA     NYSE      USD  Common Stock  US02209S1033
29057    MOS                           The Mosaic Company     USA     NYSE      USD  Common Stock  US61945C1036
29323    MRK                          Merck & Company Inc     USA     NYSE      USD  Common Stock  US58933Y1055
29416     MS                               Morgan Stanley     USA     NYSE      USD  Common Stock  US6174464486
29467   MSCI                                     MSCI Inc     USA     NYSE      USD  Common Stock  US55354G1004
29525   MSFT                        Microsoft Corporation     USA   NASDAQ      USD  Common Stock  US5949181045
29710    MTB                                M&T Bank Corp     USA     NYSE      USD  Common Stock  US55261F1049
29729    MTD             Mettler-Toledo International Inc     USA     NYSE      USD  Common Stock  US5926881054
29835     MU                        Micron Technology Inc     USA   NASDAQ      USD  Common Stock  US5951121038
30457   NCLH           Norwegian Cruise Line Holdings Ltd     USA     NYSE      USD  Common Stock  BMG667211046
30501   NDAQ                                   Nasdaq Inc     USA   NASDAQ      USD  Common Stock  US6311031081
30563    NEE                           Nextera Energy Inc     USA     NYSE      USD  Common Stock  US65339F1012
30694   NFLX                                  Netflix Inc     USA   NASDAQ      USD  Common Stock  US64110L1061
30793     NI                                 NiSource Inc     USA     NYSE      USD  Common Stock  US65473P1057
31121    NOV                                     NOV Inc.     USA     NYSE      USD  Common Stock  US62955J1034
31132    NOW                               ServiceNow Inc     USA     NYSE      USD  Common Stock  US81762P1021
31307    NSC                 Norfolk Southern Corporation     USA     NYSE      USD  Common Stock  US6558441084
31466   NTRS                   Northern Trust Corporation     USA   NASDAQ      USD  Common Stock  US6658591044
31491    NUE                                   Nucor Corp     USA     NYSE      USD  Common Stock  US6703461052
31564   NVDA                           NVIDIA Corporation     USA   NASDAQ      USD  Common Stock  US67066G1040
31600    NVR                                      NVR Inc     USA     NYSE      USD  Common Stock  US62944T1051
31725    NWL                            Newell Brands Inc     USA   NASDAQ      USD  Common Stock  US6512291062
31758    NWS                                  News Corp B     USA   NASDAQ      USD  Common Stock  US65249B2088
31759   NWSA                                  News Corp A     USA   NASDAQ      USD  Common Stock  US65249B1098
32070   ODFL                Old Dominion Freight Line Inc     USA   NASDAQ      USD  Common Stock  US6795801009
32082    ODP                                     ODP Corp     USA   NASDAQ      USD  Common Stock  US88337F1057
32309    OMC                            Omnicom Group Inc     USA     NYSE      USD  Common Stock  US6819191064
32517    ORI              Old Republic International Corp     USA     NYSE      USD  Common Stock  US6802231042
32527   ORLY                      O’Reilly Automotive Inc     USA   NASDAQ      USD  Common Stock  US67103H1077
32746    OXY             Occidental Petroleum Corporation     USA     NYSE      USD  Common Stock  US6745991058
32981   PANW                       Palo Alto Networks Inc     USA   NASDAQ      USD  Common Stock  US6974351057
33075   PAYX                                  Paychex Inc     USA   NASDAQ      USD  Common Stock  US7043261079
33206   PCAR                                   PACCAR Inc     USA   NASDAQ      USD  Common Stock  US6937181088
33588   PEAK                    Healthpeak Properties Inc     USA     NYSE      USD  Common Stock  US42250P1030
33682    PEP                                  PepsiCo Inc     USA   NASDAQ      USD  Common Stock  US7134481081
33750    PFE                                   Pfizer Inc     USA     NYSE      USD  Common Stock  US7170811035
33767    PFG                Principal Financial Group Inc     USA   NASDAQ      USD  Common Stock  US74251V1026
33876     PG                     Procter & Gamble Company     USA     NYSE      USD  Common Stock  US7427181091
34000    PGR                             Progressive Corp     USA     NYSE      USD  Common Stock  US7433151039
34060     PH                  Parker-Hannifin Corporation     USA     NYSE      USD  Common Stock  US7010941042
34126    PHM                               PulteGroup Inc     USA     NYSE      USD  Common Stock  US7458671010
34476    PLD                                 Prologis Inc     USA     NYSE      USD  Common Stock  US74340W1036
34639     PM              Philip Morris International Inc     USA     NYSE      USD  Common Stock  US7181721090
34814    PNC             PNC Financial Services Group Inc     USA     NYSE      USD  Common Stock  US6934751057
34879    PNW                   Pinnacle West Capital Corp     USA     NYSE      USD  Common Stock  US7234841010
34951   POOL                             Pool Corporation     USA   NASDAQ      USD  Common Stock  US73278L1052
35015    PPG                           PPG Industries Inc     USA     NYSE      USD  Common Stock  US6935061076
35033    PPL                              PPL Corporation     USA     NYSE      USD  Common Stock  US69351T1060
35232   PRGO                          Perrigo Company PLC     USA     NYSE      USD  Common Stock  IE00BGH1M568
35479    PSA                               Public Storage     USA     NYSE      USD  Common Stock  US74460D1090
35992    PVH                                     PVH Corp     USA     NYSE      USD  Common Stock  US6936561009
36056    PWR                          Quanta Services Inc     USA     NYSE      USD  Common Stock  US74762E1029
36190   PYPL                          PayPal Holdings Inc     USA   NASDAQ      USD  Common Stock  US70450Y1038
36295   QCOM                        Qualcomm Incorporated     USA   NASDAQ      USD  Common Stock  US7475251036
36528   QRVO                                    Qorvo Inc     USA   NASDAQ      USD  Common Stock  US74736K1016
36919    RCL                  Royal Caribbean Cruises Ltd     USA     NYSE      USD  Common Stock  LR0008862868
37109   REGN                Regeneron Pharmaceuticals Inc     USA   NASDAQ      USD  Common Stock  US75886F1075
37621    RJF                 Raymond James Financial Inc.     USA     NYSE      USD  Common Stock  US7547301090
37649     RL                    Ralph Lauren Corp Class A     USA     NYSE      USD  Common Stock  US7512121010
37740    RMD                                   ResMed Inc     USA     NYSE      USD  Common Stock  US7611521078
37937    ROK                      Rockwell Automation Inc     USA     NYSE      USD  Common Stock  US7739031091
37941    ROL                                  Rollins Inc     USA     NYSE      USD  Common Stock  US7757111049
37949    ROP         Roper Technologies Inc. Common Stock     USA   NASDAQ      USD  Common Stock  US7766961061
37961   ROST                              Ross Stores Inc     USA   NASDAQ      USD  Common Stock  US7782961038
38180    RSG                        Republic Services Inc     USA     NYSE      USD  Common Stock  US7607591002
38335    RTX                   Raytheon Technologies Corp     USA     NYSE      USD  Common Stock  US75513E1010
39030   SBUX                        Starbucks Corporation     USA   NASDAQ      USD  Common Stock  US8552441094
39132   SCHW                          Charles Schwab Corp     USA     NYSE      USD  Common Stock  US8085131055
39422    SEE                       Sealed Air Corporation     USA     NYSE      USD  Common Stock  US81211K1007
39940    SHW                          Sherwin-Williams Co     USA     NYSE      USD  Common Stock  US8243481061
40128    SJM                           JM Smucker Company     USA     NYSE      USD  Common Stock  US8326964058
40243    SLG                         SL Green Realty Corp     USA     NYSE      USD  Common Stock  US78440X8873
40572    SNA                                  Snap-On Inc     USA     NYSE      USD  Common Stock  US8330341012
40668   SNPS                                 Synopsys Inc     USA   NASDAQ      USD  Common Stock  US8716071076
40732     SO                             Southern Company     USA     NYSE      USD  Common Stock  US8425871071
40898    SPG                     Simon Property Group Inc     USA     NYSE      USD  Common Stock  US8288061091
40902   SPGI                               S&P Global Inc     USA     NYSE      USD  Common Stock  US78409V1044
41139    SRE                                Sempra Energy     USA     NYSE      USD  Common Stock  US8168511090
41518    STE                                   STERIS plc     USA     NYSE      USD  Common Stock  IE00BFY8C754
41964    SWK                   Stanley Black & Decker Inc     USA     NYSE      USD  Common Stock  US8545021011
41968   SWKS                       Skyworks Solutions Inc     USA   NASDAQ      USD  Common Stock  US83088M1027
42066    SYF                          Synchrony Financial     USA     NYSE      USD  Common Stock  US87165B1035
42077    SYK                          Stryker Corporation     USA     NYSE      USD  Common Stock  US8636671013
42139      T                                     AT&T Inc     USA     NYSE      USD  Common Stock  US00206R1023
42239    TAP              Molson Coors Brewing Co Class B     USA     NYSE      USD  Common Stock  US60871R2094
42577    TDG                 Transdigm Group Incorporated     USA     NYSE      USD  Common Stock  US8936411003
42725    TER                                 Teradyne Inc     USA   NASDAQ      USD  Common Stock  US8807701029
42771    TFC                        Truist Financial Corp     USA     NYSE      USD  Common Stock  US89832Q1094
42836    TFX                        Teleflex Incorporated     USA     NYSE      USD  Common Stock  US8793691069
42943    TGT                           Target Corporation     USA     NYSE      USD  Common Stock  US87612E1064
43273    TJX                        The TJX Companies Inc     USA     NYSE      USD  Common Stock  US8725401090
43530    TMO                 Thermo Fisher Scientific Inc     USA     NYSE      USD  Common Stock  US8835561023
43557   TMUS                              T-Mobile US Inc     USA   NASDAQ      USD  Common Stock  US8725901040
44030   TROW                      T. Rowe Price Group Inc     USA   NASDAQ      USD  Common Stock  US74144T1088
44139    TRV                  The Travelers Companies Inc     USA     NYSE      USD  Common Stock  US89417E1091
44302   TSLA                                    Tesla Inc     USA   NASDAQ      USD  Common Stock  US88160R1014
44409     TT                       Trane Technologies plc     USA     NYSE      USD  Common Stock  IE00BK9ZQ967
44490   TTWO            Take-Two Interactive Software Inc     USA   NASDAQ      USD  Common Stock  US8740541094
44674    TXN               Texas Instruments Incorporated     USA   NASDAQ      USD  Common Stock  US8825081040
44683    TXT                                  Textron Inc     USA     NYSE      USD  Common Stock  US8832031012
44725     UA                           Under Armour Inc C     USA     NYSE      USD  Common Stock  US9043112062
44726    UAA                           Under Armour Inc A     USA     NYSE      USD  Common Stock  US9043111072
44731    UAL                 United Airlines Holdings Inc     USA   NASDAQ      USD  Common Stock  US9100471096
45004    UNH              UnitedHealth Group Incorporated     USA     NYSE      USD  Common Stock  US91324P1021
45016    UNM                                   Unum Group     USA     NYSE      USD  Common Stock  US91529Y1064
45076    UPS                    United Parcel Service Inc     USA     NYSE      USD  Common Stock  US9113121068
45105    URI                           United Rentals Inc     USA     NYSE      USD  Common Stock  US9113631090
45143    USB                                 U.S. Bancorp     USA     NYSE      USD  Common Stock  US9029733048
45374      V                            Visa Inc. Class A     USA     NYSE      USD  Common Stock  US92826C8394
45603   VEEV                    Veeva Systems Inc Class A     USA     NYSE      USD  Common Stock  US9224751084
45683    VFC                               VF Corporation     USA     NYSE      USD  Common Stock  US9182041080
46017    VLO                    Valero Energy Corporation     USA     NYSE      USD  Common Stock  US91913Y1001
46145    VNO                         Vornado Realty Trust     USA     NYSE      USD  Common Stock  US9290421091
46163    VNT                                 Vontier Corp     USA     NYSE      USD  Common Stock  US9288811014
46320   VRSK                         Verisk Analytics Inc     USA   NASDAQ      USD  Common Stock  US92345Y1064
46321   VRSN                                 VeriSign Inc     USA   NASDAQ      USD  Common Stock  US92343E1029
46337   VRTX                   Vertex Pharmaceuticals Inc     USA   NASDAQ      USD  Common Stock  US92532F1003
46539    VTR                                   Ventas Inc     USA     NYSE      USD  Common Stock  US92276F1003
46819     VZ                   Verizon Communications Inc     USA     NYSE      USD  Common Stock  US92343V1044
46830    WAB     Westinghouse Air Brake Technologies Corp     USA     NYSE      USD  Common Stock  US9297401088
46910    WAT                           Waters Corporation     USA     NYSE      USD  Common Stock  US9418481035
47118    WEC                         WEC Energy Group Inc     USA     NYSE      USD  Common Stock  US92939U1060
47154   WELL                                Welltower Inc     USA     NYSE      USD  Common Stock  US95040Q1040
47351    WHR                        Whirlpool Corporation     USA     NYSE      USD  Common Stock  US9633201069
47500     WM                         Waste Management Inc     USA     NYSE      USD  Common Stock  US94106L1098
47501    WMB                       Williams Companies Inc     USA     NYSE      USD  Common Stock  US9694571004
47553    WMT                                  Walmart Inc     USA     NYSE      USD  Common Stock  US9311421039
47670    WRK                                  WestRock Co     USA     NYSE      USD  Common Stock  US96145D1054
47742    WST             West Pharmaceutical Services Inc     USA     NYSE      USD  Common Stock  US9553061055
47838     WU                             Western Union Co     USA     NYSE      USD  Common Stock  US9598021098
47907     WY                         Weyerhaeuser Company     USA     NYSE      USD  Common Stock  US9621661043
48077    XEL                              Xcel Energy Inc     USA   NASDAQ      USD  Common Stock  US98389B1008
48387    XOM                             Exxon Mobil Corp     USA     NYSE      USD  Common Stock  US30231G1022
48486    XRX                                   Xerox Corp     USA   NASDAQ      USD  Common Stock  US98421M1062
48583    XYL                                    Xylem Inc     USA     NYSE      USD  Common Stock  US98419M1009
48717    YUM                              Yum! Brands Inc     USA     NYSE      USD  Common Stock  US9884981013
48743    ZBH                   Zimmer Biomet Holdings Inc     USA     NYSE      USD  Common Stock  US98956P1021
48897    ZTS                                   Zoetis Inc     USA     NYSE      USD  Common Stock  US98978V1035

Here’s how the code works

Prerequisites

Before we dive in, make sure you have the following installed:

  • Python 3.x
  • Pandas
  • Requests

You’ll also need an API token from EODHD, which you can get by signing up on their website.

Importing Necessary Modules

Let’s start by importing the necessary modules. We’ll need logging for logging, pandas for data manipulation, requests for making HTTP requests, io for handling CSV data streams, and pathlib for file path manipulation.

import logging
import pandas as pd
import requests
import io
from pathlib import Path

Setting Up Logging and Pandas Configuration

We’ll set up logging to the INFO level and configure Pandas to display all columns for our dataframes.

logging.basicConfig(level=logging.INFO)
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)

Fetch and Save Exchanges

The first function fetch_and_save_exchanges fetches the list of all stock market exchanges and saves them to an HDF5 file. You can also opt to read this data from a local file if it exists.

def fetch_and_save_exchanges(api_token, data_dir, local_only=False):
    # ... (code)

This function takes an API token and a directory path as its arguments and also has an optional argument local_only to specify whether to use local data if available.

Here’s how to call it:

df = fetch_and_save_exchanges(api_token, data_dir)
logging.info(df.head())
logging.info(df.tail())

Fetch and Save Symbols

The second function fetch_and_save_symbols fetches stock symbols for each exchange and saves them to an HDF5 file.

def fetch_and_save_symbols(api_token, data_dir, exchange_codes):
    # ... (code)

This function iterates over each exchange code and fetches its symbols. Here’s how to call it:

# Uncomment the following line when you want to fetch and save symbols
# fetch_and_save_symbols(api_token, data_dir, exchange_codes)

Inspecting Your Local Database

The final function, print_h5_file_info, allows you to inspect what’s inside your HDF5 file. It lists all the market codes (keys) and can export a specific one to a CSV file.

def print_h5_file_info(h5_file_path, market_code=None, csv_export=False):
    # ... (code)

To check the content of your HDF5 file, simply call:

print_h5_file_info(data_dir + 'symbols.h5', 'US')

The entire code is below

You will need to create a python file key.py to store your api_token variable.

from keys import api_token

data_dir = '/home/shared/algos/data/'

import logging
import pandas as pd
import requests
import io
from pathlib import Path

logging.basicConfig(level=logging.INFO)
# Configure Pandas to display all columns
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)

def fetch_and_save_exchanges(api_token, data_dir, local_only=False):
    h5_file_path = Path(data_dir, "exchanges.h5")

    if local_only:
        if h5_file_path.exists():
            df = pd.read_hdf(h5_file_path, key="exchanges")
            print(f"Loaded data from {h5_file_path}")
        else:
            print("Local file does not exist. You might want to fetch it first.")
            return None
    else:
        url = "https://eodhd.com/api/exchanges-list/?api_token=" + api_token
        response = requests.get(url)
        if response.status_code == 200:
            json_data = response.json()
            df = pd.DataFrame(json_data)
            df.to_hdf(h5_file_path, key="exchanges", mode="w")
            print(f"Saved data to {h5_file_path}")
        else:
            print(f"Failed to fetch data. HTTP Status Code: {response.status_code}")
            return None

    return df


df = fetch_and_save_exchanges(api_token, data_dir)
logging.info(df.head())
logging.info(df.tail())

exchange_codes = df['Code'].tolist()


def fetch_and_save_symbols(api_token, data_dir, exchange_codes):
    h5_file_path = Path(data_dir, "symbols.h5")

    for exchange_code in exchange_codes:
        url = f"https://eodhistoricaldata.com/api/exchanges/{exchange_code}?api_token={api_token}"
        response = requests.get(url)

        if response.status_code == 200:
            csv_data = response.text
            csv_stream = io.StringIO(csv_data)

            # Create DataFrame from CSV stream
            df = pd.read_csv(csv_stream)

            # Append DataFrame to H5 file under the key `exchange_code`
            df.to_hdf(h5_file_path, key=f"{exchange_code}", mode='a')
            print(f"Saved data for {exchange_code} to {h5_file_path}")
        else:
            print(f"Failed to fetch data for {exchange_code}. HTTP Status Code: {response.status_code}")


# fetch_and_save_symbols(api_token, data_dir, exchange_codes)


def print_h5_file_info(h5_file_path, market_code=None, csv_export=False):

    h5_file_path = Path(h5_file_path)

    # Check if the file exists
    if not h5_file_path.exists():
        logging.info(f"The file {h5_file_path} does not exist.")
        return

    with pd.HDFStore(h5_file_path, 'r') as store:
        # Log available keys (symbols)
        logging.info("Available market codes in the HDF5 file:")
        for key in store.keys():
            logging.info(key.lstrip('/'))  # Remove the leading '/' from the key name

        # If a market_code is specified, try to read its data
        if market_code:
            if f'/{market_code}' in store.keys():
                # Read the DataFrame for the specific market code
                df = pd.read_hdf(h5_file_path, key=market_code)

                # Log the DataFrame
                logging.info(f"\nData for market code {market_code}:")
                logging.info(df)

                # If csv_export is True, save the DataFrame to a CSV file
                if csv_export:
                    csv_file_path = h5_file_path.parent / f"{market_code}.csv"
                    df.to_csv(csv_file_path, index=False)
                    logging.info(f"DataFrame has been saved to {csv_file_path}")

            else:
                logging.info(f"\nThe market code {market_code} does not exist in the HDF5 file.")

print_h5_file_info(data_dir + 'symbols.h5', 'US')