At times when creating trading strategies using big data you need access to historical economic data. One of the best sources of data is the Economic Research branch of the St. Louis Federal Reserve or FRED. Today I’m going to show you how to pull that data into a dataframe so that you can analyze it using machine learning or AI.
When analyzing historical time frame data in machine learning it needs to be normalized. In this code example I will show how to get S&P data then convert it to a percent of daily increase/decrease as well as a logarithmic daily increase/decrease.
Colab can be found here – https://colab.research.google.com/drive/1Wvh7mRX0PUthBhXw1HD5tJAC8mQr6DYH?usp=sharing
The first part of this code will use yfinance as our datasource.
#we're going to use yfinance as our data source
!pip install yfinance --upgrade --no-cache-dir
import pandas as pd
import numpy as np
import yfinance as yf
Next we’re going to create a dataframe called df and download SPY data from 2000 to it. Finally we’ll print the result of df so you can get an idea of what is inside of it.
#Here we're creating a dataframe for spy data from 2000-current
df = yf.download('spy',
#actions='inline',) #adjust for stock splits and dividends
#print the dataframe to see what lives in it
We’re going to drop all the columns except Adj Close. Then we’ll rename it adj_close. Next we’ll create a column labeled simple_rtn. This is the daily simple return or percent increase/decrease. The next line of code creates a logarithmic increase/decrease. Logarithmic gives equal bearing to the Y axis and can be defined as follows, “A logarithmic price scale uses the percentage of change to plot data points, so, the scale prices are not positioned equidistantly. A linear price scale uses an equal value between price scales providing an equal distance between values.”