If your trading strategy includes a custom indicator or historical data calculations, you will need to warm up your algorithm upon deployment. This can be done by initiating a history request or by loading saved data. In this article I will cover the advantages and disadvantages of both methods and demonstrate how to code them in Python on the QuantConnect platform.
Live History Requests
Live history requests may sound straightforward, but they can be more complex when used in a live trading environment. Important considerations include the data source, the number of stocks subscribed to, and the frequency at which history requests are called.
If you pull your data from a non-brokerage platform, speed and accuracy may be limited compared to a brokerage. However, the data may cover a longer history for certain assets, such as options, than retail brokerages offer. While live history requests from a brokerage can ensure speed and accuracy, brokerages impose limits on both the quantity and frequency of historical requests, and the requests may incur additional costs.
For example, Interactive Brokers has Pacing Violation protocols which prohibit the following:
Making identical historical data requests within 15 seconds
Making six or more historical data requests for the same Contract, Exchange and Tick Type within 2 seconds
Making more than 60 requests within any 10-minute period
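One way to stay inside limits like these is a client-side throttle. Below is a minimal sketch in plain Python (not part of the QuantConnect API; the class name and parameters are my own) that tracks request times in a rolling window and reports how long to wait before the next request is allowed:

```python
import time
from collections import deque

class HistoryRequestThrottle:
    """Rolling-window rate limiter: at most max_requests per window_seconds."""

    def __init__(self, max_requests=60, window_seconds=600.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def seconds_until_allowed(self, now=None):
        """Return 0 if a request may fire now, else the seconds to wait."""
        if now is None:
            now = time.monotonic()
        # Discard requests that have aged out of the rolling window
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            return 0.0
        # Otherwise, wait until the oldest request leaves the window
        return self.window_seconds - (now - self.timestamps[0])

    def record(self, now=None):
        """Call after actually issuing a history request."""
        self.timestamps.append(time.monotonic() if now is None else now)
```

In a live algorithm you would check seconds_until_allowed() before each self.History call and either delay or skip the request if it returns a positive value.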
Additionally, based on your data subscription, there are limits to how many assets you can be subscribed to at one time. A standard Interactive Brokers NYSE subscription only allows 100 assets to be subscribed to without upgrading. While this is typically not an issue for manual trading, it can quickly become an issue when implementing an automated strategy with a universe of stocks.
How to code a live history request on the QuantConnect Platform:
##Ensure that the equity is added during Initialize
def Initialize(self):
    self.equity = self.AddEquity("AAPL", Resolution.Daily)
    self.equity.DataNormalizationMode = DataNormalizationMode.Raw

def LiveHistoryRequest(self):
    ##Pull 20 days of history
    history = self.History(self.equity.Symbol, 20, Resolution.Daily)
    ##Sometimes the history request from QC comes up empty
    if not history.empty:
        ##Resetting the index removes the default symbol index
        history = history.reset_index()
Loading Saved Data
Loading saved data allows you to load all of your required data at once, at any time. This avoids any pacing protocols and ensures that you start with the exact data required for warm-up. QuantConnect allows you to save and store up to 50 MB of data for free and also offers paid storage subscriptions up to 50 GB. The QuantConnect Object Store accepts CSV, Bytes, string, JSON, and XML-formatted objects.
When dealing with only one asset, data can easily be pulled as a CSV and directly uploaded to QuantConnect. However, when uploading time-series data from multiple stocks, it is better to avoid saving each stock's data as an individual object. To save all required data at once, I am using a 3D NumPy array that I will reshape to fit into a pandas DataFrame and convert to a CSV for storage.
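The reshape round trip itself is plain NumPy and pandas, so it can be verified outside QuantConnect. Here is a minimal sketch; the array dimensions (5 stocks x 20 days x 4 fields) are purely illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical warm-up data: 5 stocks x 20 days x 4 fields (e.g. OHLC)
data_3d = np.arange(5 * 20 * 4, dtype=float).reshape(5, 20, 4)

# Flatten the last two axes so the array fits in a 2D DataFrame/CSV
original_shape = data_3d.shape
data_2d = data_3d.reshape(original_shape[0], -1)   # shape (5, 80)
frame = pd.DataFrame(data_2d)
csv_text = frame.to_csv()                          # what would be stored

# Later: rebuild the 3D array using the saved shape
restored = frame.to_numpy().reshape(original_shape)
assert np.array_equal(restored, data_3d)
```

Because the reshape is lossless, the only extra piece of information you need to persist alongside the CSV is the original shape.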
The first time that you save an object you will have to create the file path, which can use a folder named after the project ID or just a filename.
For the next two sections I recommend using the Jupyter Research environment. Instead of 'self', the Research environment uses 'qb' to access the Object Store.
Here’s how:
qb = QuantBook()

def CreateNewObject(data, objectName):
    ##Data is a pandas DataFrame
    data_csv = data.to_csv()
    ##Save with the objectName string
    qb.ObjectStore.Save(objectName, data_csv)
    ##Or save with the project_id to create a folder
    qb.ObjectStore.Save(f"{qb.project_id}/WarmupData", data_csv)

##Double-check that your file is saved
for kvp in qb.ObjectStore:
    print(kvp)
If you are using a 3D NumPy array with multiple stocks, you will need to reshape the array so that it can be saved as a 2D object. I recommend also saving the shape as an object to ensure that the array is unpacked correctly in the live environment.
Converting 3D array into 2D .csv:
def Save3Dobject(org_arr):
    ##Save the original shape to the Object Store
    og_shape = org_arr.shape
    pd_shape = pd.DataFrame(og_shape)
    ##Get the full file path by looking up the file name
    bpath = qb.ObjectStore.GetFilePath('shape_warmup')
    ##Save as .csv to the file path
    pd_shape.to_csv(bpath)
    ##Convert the 3D array to a 2D array
    resh_arr = org_arr.reshape(org_arr.shape[0], -1)
    ##Convert the 2D array to a pandas DataFrame
    pd_arr = pd.DataFrame(resh_arr)
    ##Get the full file path by looking up the file name
    mypath = qb.ObjectStore.GetFilePath('WarmupData')
    ##Save as .csv to the file path
    pd_arr.to_csv(mypath)
To load your saved data during live deployment and use it in your algorithm as a 3D object, you will need to convert it back to a NumPy array and unpack it. This can be done in Initialize or called as a scheduled event.
Here is the full process:
def UnpackData(self, dataName, shapeName):
    ##Get the full file paths by looking up the file names
    mypath = self.ObjectStore.GetFilePath(dataName)
    bpath = self.ObjectStore.GetFilePath(shapeName)
    ##Load the files
    sto_pd_arr = pd.read_csv(mypath)
    sto_pd_shape = pd.read_csv(bpath)
    ##Convert back to NumPy and
    ##delete the index column added during conversion
    new_arr = sto_pd_arr.to_numpy()
    new_arr = np.delete(new_arr, 0, 1)
    lo_shape = sto_pd_shape.to_numpy()
    lo_shape = np.delete(lo_shape, 0, 1)
    ##Reshape to the original 3D shape
    final_arr = new_arr.reshape(new_arr.shape[0],
                                new_arr.shape[1] // lo_shape[2][0],
                                lo_shape[2][0])
    return final_arr
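Before deploying, the whole save/load cycle can be sanity-checked outside QuantConnect with pure pandas and NumPy; in this sketch, temporary local files stand in for the Object Store paths, and the array dimensions are illustrative:

```python
import os
import tempfile
import numpy as np
import pandas as pd

# Hypothetical warm-up data: 3 stocks x 10 days x 4 fields
org_arr = np.random.rand(3, 10, 4)

with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "WarmupData.csv")
    shape_path = os.path.join(tmp, "shape_warmup.csv")

    # Save: flatten to 2D, then store both the data and its shape as CSVs
    pd.DataFrame(org_arr.reshape(org_arr.shape[0], -1)).to_csv(data_path)
    pd.DataFrame(org_arr.shape).to_csv(shape_path)

    # Load: read back and drop the index column that to_csv added
    new_arr = np.delete(pd.read_csv(data_path).to_numpy(), 0, 1)
    lo_shape = np.delete(pd.read_csv(shape_path).to_numpy(), 0, 1)

    # Reshape back to the original 3D layout
    final_arr = new_arr.reshape(new_arr.shape[0],
                                new_arr.shape[1] // lo_shape[2][0],
                                lo_shape[2][0])

assert np.allclose(final_arr, org_arr)
```

If the final assertion passes locally, the same unpacking logic should reproduce your warm-up array in the live environment.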