05-24-2018, 03:34 AM
Pandas is a very powerful Python library that you must master if you want to use Python in algorithmic trading. Pandas provide you with the ability to slice and dice a dataframe in many ways that can make your job very easy. Pandas is written in C. It is very fast. Dataframe is a very important concept. Dataframe is basically a table with many rows and columns. Each column represents an attribute. Each row is a record.
When developing algorithmic trading strategies, you will have to deal with dataframes a lot. Reading a dataframe is very easy with Pandas. I have written the following python code that you can use to read the data csv files into a dataframe. I define a function that reads the csv file. We just specify the currency pair and the timeframe. CSV file should be saved on your hard drive. If you have Meta Trader 4 installed on your computer, you can download the csv files for different currency pairs for different timeframes from its history center:
# Data fetching
def get_data(currency_pair, timeframe):
link='D:/Shared/MarketData/{}{}.csv'.format(currency_pair,\
timeframe)
data1 = pd.read_csv(link, header=None)
data1.columns=['Date', 'Time', 'Open', 'High', 'Low',
'Close', 'Volume']
# We need to merge the data and time columns
# convert that column into datetime object
data1['Datetime'] = pd.to_datetime(data1['Date'] \
+ ' ' + data1['Time'])
#rearrange the columnss with Datetime the first
data1=data1[['Datetime', 'Open', 'High',
'Low', 'Close', 'Volume']]
#set Datetime column as index
data1 = data1.set_index('Datetime')
return(data1)
Above function can read the csv files saved on your hard drive.
df = get_data('GBPUSD', 1440)
df.shape
df.head()
Out[12]:
Open High Low Close Volume
Datetime
2011-05-19 1.6156 1.6241 1.6129 1.6227 12390
2011-05-20 1.6226 1.6303 1.6165 1.6229 11879
2011-05-22 1.6224 1.6228 1.6209 1.6227 663
2011-05-23 1.6226 1.6232 1.6057 1.6072 12852
2011-05-24 1.6073 1.6207 1.6067 1.6177 12635
You can see the format of the dataframe that Python has read. Datetime is the index. While defining the function, we have converted date and time into a datetime column that is picked up by Pandas and used as an index. Below is the command that you can use to check if datetime is the index or not:
isinstance(df.index, pd.DatetimeIndex)
Resampling the Dataframe
With pandas we have the ability to resample the dataframe into different frequencies mostly higher. For example, you have read the GBPUSD 15 minute Open, High, Low, Close OHLC data. We can use pandas to easily resample that dataframe into a 30 minute OHLC data, 60 minute OHLC data, 240 minute, 1440 minute and more. First we need to define a dictionary.
ohlc_dict = {
'Open':'first',
'High':'max',
'Low':'min',
'Close':'last',
'Volume':'sum'
}
Now we can easily resample the dataframe with the following commands:
#resample the intraday OHLC into Daily OHLC
df240Mn=df.resample('240min').agg(ohlc_dict).dropna()
dfDaily=df.resample('1D').agg(ohlc_dict).dropna()
dfDaily.head()
#resample the intraday OHLC into daily OHLC
dfWeekly=df.resample('W-Fri').agg(ohlc_dict).dropna()
dfWeekly.head()
Below we read GBPUSD 15 minute data:
Out[16]:
Open High Low Close Volume
Datetime
2017-11-23 14:00:00 1.33113 1.33147 1.33066 1.33078 716
2017-11-23 14:15:00 1.33080 1.33110 1.33058 1.33085 620
2017-11-23 14:30:00 1.33087 1.33113 1.33070 1.33094 476
2017-11-23 14:45:00 1.33103 1.33149 1.33038 1.33046 273
2017-11-23 15:00:00 1.33045 1.33065 1.32988 1.33042 351
Using the above code we easily converted this GBPUSD 15 minute data into GBPUSD daily data:
Out[19]:
Open High Low Close Volume
Datetime
2017-11-23 1.33113 1.33149 1.32913 1.33045 11292.0
2017-11-24 1.33046 1.33588 1.32777 1.33365 36115.0
2017-11-26 1.33224 1.33314 1.33170 1.33229 1550.0
2017-11-27 1.33228 1.33820 1.33044 1.33199 36031.0
2017-11-28 1.33200 1.33860 1.32199 1.33634 51388.0
Everything is done by pandas internally. More on pandas tomorrow so stay tuned!
When developing algorithmic trading strategies, you will have to deal with dataframes a lot. Reading a dataframe is very easy with Pandas. I have written the following python code that you can use to read the data csv files into a dataframe. I define a function that reads the csv file. We just specify the currency pair and the timeframe. CSV file should be saved on your hard drive. If you have Meta Trader 4 installed on your computer, you can download the csv files for different currency pairs for different timeframes from its history center:
# Data fetching
def get_data(currency_pair, timeframe):
link='D:/Shared/MarketData/{}{}.csv'.format(currency_pair,\
timeframe)
data1 = pd.read_csv(link, header=None)
data1.columns=['Date', 'Time', 'Open', 'High', 'Low',
'Close', 'Volume']
# We need to merge the data and time columns
# convert that column into datetime object
data1['Datetime'] = pd.to_datetime(data1['Date'] \
+ ' ' + data1['Time'])
#rearrange the columnss with Datetime the first
data1=data1[['Datetime', 'Open', 'High',
'Low', 'Close', 'Volume']]
#set Datetime column as index
data1 = data1.set_index('Datetime')
return(data1)
Above function can read the csv files saved on your hard drive.
df = get_data('GBPUSD', 1440)
df.shape
df.head()
Out[12]:
Open High Low Close Volume
Datetime
2011-05-19 1.6156 1.6241 1.6129 1.6227 12390
2011-05-20 1.6226 1.6303 1.6165 1.6229 11879
2011-05-22 1.6224 1.6228 1.6209 1.6227 663
2011-05-23 1.6226 1.6232 1.6057 1.6072 12852
2011-05-24 1.6073 1.6207 1.6067 1.6177 12635
You can see the format of the dataframe that Python has read. Datetime is the index. While defining the function, we have converted date and time into a datetime column that is picked up by Pandas and used as an index. Below is the command that you can use to check if datetime is the index or not:
isinstance(df.index, pd.DatetimeIndex)
Resampling the Dataframe
With pandas we have the ability to resample the dataframe into different frequencies mostly higher. For example, you have read the GBPUSD 15 minute Open, High, Low, Close OHLC data. We can use pandas to easily resample that dataframe into a 30 minute OHLC data, 60 minute OHLC data, 240 minute, 1440 minute and more. First we need to define a dictionary.
ohlc_dict = {
'Open':'first',
'High':'max',
'Low':'min',
'Close':'last',
'Volume':'sum'
}
Now we can easily resample the dataframe with the following commands:
#resample the intraday OHLC into Daily OHLC
df240Mn=df.resample('240min').agg(ohlc_dict).dropna()
dfDaily=df.resample('1D').agg(ohlc_dict).dropna()
dfDaily.head()
#resample the intraday OHLC into daily OHLC
dfWeekly=df.resample('W-Fri').agg(ohlc_dict).dropna()
dfWeekly.head()
Below we read GBPUSD 15 minute data:
Out[16]:
Open High Low Close Volume
Datetime
2017-11-23 14:00:00 1.33113 1.33147 1.33066 1.33078 716
2017-11-23 14:15:00 1.33080 1.33110 1.33058 1.33085 620
2017-11-23 14:30:00 1.33087 1.33113 1.33070 1.33094 476
2017-11-23 14:45:00 1.33103 1.33149 1.33038 1.33046 273
2017-11-23 15:00:00 1.33045 1.33065 1.32988 1.33042 351
Using the above code we easily converted this GBPUSD 15 minute data into GBPUSD daily data:
Out[19]:
Open High Low Close Volume
Datetime
2017-11-23 1.33113 1.33149 1.32913 1.33045 11292.0
2017-11-24 1.33046 1.33588 1.32777 1.33365 36115.0
2017-11-26 1.33224 1.33314 1.33170 1.33229 1550.0
2017-11-27 1.33228 1.33820 1.33044 1.33199 36031.0
2017-11-28 1.33200 1.33860 1.32199 1.33634 51388.0
Everything is done by pandas internally. More on pandas tomorrow so stay tuned!
Subscribe My YouTube Channel:
https://www.youtube.com/channel/UCUE7VPo...F_BCoxFXIw
Join Our Million Dollar Trading Challenge:
https://www.doubledoji.com/million-dolla...challenge/