Bitcoin Price Prediction using RAPIDS: cuDF, DLPack and Keras on GPU
Disclaimer: This blog is for learning purposes only. It demonstrates how to use the NVIDIA stack for time-series data and is not intended for building real-world applications or as investment advice.
The popularity of cryptocurrencies has skyrocketed, with eminent entrepreneurs like Elon Musk backing them openly.
Cryptocurrencies are more volatile and unpredictable than stocks due to factors like technological progress, internal competition, pressure on the markets to deliver, lack of indexes, economic problems, security issues and political factors, but we can still try our luck with Deep Learning techniques. In addition, we will use cuDF, NVIDIA’s GPU DataFrame library with an API very similar to Pandas, and run Deep Learning with Keras/TensorFlow on GPUs.
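To give a feel for how closely cuDF mirrors Pandas, here is a minimal sketch with toy data (not from this post’s dataset); the same calls work unchanged on a pandas DataFrame:
import cudf

gdf = cudf.DataFrame({'price': [10.0, 12.0, 11.0]})  # DataFrame lives in GPU memory
print(gdf['price'].mean())                           # familiar pandas-style API, computed on the GPU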
In this blog, I will go through a three-step process to predict cryptocurrency prices:
- Data extraction using REST APIs
- Data preparation and pre-processing using cuDF
- Price prediction with an LSTM neural network using Keras/TensorFlow on GPU
Data extraction
The dataset can be downloaded from the CryptoCompare website, which can be found here.
The dataset contains the following useful features:
- close — the market close price of the currency for that day.
- high — the highest price of the currency for that day.
- low — the lowest price of the currency for that day.
- open — the market open price of the currency for that day.
- volumefrom — the volume traded that day, denominated in the base currency.
- volumeto — the volume traded that day, denominated in the quote currency.
Code Explanation
The notebook on GitHub can be found here.
In Google Colab with a GPU runtime, I started by installing RAPIDS and importing all the relevant libraries.
# Install RAPIDS
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh stable

import sys, os, shutil

sys.path.append('/usr/local/lib/python3.7/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ["CONDA_PREFIX"] = "/usr/local"

# make the RAPIDS shared libraries visible to the dynamic linker
for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial']:
    fn = 'lib' + so + '.so'
    source_fn = '/usr/local/lib/' + fn
    dest_fn = '/usr/lib/' + fn
    if os.path.exists(source_fn):
        print(f'Copying {source_fn} to {dest_fn}')
        shutil.copyfile(source_fn, dest_fn)

# fix for BlazingSQL import issue
# ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /usr/local/lib/python3.7/site-packages/../../libblazingsql-engine.so)
if not os.path.exists('/usr/lib64'):
    os.makedirs('/usr/lib64')
for so_file in os.listdir('/usr/local/lib'):
    if 'libstdc' in so_file:
        shutil.copyfile('/usr/local/lib/' + so_file, '/usr/lib64/' + so_file)
        shutil.copyfile('/usr/local/lib/' + so_file, '/usr/lib/x86_64-linux-gnu/' + so_file)

import cudf
import cupy as cp
import json
import requests
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense, Activation
from sklearn.metrics import mean_absolute_error  # assumed source of mean_absolute_error used later
I used the Canadian dollar exchange rate and stored the real-time data in a cuDF DataFrame. I used the to_datetime() method to convert the index into datetime objects; this is necessary because the API returns the timestamps as Unix epoch seconds.
endpoint = 'https://min-api.cryptocompare.com/data/histoday'
res = requests.get(endpoint + '?fsym=BTC&tsym=CAD&limit=2000')
hist = cudf.DataFrame(json.loads(res.content)['Data'])
hist = hist.set_index('time')
hist.index = cudf.to_datetime(hist.index, unit='s')
target_col = 'close'
Let’s see what the dataset looks like, with all the trading features like price, volume, open, high and low.
hist.head(5)
Next, I split the data into two sets, a training set and a test set, with 80% and 20% of the data respectively. This decision is just for the purposes of this tutorial. In real projects, you should always split your data into training, validation and test sets (e.g. 60%, 20%, 20%).
def train_test_split(df, test_size=0.2):
    split_row = len(df) - int(test_size * len(df))
    train_data = df.iloc[:split_row]
    test_data = df.iloc[split_row:]
    return train_data, test_data

train, test = train_test_split(hist, test_size=0.2)
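As a side note, here is a minimal sketch of the three-way split recommended above (a hypothetical helper, not used in the rest of this post):
def train_val_test_split(df, val_size=0.2, test_size=0.2):
    n = len(df)
    test_row = n - int(test_size * n)       # last test_size fraction -> test set
    val_row = test_row - int(val_size * n)  # fraction before that -> validation set
    return df.iloc[:val_row], df.iloc[val_row:test_row], df.iloc[test_row:]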
Now let’s plot the cryptocurrency prices in Canadian dollars as a function of time using the below code:
def line_plot(line1, line2, label1=None, label2=None, title='', lw=2):
    fig, ax = plt.subplots(1, figsize=(13, 7))
    ax.plot(line1, label=label1, linewidth=lw)
    ax.plot(line2, label=label2, linewidth=lw)
    ax.set_ylabel('price [CAD]', fontsize=14)
    ax.set_title(title, fontsize=16)
    ax.legend(loc='best', fontsize=16)

# convert the cuDF Series to pandas so matplotlib can plot them
line_plot(train[target_col].to_pandas(), test[target_col].to_pandas(), 'training', 'test', title='')
We can see that there is no seasonal pattern as such; with crypto, it is hard to generalize anything.
Next, I made a couple of functions to normalize the values. Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to change the values of numeric columns in the dataset to a common scale, without distorting differences in the ranges of values.
def normalise_zero_base(df):
    return cudf.DataFrame(df.values / df.values[0] - 1)

def normalise_min_max(df):
    return (df - df.min()) / (df.max() - df.min())
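To make the zero-base idea concrete, a toy example with hypothetical prices: every value is expressed as its change relative to the first row.
df = cudf.DataFrame({'close': [100.0, 110.0, 90.0]})
print(normalise_zero_base(df))
# 100/100 - 1 = 0.0, 110/100 - 1 = 0.1, 90/100 - 1 = -0.1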
Next, I made a function to extract sliding windows of data, each window_len rows long (5 in our case), as shown in the code below:
def extract_window_data(df, window_len=5, zero_base=True):
    window_data = []
    for idx in range(len(df) - window_len):
        tmp = df[idx: (idx + window_len)].copy()
        if zero_base:
            tmp = normalise_zero_base(tmp)
        window_data.append(tmp.values)
    # stack into a single CuPy array so the data stays on the GPU
    # (CuPy arrays expose the toDlpack() method used later)
    return cp.stack(window_data)
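To see the shapes involved, a quick sketch with a hypothetical two-column frame (zero_base disabled so the raw values are easy to follow):
toy = cudf.DataFrame({'a': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
                      'b': [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]})
windows = extract_window_data(toy, window_len=5, zero_base=False)
print(windows.shape)  # (2, 5, 2): two windows, five rows each, two features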
I continued by making a function that prepares the data in the format to be fed into the neural network later. It applies the same 80/20 train/test split as before, as shown in the code below:
def prepare_data(df, target_col, window_len=10, zero_base=True, test_size=0.2):
    train_data, test_data = train_test_split(df, test_size=test_size)
    X_train = extract_window_data(train_data, window_len, zero_base)
    X_test = extract_window_data(test_data, window_len, zero_base)
    y_train = train_data[target_col][window_len:].values
    y_test = test_data[target_col][window_len:].values
    if zero_base:
        y_train = y_train / train_data[target_col][:-window_len].values - 1
        y_test = y_test / test_data[target_col][:-window_len].values - 1
    return train_data, test_data, X_train, X_test, y_train, y_test
DLPack
DLPack is an open in-memory tensor structure for sharing tensors among frameworks. DLPack enables:
- Easier sharing of operators between deep learning frameworks.
- Easier wrapping of vendor-level operator implementations, allowing collaboration when introducing new devices/ops.
- Quick swapping of backend implementations, like different versions of BLAS.
- For end users, more operators and the possibility of mixing usage between frameworks.
With PyTorch, Chainer and TensorFlow (experimental) we can use DLPack and __cuda_array_interface__ to feed cuDF/CuPy data into Keras without disrupting the GPU workflow.
arr_tf_xt = tf.experimental.dlpack.from_dlpack(X_train.toDlpack())
arr_tf_yt = tf.experimental.dlpack.from_dlpack(y_train.toDlpack())
# the test-set arrays are converted the same way; these tensors are used
# for prediction and evaluation later
arr_tf_xtest = tf.experimental.dlpack.from_dlpack(X_test.toDlpack())
arr_tf_ytest = tf.experimental.dlpack.from_dlpack(y_test.toDlpack())
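For intuition, here is a minimal self-contained sketch of the round trip between CuPy (which backs cuDF’s .values) and TensorFlow 2.x; it assumes a recent CuPy where from_dlpack is available (older versions spell it fromDlpack):
arr = cp.asarray([1.0, 2.0, 3.0])                           # CuPy array in GPU memory
t = tf.experimental.dlpack.from_dlpack(arr.toDlpack())      # CuPy -> TF, no host copy
back = cp.from_dlpack(tf.experimental.dlpack.to_dlpack(t))  # TF -> CuPy, still on GPU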
Now let’s build the model. A Sequential model is used for stacking all the layers (input, hidden and output). The neural network comprises an LSTM layer followed by a 20% Dropout layer and a Dense layer with a linear activation function. I compiled the model using Adam as the optimizer and Mean Squared Error as the loss function.
def build_lstm_model(input_data, output_size, neurons=100, activ_func='linear',
                     dropout=0.2, loss='mse', optimizer='adam'):
    model = Sequential()
    model.add(LSTM(neurons, input_shape=(input_data.shape[1], input_data.shape[2])))
    model.add(Dropout(dropout))
    model.add(Dense(units=output_size))
    model.add(Activation(activ_func))
    model.compile(loss=loss, optimizer=optimizer)
    return model
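As an optional sanity check, a small sketch that builds the model on dummy-shaped data and prints its layers; the feature count of 6 is a hypothetical placeholder, not taken from the dataset above:
# hypothetical shapes: 1 sample, window of 5 timesteps, 6 features
demo = build_lstm_model(np.zeros((1, 5, 6)), output_size=1)
demo.summary()  # shows the LSTM -> Dropout -> Dense -> Activation stack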
Next, I set up some of the parameters to be used later. These parameters are the random number seed, window length, test set size, number of neurons in the LSTM layer, epochs, batch size, loss, dropout and optimizer.
np.random.seed(42)
window_len = 5
test_size = 0.2
zero_base = True
lstm_neurons = 100
epochs = 20
batch_size = 32
loss = 'mse'
dropout = 0.2
optimizer = 'adam'
Now let’s train the model using the DLPack tensors created from the inputs X_train and the labels y_train.
train, test, X_train, X_test, y_train, y_test = prepare_data(
    hist, target_col, window_len=window_len, zero_base=zero_base, test_size=test_size)

model = build_lstm_model(
    X_train, output_size=1, neurons=lstm_neurons, dropout=dropout, loss=loss,
    optimizer=optimizer)

# train on the DLPack-converted tensors so the data never leaves the GPU
history = model.fit(
    arr_tf_xt, arr_tf_yt, epochs=epochs, batch_size=batch_size, verbose=1, shuffle=True)
Let’s take a look at a snapshot taken during model training over 20 epochs.
Training of the neural network
I used Mean Absolute Error (MAE) as the evaluation metric.
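As a quick refresher with toy numbers, MAE is the average absolute difference between predictions and targets:
# |1.0 - 1.5| = 0.5 and |2.0 - 1.5| = 0.5, so MAE = (0.5 + 0.5) / 2 = 0.5
print(mean_absolute_error([1.0, 2.0], [1.5, 1.5]))  # 0.5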
The MAE value obtained looks good. Finally, let’s plot the actual and predicted prices using the below code:
targets = test[target_col][window_len:]
preds = model.predict(arr_tf_xtest).squeeze()

mean_absolute_error(preds, arr_tf_ytest)
# 0.028300751124918887

# undo the zero-base normalisation to get prices back in CAD
preds = test[target_col].values[:-window_len] * (preds + 1)
preds = cudf.Series(index=targets.index, data=preds)

# convert to pandas for plotting
targetscd = targets.to_pandas()
predscd = preds.to_pandas()
line_plot(targetscd, predscd, 'actual', 'prediction', lw=3)
However, do not expect this accuracy in a real-world scenario, for several reasons: we did not split out and use a validation set, and we did not account for Bitcoin’s volatility or the overall news and market sentiment around Bitcoin and cryptocurrencies.
Conclusion
In this article, I demonstrated how to predict cryptocurrency prices in real time using an LSTM neural network on the GPU, in particular using RAPIDS cuDF and DLPack.
Feel free to play with the hyper-parameters or try out different neural network architectures for better results.
This example uses a small dataset, hence we did not compare performance; for larger datasets the GPU will be a clear winner, with RAPIDS cuDF delivering double-digit speedups at a lower TCO.
Disclaimer: This blog is for learning purposes only. It demonstrates how to use the NVIDIA stack for time-series data and is not intended for building real-world applications or as investment advice.