This repository shows how to
- fetch real-time trade data (aka raw data) from the Coinbase WebSocket API
- transform trade data into OHLC data (aka features) in real-time using Bytewax, and
- store these features in a serverless Feature Store like Hopsworks.
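To make the second step concrete, here is a minimal plain-Python sketch of how trades can be grouped into fixed-size time windows and reduced to OHLC candles. It is an illustration only: it processes a finished list of trades rather than a live stream, it does not use the Bytewax API, and the names `Trade`, `Candle`, `trades_to_ohlc` and the 60-second window are assumptions made for this example, not code from this repository.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Trade:
    """Hypothetical raw trade record (field names are assumptions)."""
    timestamp: float  # seconds since epoch
    price: float
    size: float


@dataclass
class Candle:
    """One OHLC feature row for a single time window."""
    window_start: float
    open: float
    high: float
    low: float
    close: float
    volume: float


def trades_to_ohlc(trades: Iterable[Trade], window_seconds: int = 60) -> List[Candle]:
    """Group trades into tumbling windows and compute one candle per window."""
    candles: Dict[float, Candle] = {}
    # Sort by time so that open is the first price and close the last one
    for trade in sorted(trades, key=lambda t: t.timestamp):
        # Align the trade to the start of the window it falls into
        window_start = trade.timestamp - (trade.timestamp % window_seconds)
        candle = candles.get(window_start)
        if candle is None:
            # First trade of the window sets open, high, low and close
            candles[window_start] = Candle(
                window_start, trade.price, trade.price,
                trade.price, trade.price, trade.size,
            )
        else:
            candle.high = max(candle.high, trade.price)
            candle.low = min(candle.low, trade.price)
            candle.close = trade.price
            candle.volume += trade.size
    return [candles[start] for start in sorted(candles)]


example_trades = [
    Trade(0, 100.0, 1.0),
    Trade(30, 105.0, 0.5),
    Trade(59, 95.0, 2.0),
    Trade(60, 96.0, 1.0),
]
example_candles = trades_to_ohlc(example_trades, window_seconds=60)
```

In a streaming setting the same per-window update would be applied incrementally as each trade arrives, which is the part a stream processor like Bytewax takes care of.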
This repository is a natural continuation of this previous project, where we built a Streamlit app with real-time feature engineering but without state persistence: every time the Streamlit app reloaded, we lost all features generated up to that point.
In this project we add state to our system through a Feature Store. We use Hopsworks because
- it is serverless, so we do not need to handle infrastructure
- it has a very generous free tier, with up to 25GB of free storage.
- Create a Python virtual environment with the project dependencies:
  $ make init
- Set your Hopsworks project name and API key as environment variables by running the following script. To generate these values, head to hopsworks.ai, create a free account, create a project, and generate an API key (all of this is free):
  $ . ./set_environment_variables.sh
- To run the feature pipeline locally:
  $ make run
- To deploy the feature pipeline on an AWS EC2 instance, you first need an AWS account and the aws-cli tool installed on your local system. Then run:
  $ make deploy
- Feature pipeline logs are sent to AWS CloudWatch. Run the following command to get the URL where you can see the logs:
  $ make list
- To shut down the feature pipeline on AWS and free its resources, run:
  $ make delete
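Since the pipeline depends on the Hopsworks project name and API key being set as environment variables, it can help to fail early with a clear message when they are missing. The sketch below shows one way to do that in Python. The variable names `HOPSWORKS_PROJECT_NAME` and `HOPSWORKS_API_KEY` and the helper `load_hopsworks_settings` are hypothetical; check set_environment_variables.sh for the names this project actually uses.

```python
import os
from typing import Mapping, Tuple

# Hypothetical variable names, assumed for this example only
PROJECT_NAME_VAR = "HOPSWORKS_PROJECT_NAME"
API_KEY_VAR = "HOPSWORKS_API_KEY"


def load_hopsworks_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (project_name, api_key), or raise if either one is missing or empty."""
    missing = [name for name in (PROJECT_NAME_VAR, API_KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
            + ". Did you source set_environment_variables.sh?"
        )
    return env[PROJECT_NAME_VAR], env[API_KEY_VAR]


# Example with an explicit mapping instead of the real process environment
example_settings = load_hopsworks_settings(
    {PROJECT_NAME_VAR: "my_project", API_KEY_VAR: "dummy-key"}
)
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.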
ℹ️ Implementation details
Check out the Real-World ML Program, a hands-on, 3-hour course where you will learn how to design, build, deploy, and monitor complete ML products.
