Cryptocurrency

GSP603 | Tracking Cryptocurrency Exchange Trades with Google Cloud Platform in Real-Time #3


Blockchain and related technologies such as distributed ledgers and distributed apps are becoming new value drivers and solution priorities in many industries. In this Quest you will gain hands-on experience with distributed ledgers and with exploring blockchain datasets in Google Cloud. The Quest turns the research and solution work of Google’s Allen Day into self-paced labs that you can run and learn from directly. Because the Quest uses advanced SQL in BigQuery, we’ve added a SQL-in-BigQuery refresher lab at the start. The final lab is an advanced challenge-style lab: for some elements you are not given the answer and must work it out for yourself.

Overview
Today’s financial world is complex, and the old technology used for constructing financial data pipelines isn’t keeping up. With multiple financial exchanges operating around the world and global user demand, these data pipelines have to be fast, reliable and scalable.

Currently, an econometric approach—applying models to historical financial data to forecast future trends—doesn’t work for real-time financial predictions. And data that’s old, inaccurate, or drawn from a single source doesn’t give financial institutions dependable data to work with. Building pipelines with Google Cloud Platform (GCP) can solve some of these key challenges. In this post, we’ll describe how to build a pipeline to predict financial trends in microseconds. We’ll walk through how to set up and configure a pipeline for ingesting real-time, time-series data from multiple financial exchanges, and how to design a data model that facilitates querying and graphing at scale.

You’ll find a tutorial below on setting up and deploying the proposed architecture using GCP, particularly these products:

- Cloud Dataflow, a scalable data ingestion system that can handle late data
- Cloud Bigtable, our scalable, low-latency time-series database, which has reached 40 million transactions per second on 3,500 nodes

Bonus: a scalable ML pipeline using TensorFlow Extended (TFX), while not part of this tutorial, is a logical next step.
The tutorial will explain how to establish connections to multiple exchanges, subscribe to their trade feeds, and extract and transform those trades into a flexible format that can be stored in Cloud Bigtable and made available for graphing and analysis.
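The extract-and-transform step can be sketched in plain Python. This is an illustrative sketch, not the lab's actual code: the message fields (`pair`, `price`, `size`, `ts`) and the row-key layout (`exchange#pair#reversed-timestamp`, so newer trades sort first under Bigtable's lexicographic row ordering) are assumptions, and real exchange feeds differ.

```python
import json
from datetime import datetime, timezone

def normalize_trade(raw_message: str, exchange: str) -> dict:
    """Parse a raw trade-feed message and shape it for a time-series store.

    Field names and the row-key layout are illustrative only.
    """
    msg = json.loads(raw_message)
    ts_ms = int(msg["ts"])  # exchange-reported execution time, epoch millis
    # Reverse the timestamp so the newest trades sort first, since
    # Bigtable scans row keys in lexicographic order.
    reversed_ts = 10**13 - ts_ms
    row_key = f"{exchange}#{msg['pair']}#{reversed_ts:013d}"
    return {
        "row_key": row_key,
        "price": float(msg["price"]),
        "volume": float(msg["size"]),
        "exchange_time": datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc),
    }

# Example message, shaped like one an exchange WebSocket feed might deliver
sample = '{"pair": "BTC-USD", "price": "42000.5", "size": "0.25", "ts": "1700000000000"}'
row = normalize_trade(sample, "coinbase")
print(row["row_key"])
```

Keying rows by exchange and trading pair before the timestamp keeps each pair's trades contiguous, so a range scan over one prefix retrieves a single market's recent history efficiently.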

This also lays the foundation for online ML predictions at scale. You’ll see how to graph the trades, the volume, and the time delta from trade execution until the trade reaches our system (an indicator of how close to real time we can get the data). You can find more details on GitHub too.
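The trade-to-system time delta mentioned above is just the difference between the exchange-reported execution timestamp and our arrival time. A minimal sketch (the function name `ingest_delta_ms` and the millisecond convention are assumptions for illustration):

```python
from datetime import datetime, timedelta, timezone

def ingest_delta_ms(exchange_ts_ms: int, arrival: datetime) -> float:
    """Milliseconds between exchange-reported execution and our receipt.

    A small, stable delta means the pipeline is close to real time;
    spikes indicate feed or network lag (late data for Dataflow to handle).
    """
    executed = datetime.fromtimestamp(exchange_ts_ms / 1000, tz=timezone.utc)
    return (arrival - executed).total_seconds() * 1000.0

# A trade executed at t_ms and received roughly 120 ms later
t_ms = 1700000000000
executed = datetime.fromtimestamp(t_ms / 1000, tz=timezone.utc)
arrival = executed + timedelta(milliseconds=120)
delta = ingest_delta_ms(t_ms, arrival)
print(round(delta))
```

Graphing this delta per exchange over time makes it easy to spot which feeds lag and when.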
