├── .gitignore ├── README.md ├── agent ├── README.md ├── __init__.py └── dqn.py ├── configurations.py ├── data_recorder ├── README.md ├── __init__.py ├── bitfinex_connector │ ├── README.md │ ├── __init__.py │ ├── bitfinex_book.py │ ├── bitfinex_client.py │ └── bitfinex_orderbook.py ├── coinbase_connector │ ├── README.md │ ├── __init__.py │ ├── coinbase_book.py │ ├── coinbase_client.py │ └── coinbase_orderbook.py ├── connector_components │ ├── README.md │ ├── __init__.py │ ├── book.py │ ├── client.py │ ├── orderbook.py │ ├── price_level.py │ └── trade_tracker.py ├── database │ ├── README.md │ ├── __init__.py │ ├── data_exports │ │ └── demo_LTC-USD_20190926.csv.xz │ ├── database.py │ ├── simulator.py │ └── viz.py └── tests │ ├── test_bitfinex_client.py │ ├── test_coinbase_client.py │ └── test_simulator.py ├── design_patterns ├── design-pattern-high-level.PNG ├── design-pattern.png ├── plot_lob_levels.png ├── plot_lob_overlay.png ├── plot_order_arrivals.png └── plot_transactions.png ├── experiment.py ├── gym_trading ├── README.md ├── __init__.py ├── envs │ ├── README.md │ ├── __init__.py │ ├── base_environment.py │ ├── market_maker.py │ └── trend_following.py ├── tests │ ├── test_broker.py │ ├── test_market_maker.py │ └── test_trend_following.py └── utils │ ├── __init__.py │ ├── broker.py │ ├── data_pipeline.py │ ├── decorator.py │ ├── order.py │ ├── plot_history.py │ ├── position.py │ ├── render_env.py │ ├── reward.py │ └── statistic.py ├── indicators ├── __init__.py ├── ema.py ├── indicator.py ├── rsi.py ├── tests │ └── test_indicators.py └── tns.py ├── recorder.py ├── requirements.txt └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | venv 2 | 3 | *.csv 4 | 5 | .idea 6 | 7 | build 8 | 9 | dist 10 | 11 | .pytest_cache 12 | 13 | agent/dqn_weights/* 14 | agent/ppo_weights/* 15 | agent/acer_weights/* 16 | agent/a2c_weights/* 17 | 18 | *.egg* 19 | 20 | *.ipynb 21 | .ipynb_checkpoints 22 | 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Deep Reinforcement Learning Toolkit for Cryptocurrencies 2 | 3 | 4 | **Table of contents:** 5 | 6 | 1. Purpose 7 | 2. Scope 8 | 3. Dependencies 9 | 4. Project structure 10 | 5. Design patterns 11 | 6. Getting started 12 | 7. Citing this project 13 | 8. Appendix 14 | 15 | ## 1. Purpose 16 | The purpose of this application is to provide a toolkit to: 17 | - **Record** full limit order book and trade tick data from two 18 | exchanges (**Coinbase Pro** and **Bitfinex**) into an [Arctic](https://github.com/manahl/arctic) 19 | Tickstore database (i.e., MongoDB), 20 | - **Replay** recorded historical data to derive feature sets for training 21 | - **Train** an agent to trade cryptocurrencies using the DQN algorithm (note: this agent 22 | implementation is intended to be an example for users to reference) 23 | 24 | ![High_Level_Overview](./design_patterns/design-pattern-high-level.PNG) 25 | 26 | 27 | ## 2. Scope 28 | - **Research only:** there is no capability for live-trading at exchanges. 29 | - **Reproducing RL paper results:** the dataset used for [this](https://arxiv.org/abs/2004.06985) article is available on [Kaggle Datasets](https://www.kaggle.com/jsadighian/cryptorl). 30 | 31 | 32 | ## 3. Dependencies 33 | See `requirements.txt` 34 | 35 | *Note*: to run and train the DQN Agent (`./agent/dqn.py`) tensorflow and Keras-RL 36 | need to be installed manually and are not listed in the `requirements.txt` 37 | in order to keep this project compatible with other open 38 | sourced reinforcement learning platforms 39 | (e.g., [OpenAI Baselines](https://github.com/openai/baselines)). 
40 | 41 | Pip install the following: 42 | 43 | ``` 44 | git+https://github.com/manahl/arctic.git 45 | 46 | Keras==2.2.4 47 | Keras-Applications==1.0.7 48 | Keras-Preprocessing==1.0.9 49 | keras-rl==0.4.2 50 | 51 | tensorboard==1.13.1 52 | tensorflow-estimator==1.13.0 53 | tensorflow-gpu==1.13.1 54 | ``` 55 | 56 | 57 | ## 4. Project Structure 58 | The key elements in this project, with brief descriptions: 59 | ``` 60 | crypto-rl/ 61 | agent/ 62 | ...reinforcement learning algorithm implementations 63 | data_recorder/ 64 | ...tools to connect, download, and retrieve limit order book data 65 | gym_trading/ 66 | ...extended openai.gym environment to observe limit order book data 67 | indicators/ 68 | ...technical indicators implemented with O(1) time complexity 69 | design_patterns/ 70 | ...visual diagrams of the module architecture 71 | venv/ 72 | ...virtual environment for local deployments 73 | experiment.py # Entry point for running reinforcement learning experiments 74 | recorder.py # Entry point to start recording limit order book data 75 | configurations.py # Constants used throughout this project 76 | requirements.txt # List of project dependencies 77 | setup.py # Run the command `python3 setup.py install` to 78 | # install the extended gym environment, i.e., gym_trading 79 | ``` 80 | 81 | 82 | ## 5. Design Patterns 83 | Refer to each individual module for design pattern specifications: 84 | 85 | - [Limit Order Book, Data Recorder, and Database](./data_recorder/README.md) 86 | - [Stationary LOB Features](https://arxiv.org/abs/1810.09965v1) 87 | - [POMDP Environment](./gym_trading/README.md) 88 | - [Learning Algorithms and Neural Networks](./agent/README.md) 89 | 90 | Sample snapshot of Limit Order Book levels: 91 | ![plot_lob_levels](./design_patterns/plot_lob_levels.png) 92 | 93 | Sample snapshot of Order Arrival flow metrics: 94 | ![plot_order_arrivals](./design_patterns/plot_order_arrivals.png) 95 | 96 | 97 | ## 6. 
Getting Started 98 | 99 | Install the project on your machine: 100 | ``` 101 | # Clone the project from GitHub 102 | git clone https://github.com/sadighian/crypto-rl.git 103 | cd crypto-rl 104 | 105 | # Install a virtual environment for the project's dependencies 106 | python3 -m venv ./venv 107 | 108 | # Turn on the virtual environment 109 | source venv/bin/activate 110 | 111 | # Install keras-rl dependencies 112 | pip3 install Keras==2.2.4 Keras-Applications==1.0.7 Keras-Preprocessing==1.0.9 keras-rl==0.4.2 \ 113 | tensorboard==1.13.1 tensorflow-estimator==1.13.0 tensorflow-gpu==1.13.1 114 | 115 | # Install database 116 | pip3 install git+https://github.com/manahl/arctic.git 117 | 118 | # Install the project 119 | pip3 install -e . 120 | ``` 121 | 122 | ### 6.1 Record limit order book data from exchanges 123 | 124 | **Step 1:** 125 | Open `configurations.py` and define the cryptocurrencies which 126 | you would like to subscribe to and record. 127 | 128 | Note: basket list format is as follows `[(Coinbase_Instrument_Name, Bitfinex_Instrument_Name), ...]` 129 | ``` 130 | SNAPSHOT_RATE = 5 # I.e., every 5 seconds 131 | BASKET = [('BTC-USD', 'tBTCUSD'), 132 | ('ETH-USD', 'tETHUSD'), 133 | ('LTC-USD', 'tLTCUSD'), 134 | ('BCH-USD', 'tBCHUSD'), 135 | ('ETC-USD', 'tETCUSD')] 136 | RECORD_DATA = True 137 | ``` 138 | 139 | **Step 2:** 140 | Open a CLI/terminal and execute the command to start recording 141 | full limit order book and trade data. 142 | ``` 143 | python3 recorder.py 144 | ``` 145 | 146 | ### 6.2 Replay recorded data to export stationary feature set 147 | 148 | **Step 1:** 149 | Ensure that you have data in your database. 150 | 151 | Check with MongoDB shell or [Compass](https://www.mongodb.com/products/compass). 152 | If you do not have data, refer to the section above 153 | **6.1 Record limit order book data from exchanges**. 
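One way to sanity-check Step 1 from Python, rather than from the MongoDB shell, is to compare the symbols stored in Arctic against your `BASKET`. The sketch below is illustrative only: `missing_symbols` is a hypothetical helper that is not part of this project, and it assumes ticks are stored under the exchange instrument names, which you should verify against your own database. The commented-out lines show how the symbol list might be fetched with Arctic's `list_symbols()`, using the `MONGO_ENDPOINT` and `ARCTIC_NAME` values from `configurations.py`; a hard-coded list stands in for that call so the example runs without a database.

```python
from typing import Iterable, List, Tuple

# Same shape as BASKET in configurations.py: (Coinbase name, Bitfinex name).
BASKET: List[Tuple[str, str]] = [('BTC-USD', 'tBTCUSD'), ('LTC-USD', 'tLTCUSD')]


def missing_symbols(recorded: Iterable[str],
                    basket: List[Tuple[str, str]]) -> List[str]:
    """Hypothetical helper: list basket instruments absent from `recorded`."""
    recorded_set = set(recorded)
    # Flatten the (Coinbase, Bitfinex) pairs into one list of expected names.
    expected = [name for pair in basket for name in pair]
    return [name for name in expected if name not in recorded_set]


# Against a live database, `recorded` could come from Arctic, for example:
#   from arctic import Arctic
#   library = Arctic('localhost')['crypto.tickstore']
#   recorded = library.list_symbols()
# ASSUMPTION: symbols are keyed by instrument name; check your own database.
recorded = ['BTC-USD', 'tBTCUSD', 'LTC-USD']  # stand-in for list_symbols()

print(missing_symbols(recorded, BASKET))  # prints ['tLTCUSD']
```

An empty result suggests every instrument in the basket has at least some recorded data; it says nothing about how much data was recorded or over which dates.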
154 | 155 | **Step 2:** 156 | Run a historical data simulation to take snapshots of the 157 | limit order book(s) and export their stationary features 158 | to a compressed csv. 159 | 160 | To do this, you can leverage the test cases in `data_recorder/tests/` 161 | or write your own logic. When using the test case methods, make sure 162 | to change the query parameters to match the data you have actually 163 | recorded in your database. 164 | 165 | Example to export features to a compressed csv: 166 | ``` 167 | python3 data_recorder/tests/test_extract_features.py 168 | ``` 169 | 170 | ### 6.3 Train an agent 171 | 172 | **Step 1:** 173 | Ensure you have data in the `data_recorder/database/data_exports/` folder. 174 | This is where the agent loads data from. If you do not have data exported 175 | into that folder, refer to the section above 176 | **6.2 Replay recorded data to export stationary feature set**. 177 | 178 | **Step 2:** 179 | Open a CLI/terminal and start learning/training the agent. 180 | ``` 181 | python3 experiment.py --window_size=50 --weights=False --fitting_file=... 182 | ``` 183 | Refer to `experiment.py` to see all the keyword arguments. 184 | 185 | 186 | ## 7. Citing this project 187 | 188 | Please remember to cite this repository if used in your research: 189 | ``` 190 | @misc{Crypto-RL, 191 | author = {Jonathan Sadighian}, 192 | title = {Deep Reinforcement Learning Toolkit for Cryptocurrencies}, 193 | year = {2019}, 194 | publisher = {GitHub}, 195 | journal = {GitHub repository}, 196 | howpublished = {\url{https://github.com/sadighian/crypto-rl}}, 197 | } 198 | ``` 199 | 200 | 201 | ## 8. 
Appendix 202 | ### 8.1 Branches 203 | There are multiple branches of this project, each with a different implementation pattern 204 | for persisting data: 205 | - **FULL** branch is intended to be the foundation for a fully automated trading system 206 | (i.e., implementation of design patterns that are ideal for a trading system that requires 207 | parallel processing) and persists streaming tick data into an **Arctic Tick Store** 208 | 209 | **Note:** the branches below (i.e., lightweight, order book snapshot, mongo integration) 210 | are no longer actively maintained as of October 2018, and are here for reference. 211 | 212 | - **LIGHT WEIGHT** branch is intended to record streaming data more efficiently than 213 | the __full__ branch (i.e., all websocket connections are made from a single process 214 | __and__ the limit order book is not maintained) and persists streaming tick data into 215 | an **Arctic tick store** 216 | - **ORDER BOOK SNAPSHOT** branch has the same design pattern as the __full__ branch, 217 | but instead of recording streaming ticks, snapshots of the limit order book are taken 218 | every **N** seconds and persisted into an **Arctic tick store** 219 | - **MONGO INTEGRATION** branch is the same implementation as **ORDER BOOK SNAPSHOT**, 220 | with the difference being a standard MongoDB is used, rather than Arctic. 221 | This branch was originally used to benchmark Arctic's performance and is not up to 222 | date with the **FULL** branch. 
223 | 224 | ### 8.2 Assumptions 225 | - You have installed a virtual environment and installed the project to that venv 226 | (e.g., `pip3 install -e .`) 227 | - You have mongoDB already installed 228 | - You know how to use a cli to start python scripts 229 | - You are running an ubuntu 18+ os 230 | 231 | ### 8.3 Change Log 232 | - 2021-09-25: Updated `requirements.txt`: going forward the database requires a manual 233 | installation via `pip install git+https://github.com/manahl/arctic.git` 234 | - 2019-12-12: Added docstrings and refactored many classes to improve code readability 235 | - 2019-09-18: Refactored `env`s and `broker`s for simplification and 236 | added different `reward` approaches. 237 | - 2019-09-13: Created and implemented 'order arrival' flow metrics, 238 | inspired by 239 | [Multi-Level Order-Flow Imbalance in a Limit Order Book](https://arxiv.org/abs/1907.06230v1) 240 | by Xu, Ke; Gould, Martin D.; Howison, Sam D. 241 | - 2019-09-06: Created and implemented `Indicator.py` base class 242 | - 2019-04-28: Reorganized project structure for simplicity 243 | -------------------------------------------------------------------------------- /agent/README.md: -------------------------------------------------------------------------------- 1 | # Agent 2 | As of September 18, 2019. 3 | 4 | ## Overview 5 | Agents are implemented by wrapping Keras-RL API. Each file in 6 | this directory is a different type of reinforcement learning algo. 
7 | 8 | ## DQN Architecture 9 | - Dueling architecture 10 | - Double Q-learning 11 | - Experience replay 12 | 13 | ## Neural Networks 14 | CNN 15 | - 3x convolutional layers 16 | - 1x dense mlp 17 | 18 | MLP 19 | - 2x Dense mlp -------------------------------------------------------------------------------- /agent/__init__.py: -------------------------------------------------------------------------------- 1 | from agent.dqn import DQNAgent 2 | -------------------------------------------------------------------------------- /agent/dqn.py: -------------------------------------------------------------------------------- 1 | from keras.models import Sequential 2 | from keras.layers import Dense, Flatten, Conv2D 3 | from keras.optimizers import Adam 4 | from rl.agents.dqn import DQNAgent 5 | from rl.memory import SequentialMemory 6 | from rl.callbacks import FileLogger, ModelIntervalCheckpoint 7 | from configurations import LOGGER 8 | import os 9 | import gym 10 | import gym_trading 11 | 12 | 13 | class Agent(object): 14 | name = 'DQN' 15 | 16 | def __init__(self, number_of_training_steps=1e5, gamma=0.999, load_weights=False, 17 | visualize=False, dueling_network=True, double_dqn=True, nn_type='mlp', 18 | **kwargs): 19 | """ 20 | Agent constructor 21 | :param window_size: int, number of lags to include in observation 22 | :param max_position: int, maximum number of positions able to be held in inventory 23 | :param fitting_file: str, file used for z-score fitting 24 | :param testing_file: str,file used for dqn experiment 25 | :param env: environment name 26 | :param seed: int, random seed number 27 | :param action_repeats: int, number of steps to take in environment between actions 28 | :param number_of_training_steps: int, number of steps to train agent for 29 | :param gamma: float, value between 0 and 1 used to discount future DQN returns 30 | :param format_3d: boolean, format observation as matrix or tensor 31 | :param train: boolean, train or test agent 32 | :param 
load_weights: boolean, import existing weights 33 | :param z_score: boolean, standardize observation space 34 | :param visualize: boolean, visualize environment 35 | :param dueling_network: boolean, use dueling network architecture 36 | :param double_dqn: boolean, use double DQN for Q-value approximation 37 | """ 38 | # Agent arguments 39 | # self.env_name = id 40 | self.neural_network_type = nn_type 41 | self.load_weights = load_weights 42 | self.number_of_training_steps = number_of_training_steps 43 | self.visualize = visualize 44 | 45 | # Create environment 46 | self.env = gym.make(**kwargs) 47 | self.env_name = self.env.env.id 48 | 49 | # Create agent 50 | # NOTE: 'Keras-RL' uses its own frame-stacker 51 | self.memory_frame_stack = 1 # Number of frames to stack e.g., 1. 52 | self.model = self.create_model(name=self.neural_network_type) 53 | self.memory = SequentialMemory(limit=10000, 54 | window_length=self.memory_frame_stack) 55 | self.train = self.env.env.training 56 | self.cwd = os.path.dirname(os.path.realpath(__file__)) 57 | 58 | # create the agent 59 | self.agent = DQNAgent(model=self.model, 60 | nb_actions=self.env.action_space.n, 61 | memory=self.memory, 62 | processor=None, 63 | nb_steps_warmup=500, 64 | enable_dueling_network=dueling_network, 65 | dueling_type='avg', 66 | enable_double_dqn=double_dqn, 67 | gamma=gamma, 68 | target_model_update=1000, 69 | delta_clip=1.0) 70 | self.agent.compile(Adam(lr=float("3e-4")), metrics=['mae']) 71 | 72 | def __str__(self): 73 | # msg = '\n' 74 | # return msg.join(['{}={}'.format(k, v) for k, v in self.__dict__.items()]) 75 | return 'Agent = {} | env = {} | number_of_training_steps = {}'.format( 76 | Agent.name, self.env_name, self.number_of_training_steps) 77 | 78 | def create_model(self, name: str = 'cnn') -> Sequential: 79 | """ 80 | Helper function to create and return the default MLP or CNN model. 
81 | 82 | :param name: Neural network type ['mlp' or 'cnn'] 83 | :return: neural network 84 | """ 85 | LOGGER.info("creating model for {}".format(name)) 86 | if name == 'cnn': 87 | return self._create_cnn_model() 88 | elif name == 'mlp': 89 | return self._create_mlp_model() 90 | 91 | def _create_cnn_model(self) -> Sequential: 92 | """ 93 | Create a Convolutional neural network with dense layer at the end. 94 | 95 | :return: keras model 96 | """ 97 | features_shape = (self.memory_frame_stack, *self.env.observation_space.shape) 98 | model = Sequential() 99 | conv = Conv2D 100 | model.add(conv(input_shape=features_shape, 101 | filters=5, kernel_size=[10, 1], padding='same', activation='relu', 102 | strides=[5, 1], data_format='channels_first')) 103 | model.add(conv(filters=5, kernel_size=[5, 1], padding='same', activation='relu', 104 | strides=[2, 1], data_format='channels_first')) 105 | model.add(conv(filters=5, kernel_size=[4, 1], padding='same', activation='relu', 106 | strides=[2, 1], data_format='channels_first')) 107 | model.add(Flatten()) 108 | model.add(Dense(256, activation='relu')) 109 | model.add(Dense(self.env.action_space.n, activation='softmax')) 110 | LOGGER.info(model.summary()) 111 | return model 112 | 113 | def _create_mlp_model(self) -> Sequential: 114 | """ 115 | Create a DENSE neural network with dense layer at the end 116 | 117 | :return: keras model 118 | """ 119 | features_shape = (self.memory_frame_stack, *self.env.observation_space.shape) 120 | model = Sequential() 121 | model.add(Dense(units=256, input_shape=features_shape, activation='relu')) 122 | model.add(Dense(units=256, activation='relu')) 123 | model.add(Flatten()) 124 | model.add(Dense(self.env.action_space.n, activation='softmax')) 125 | LOGGER.info(model.summary()) 126 | return model 127 | 128 | def start(self) -> None: 129 | """ 130 | Entry point for agent training and testing 131 | 132 | :return: (void) 133 | """ 134 | output_directory = os.path.join(self.cwd, 'dqn_weights') 135 
| if not os.path.exists(output_directory): 136 | LOGGER.info('{} does not exist. Creating Directory.'.format(output_directory)) 137 | os.mkdir(output_directory) 138 | 139 | weight_name = 'dqn_{}_{}_weights.h5f'.format( 140 | self.env_name, self.neural_network_type) 141 | weights_filename = os.path.join(output_directory, weight_name) 142 | LOGGER.info("weights_filename: {}".format(weights_filename)) 143 | 144 | if self.load_weights: 145 | LOGGER.info('...loading weights for {} from\n{}'.format( 146 | self.env_name, weights_filename)) 147 | self.agent.load_weights(weights_filename) 148 | 149 | if self.train: 150 | step_chkpt = '{step}.h5f' 151 | step_chkpt = 'dqn_{}_weights_{}'.format(self.env_name, step_chkpt) 152 | checkpoint_weights_filename = os.path.join(self.cwd, 153 | 'dqn_weights', 154 | step_chkpt) 155 | LOGGER.info("checkpoint_weights_filename: {}".format( 156 | checkpoint_weights_filename)) 157 | log_filename = os.path.join(self.cwd, 'dqn_weights', 158 | 'dqn_{}_log.json'.format(self.env_name)) 159 | LOGGER.info('log_filename: {}'.format(log_filename)) 160 | 161 | callbacks = [ModelIntervalCheckpoint(checkpoint_weights_filename, 162 | interval=250000)] 163 | callbacks += [FileLogger(log_filename, interval=100)] 164 | 165 | LOGGER.info('Starting training...') 166 | self.agent.fit(self.env, 167 | callbacks=callbacks, 168 | nb_steps=self.number_of_training_steps, 169 | log_interval=10000, 170 | verbose=0, 171 | visualize=self.visualize) 172 | LOGGER.info("training over.") 173 | LOGGER.info('Saving AGENT weights...') 174 | self.agent.save_weights(weights_filename, overwrite=True) 175 | LOGGER.info("AGENT weights saved.") 176 | else: 177 | LOGGER.info('Starting TEST...') 178 | self.agent.test(self.env, nb_episodes=2, visualize=self.visualize) 179 | -------------------------------------------------------------------------------- /configurations.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import os 3 
| 4 | import pytz as tz 5 | 6 | # singleton for logging 7 | logging.basicConfig(level=logging.INFO, format='[%(asctime)s] %(message)s') 8 | LOGGER = logging.getLogger('crypto_rl_log') 9 | 10 | # ./recorder.py 11 | SNAPSHOT_RATE = 1.0 # For example, 0.25 = 4x per second 12 | BASKET = [('BTC-USD', 'tBTCUSD'), 13 | # ('ETH-USD', 'tETHUSD'), 14 | # ('LTC-USD', 'tLTCUSD') 15 | ] 16 | 17 | # ./data_recorder/connector_components/client.py 18 | COINBASE_ENDPOINT = 'wss://ws-feed.pro.coinbase.com' 19 | COINBASE_BOOK_ENDPOINT = 'https://api.pro.coinbase.com/products/%s/book' 20 | BITFINEX_ENDPOINT = 'wss://api.bitfinex.com/ws/2' 21 | MAX_RECONNECTION_ATTEMPTS = 100 22 | 23 | # ./data_recorder/connector_components/book.py 24 | MAX_BOOK_ROWS = 15 25 | INCLUDE_ORDERFLOW = True 26 | 27 | # ./data_recorder/database/database.py 28 | BATCH_SIZE = 100000 29 | RECORD_DATA = False 30 | MONGO_ENDPOINT = 'localhost' 31 | ARCTIC_NAME = 'crypto.tickstore' 32 | TIMEZONE = tz.utc 33 | 34 | # ./data_recorder/database/simulator.py 35 | SNAPSHOT_RATE_IN_MICROSECONDS = 1000000 # 1 second 36 | 37 | # ./gym_trading/utils/broker.py 38 | MARKET_ORDER_FEE = 0.0020 39 | LIMIT_ORDER_FEE = 0.0 40 | SLIPPAGE = 0.0005 41 | 42 | # ./indicators/* 43 | INDICATOR_WINDOW = [60 * i for i in [5, 15]] # Convert minutes to seconds 44 | INDICATOR_WINDOW_MAX = max(INDICATOR_WINDOW) 45 | INDICATOR_WINDOW_FEATURES = [f'_{i}' for i in [5, 15]] # Create labels 46 | EMA_ALPHA = 0.99 # [0.9, 0.99, 0.999, 0.9999] 47 | 48 | # agent penalty configs 49 | ENCOURAGEMENT = 0.000000000001 50 | 51 | # Data Directory 52 | ROOT_PATH = os.path.dirname(os.path.realpath(__file__)) 53 | DATA_PATH = os.path.join(ROOT_PATH, 'data_recorder', 'database', 'data_exports') 54 | -------------------------------------------------------------------------------- /data_recorder/README.md: -------------------------------------------------------------------------------- 1 | # Data Recorder 2 | As of April 28th, 2019. 
3 | 4 | (`recorder.py`) The design pattern is intended to serve as a 5 | foundation for implementing a trading strategy. 6 | 7 | 8 | ## 1. Recorder Architecture 9 | - Each crypto pair (e.g., Bitcoin-USD) runs on its own `Process` 10 | - Each exchange data feed is processed in its own `Thread` within the 11 | parent crypto pair `Process` 12 | - A timer for periodic polling (or order book snapshots--see 13 | `mongo-integration` or `arctic-book-snapshot` branch) runs on 14 | a separate thread 15 | 16 | ![design_pattern](../design_patterns/design-pattern.png) 17 | 18 | ## 2. Tick Store Data Model 19 | **Arctic tick store** is the database implementation of choice for 20 | this project for the 21 | following reasons: 22 | - Created and open-sourced by Man AHL 23 | - Superior performance metrics (e.g., 10x data compression) 24 | 25 | The **Arctic Tick Store** data model is essentially a `list` of `dict`s, where 26 | each `dict` is an incoming **tick** from the exchanges. 27 | - Each `list` consists of `configurations.BATCH_SIZE` ticks 28 | (e.g., 100,000 ticks) 29 | - Per the Arctic Tick Store design, all currency pairs are stored 30 | in the **same** MongoDB collection 31 | 32 | ## 3. 
Limit Order Book Implementation 33 | The pure-Python **SortedDict** class is used for the limit order book 34 | for the following reasons: 35 | - Sorted Price **Insertions** within the limit order book 36 | can be performed in **O(log n)** time 37 | - Price **Deletions** within the limit order book can be performed 38 | in **O(log n)** time 39 | - **Getting / setting** values is performed in **O(1)** time 40 | - **SortedDict** interface is intuitive, thus making implementation easier 41 | 42 | ![plot_order_arrivals](../design_patterns/plot_order_arrivals.png) -------------------------------------------------------------------------------- /data_recorder/__init__.py: -------------------------------------------------------------------------------- 1 | from data_recorder.bitfinex_connector import * 2 | from data_recorder.coinbase_connector import * 3 | from data_recorder.database import * 4 | -------------------------------------------------------------------------------- /data_recorder/bitfinex_connector/README.md: -------------------------------------------------------------------------------- 1 | # Bitfinex Connector 2 | As of March 04, 2019. 3 | 4 | ## 1. Overview 5 | The Bitfinex connector consists of three classes: 6 | 1. `bitfinex_book.py` which is the Bitfinex implementation of `./connector_components/book.py` 7 | 2. `bitfinex_orderbook.py` which is the Bitfinex implementation of `./connector_components/orderbook.py` 8 | 3. `bitfinex_client.py` which is the Bitfinex implementation of `./connector_components/client.py` 9 | 10 | ## 2. Subscriptions 11 | - WebSocket connections are made asynchronously with the `websockets` module 12 | - Raw order book & trades data subscriptions are made by using the `subscribe()` method 13 | from `bitfinex_client.py` 14 | - All websocket subscriptions pass incoming messages into a `multiprocessing.Queue()`, 15 | where they are processed by a separate thread 16 | 17 | ## 3. Data Consumption Rules 18 | 1. 
Normalize incoming data messages from strings to numbers, such as `float` 19 | 2. Pass normalized messages to the `orderbook.new_tick()` method to update the limit order book 20 | 3. If the websocket feed loses connection, try to re-subscribe 21 | 22 | ## 4. Appendix 23 | Link to official Bitfinex documentation: https://docs.bitfinex.com/v2/docs 24 | -------------------------------------------------------------------------------- /data_recorder/bitfinex_connector/__init__.py: -------------------------------------------------------------------------------- 1 | from . import * 2 | -------------------------------------------------------------------------------- /data_recorder/bitfinex_connector/bitfinex_book.py: -------------------------------------------------------------------------------- 1 | from configurations import RECORD_DATA 2 | from data_recorder.connector_components.book import Book 3 | 4 | 5 | class BitfinexBook(Book): 6 | 7 | def __init__(self, **kwargs): 8 | super(BitfinexBook, self).__init__(**kwargs) 9 | 10 | def insert_order(self, msg: dict) -> None: 11 | """ 12 | Create new node. 13 | 14 | :param msg: incoming new order 15 | :return: (void) 16 | """ 17 | self.order_map[msg['order_id']] = msg 18 | 19 | price = msg['price'] 20 | if price not in self.price_dict: 21 | self.create_price(price) 22 | 23 | size = abs(msg['size']) 24 | self.price_dict[price].add_limit(quantity=size, price=price) 25 | self.price_dict[price].add_quantity(quantity=size, price=price) 26 | self.price_dict[price].add_count() 27 | 28 | def match(self, msg: dict) -> None: 29 | """ 30 | This method is not implemented within Bitfinex's API. 31 | 32 | However, I've implemented it to capture order arrival flows (i.e., incoming 33 | market orders) and to be consistent with the overarching design pattern. 34 | 35 | Note: this event handler does not impact the LOB in any other way than updating 36 | the number of market orders received at a given price level. 
37 | 38 | :param msg: buy or sell transaction message from Bitfinex 39 | :return: (void) 40 | """ 41 | price = msg.get('price', None) 42 | if price in self.price_dict: 43 | quantity = abs(msg['size']) 44 | self.price_dict[price].add_market(quantity=quantity, price=price) 45 | 46 | def change(self, msg: dict) -> None: 47 | """ 48 | Update inventory. 49 | 50 | :param msg: order update message from Bitfinex 51 | :return: (void) 52 | """ 53 | old_order = self.order_map[msg['order_id']] 54 | diff = msg['size'] - old_order['size'] 55 | 56 | vol_change = diff != float(0) 57 | px_change = msg['price'] != old_order['price'] 58 | 59 | if px_change: 60 | self.remove_order(old_order) 61 | self.insert_order(msg) 62 | 63 | elif vol_change: 64 | old_order['size'] = msg['size'] 65 | price = old_order['price'] 66 | self.order_map[msg['order_id']] = old_order 67 | self.price_dict[price].add_quantity(quantity=diff, price=price) 68 | 69 | def remove_order(self, msg: dict) -> None: 70 | """ 71 | Done messages result in the order being removed from map. 72 | 73 | :param msg: remove order message from Bitfinex 74 | :return: (void) 75 | """ 76 | msg_order_id = msg.get('order_id', None) 77 | if msg_order_id in self.order_map: 78 | 79 | old_order = self.order_map[msg_order_id] 80 | price = old_order['price'] 81 | 82 | if price not in self.price_dict: 83 | print('remove_order: price not in msg...adj_price = {} '.format( 84 | price)) 85 | print('Incoming order: %s' % msg) 86 | print('Old order: %s' % old_order) 87 | 88 | order_size = abs(old_order.get('size', None)) 89 | order_price = old_order.get('price', None) 90 | # Note: Bitfinex does not have 'canceled' message types, thus it is not 91 | # possible to distinguish filled orders from canceled orders with the order 92 | # arrival trackers. 
93 | self.price_dict[price].add_cancel(quantity=order_size, price=order_price) 94 | self.price_dict[price].remove_quantity(quantity=order_size, price=order_price) 95 | self.price_dict[price].remove_count() 96 | 97 | if self.price_dict[price].count == 0: 98 | self.remove_price(price) 99 | 100 | del self.order_map[old_order['order_id']] 101 | 102 | elif RECORD_DATA: 103 | print('remove_order: order_id not found %s\n' % msg) 104 | -------------------------------------------------------------------------------- /data_recorder/bitfinex_connector/bitfinex_client.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | import websockets 4 | 5 | from configurations import BITFINEX_ENDPOINT, LOGGER 6 | from data_recorder.bitfinex_connector.bitfinex_orderbook import BitfinexOrderBook 7 | from data_recorder.connector_components.client import Client 8 | 9 | 10 | class BitfinexClient(Client): 11 | 12 | def __init__(self, **kwargs): 13 | """ 14 | Constructor for Bitfinex Client. 15 | 16 | :param sym: Instrument or cryptocurrency pair name 17 | :param db_queue: (optional) queue to connect to the database process 18 | """ 19 | super(BitfinexClient, self).__init__(exchange='bitfinex', **kwargs) 20 | self.request = json.dumps({ 21 | "event": "subscribe", 22 | "channel": "book", 23 | "prec": "R0", 24 | "freq": "F0", 25 | "symbol": self.sym, 26 | "len": "100" 27 | }) 28 | self.request_unsubscribe = None 29 | self.trades_request = json.dumps({ 30 | "event": "subscribe", 31 | "channel": "trades", 32 | "symbol": self.sym 33 | }) 34 | self.book = BitfinexOrderBook(sym=self.sym) 35 | self.ws_endpoint = BITFINEX_ENDPOINT 36 | 37 | async def unsubscribe(self) -> None: 38 | """ 39 | Send unsubscribe requests to exchange. 
40 | 41 | Returns 42 | ------- 43 | 44 | """ 45 | for channel in self.book.channel_id: 46 | self.request_unsubscribe = { 47 | "event": "unsubscribe", 48 | "chanId": channel 49 | } 50 | await super(BitfinexClient, self).unsubscribe() 51 | 52 | def run(self) -> None: 53 | """ 54 | Handle incoming level 3 data on a separate thread or process. 55 | 56 | Returns 57 | ------- 58 | 59 | """ 60 | super(BitfinexClient, self).run() 61 | while True: 62 | msg = self.queue.get() 63 | 64 | if self.book.new_tick(msg) is False: 65 | self.retry_counter += 1 66 | self.book.clear_book() 67 | LOGGER.info('\n[%s - %s] ...going to try and reload the order book\n' 68 | % (self.exchange.upper(), self.sym)) 69 | raise websockets.ConnectionClosed(10001, '%s: no explanation' % 70 | self.exchange.upper()) 71 | # raise an exception to invoke reconnecting 72 | -------------------------------------------------------------------------------- /data_recorder/bitfinex_connector/bitfinex_orderbook.py: -------------------------------------------------------------------------------- 1 | from time import time 2 | 3 | import numpy as np 4 | 5 | from configurations import LOGGER 6 | from data_recorder.connector_components.orderbook import OrderBook 7 | 8 | 9 | class BitfinexOrderBook(OrderBook): 10 | 11 | def __init__(self, **kwargs): 12 | super(BitfinexOrderBook, self).__init__(exchange='bitfinex', **kwargs) 13 | self.channel_id = {'book': int(0), 'trades': int(0)} 14 | 15 | def new_tick(self, msg: dict): 16 | """ 17 | Method to process incoming ticks. 
18 | 19 | :param msg: incoming tick 20 | :return: False if there is an exception (or need to reconnect the WebSocket) 21 | """ 22 | # check for data messages, which only come in lists 23 | if isinstance(msg, list): 24 | if msg[0] == self.channel_id['book']: 25 | return self._process_book(msg) 26 | elif msg[0] == self.channel_id['trades']: 27 | return self._process_trades(msg) 28 | 29 | # non-data messages 30 | elif isinstance(msg, dict): 31 | if 'event' in msg: 32 | return self._process_events(msg) 33 | elif msg['type'] == 'te': 34 | self.last_tick_time = msg.get('system_time', None) 35 | return self._process_trades_replay(msg) 36 | elif msg['type'] in ['update', 'preload']: 37 | self.last_tick_time = msg.get('system_time', None) 38 | return self._process_book_replay(msg) 39 | elif msg['type'] == 'load_book': 40 | self.clear_book() 41 | return True 42 | elif msg['type'] == 'book_loaded': 43 | self.bids.warming_up = False 44 | self.asks.warming_up = False 45 | return True 46 | else: 47 | LOGGER.info('new_tick() message does not know how to be processed = %s' % 48 | str(msg)) 49 | 50 | # unhandled exception 51 | else: 52 | LOGGER.warn('unhandled exception\n%s\n' % msg) 53 | return True 54 | 55 | def _load_book(self, book): 56 | """ 57 | Load initial limit order book snapshot 58 | :param book: order book snapshot 59 | :return: void 60 | """ 61 | start_time = time() 62 | 63 | self.db.new_tick({'type': 'load_book', 'product_id': self.sym}) 64 | 65 | for row in book[1]: 66 | order = { 67 | "order_id": int(row[0]), 68 | "price": float(row[1]), 69 | "size": float(abs(row[2])), 70 | "side": 'sell' if float(row[2]) < float(0) else 'buy', 71 | "product_id": self.sym, 72 | "type": 'preload' 73 | } 74 | self.db.new_tick(order) 75 | 76 | if order['side'] == 'buy': 77 | self.bids.insert_order(order) 78 | else: 79 | self.asks.insert_order(order) 80 | 81 | self.db.new_tick({'type': 'book_loaded', 'product_id': self.sym}) 82 | 83 | self.bids.warming_up = self.asks.warming_up = False 
84 | 85 | elapsed = time() - start_time 86 | LOGGER.info('%s: book loaded..............in %f seconds\n' % (self.sym, elapsed)) 87 | 88 | def _process_book(self, msg): 89 | """ 90 | Internal method to process FULL BOOK market data 91 | :param msg: incoming tick 92 | :return: False if re-subscribe is required 93 | """ 94 | # check for a heartbeat 95 | if msg[1] == 'hb': 96 | # render_book('heart beat %s' % msg) 97 | return True 98 | # order book message (initial snapshot) 99 | elif np.shape(msg[1])[0] > 3: 100 | LOGGER.info('%s loading book...' % self.sym) 101 | self.clear_book() 102 | self._load_book(msg) 103 | return True 104 | else: 105 | # else, the incoming message is an order update 106 | order = { 107 | "order_id": int(msg[1][0]), 108 | "price": float(msg[1][1]), 109 | "size": float(abs(msg[1][2])), 110 | "side": 'sell' if float(msg[1][2]) < float(0) else 'buy', 111 | "product_id": self.sym, 112 | "type": 'update' 113 | } 114 | 115 | # order should be removed from the book 116 | if order['price'] == 0.: 117 | if order['side'] == 'buy': 118 | self.bids.remove_order(order) 119 | elif order['side'] == 'sell': 120 | self.asks.remove_order(order) 121 | # order is a new order or size update for bids 122 | elif order['side'] == 'buy': 123 | if order['order_id'] in self.bids.order_map: 124 | self.bids.change(order) 125 | else: 126 | self.bids.insert_order(order) 127 | # order is a new order or size update for asks 128 | elif order['side'] == 'sell': 129 | if order['order_id'] in self.asks.order_map: 130 | self.asks.change(order) 131 | else: 132 | self.asks.insert_order(order) 133 | # unhandled msg 134 | else: 135 | raise ValueError('\nUnhandled list msg %s' % msg) 136 | 137 | return True 138 | 139 | def _process_book_replay(self, order): 140 | """ 141 | Internal method to process FULL BOOK market data 142 | :param order: incoming tick 143 | :return: False if re-subscription is required 144 | """ 145 | # clean up the data types 146 | order['price'] =
float(order['price']) 147 | order['size'] = float(order['size']) 148 | 149 | if order['type'] == 'update': 150 | # order should be removed from the book 151 | if order['price'] == float(0): 152 | if order['side'] == 'buy': 153 | self.bids.remove_order(order) 154 | elif order['side'] == 'sell': 155 | self.asks.remove_order(order) 156 | # order is a new order or size update for bids 157 | elif order['side'] == 'buy': 158 | if order['order_id'] in self.bids.order_map: 159 | self.bids.change(order) 160 | else: 161 | self.bids.insert_order(order) 162 | # order is a new order or size update for asks 163 | elif order['side'] == 'sell': 164 | if order['order_id'] in self.asks.order_map: 165 | self.asks.change(order) 166 | else: 167 | self.asks.insert_order(order) 168 | # unhandled tick message 169 | else: 170 | raise ValueError('_process_book_replay: unhandled message\n%s' % str(order)) 171 | 172 | elif order['type'] == 'preload': 173 | if order['side'] == 'buy': 174 | self.bids.insert_order(order) 175 | else: 176 | self.asks.insert_order(order) 177 | 178 | elif order['type'] == 'te': 179 | trade_notional = order['price'] * order['size'] 180 | if order['side'] == 'upticks': 181 | self.buy_tracker.add(notional=trade_notional) 182 | self.asks.match(order) 183 | else: 184 | self.sell_tracker.add(notional=trade_notional) 185 | self.bids.match(order) 186 | 187 | else: 188 | raise ValueError('_process_book_replay() Unhandled list msg %s' % order) 189 | 190 | return True 191 | 192 | def _process_trades(self, msg): 193 | """ 194 | Internal method to process trade messages 195 | :param msg: incoming tick 196 | :return: False if a re-subscribe is required 197 | """ 198 | if len(msg) == 2: 199 | # historical trades 200 | return True 201 | 202 | msg_type = msg[1] 203 | side = 'upticks' if msg[2][2] > 0.0 else 'downticks' 204 | 205 | if msg_type == 'hb': 206 | LOGGER.info('Heartbeat for trades') 207 | return True 208 | 209 | elif msg_type == 'te': 210 | trade = { 211 | 'price': 
float(msg[2][3]), 212 | 'size': float(msg[2][2]), 213 | 'side': side, 214 | 'type': msg_type, 215 | "product_id": self.sym 216 | } 217 | self.db.new_tick(trade) 218 | return self._process_trades_replay(msg=trade) 219 | 220 | return True 221 | 222 | def _process_trades_replay(self, msg): 223 | trade_notional = msg['price'] * msg['size'] 224 | if msg['side'] == 'upticks': 225 | self.buy_tracker.add(notional=trade_notional) 226 | self.asks.match(msg) 227 | else: 228 | self.sell_tracker.add(notional=trade_notional) 229 | self.bids.match(msg) 230 | return True 231 | 232 | def _process_events(self, msg): 233 | """ 234 | Internal method for return code processing 235 | :param msg: incoming message from WebSocket 236 | :return: False if subscription is required 237 | """ 238 | if msg['event'] == 'subscribed': 239 | self.channel_id[msg['channel']] = msg['chanId'] 240 | LOGGER.info('%s Added channel_id: %i for %s' % (self.sym, msg['chanId'], 241 | msg['channel'])) 242 | return True 243 | 244 | elif msg['event'] == 'info': 245 | 246 | if 'code' in msg: 247 | code = msg['code'] 248 | else: 249 | code = None 250 | 251 | if code == 20051: 252 | LOGGER.info('\nBitfinex - %s: 20051 Stop/Restart WebSocket Server ' 253 | '(please reconnect)' % self.sym) 254 | return False # need to re-subscribe to the data feed 255 | elif code == 20060: 256 | LOGGER.info('\nBitfinex - ' + self.sym + ': 20060.' 257 | ' Entering in Maintenance mode. ' 258 | + 'Please pause any activity and resume after receiving the ' 259 | + 'info message 20061 (it should take 120 seconds at most).') 260 | return True 261 | elif code == 20061: 262 | LOGGER.info('\nBitfinex - ' + self.sym + ': 20061 Maintenance ended. ' + 263 | 'You can resume normal activity. 
' + 264 | 'It is advised to unsubscribe/subscribe again all channels.') 265 | return False # need to re-subscribe to the data feed 266 | elif code == 10300: 267 | LOGGER.info('\nBitfinex - %s: 10300 Subscription failed (generic)' % self.sym) 268 | return True 269 | elif code == 10301: 270 | LOGGER.info('\nBitfinex - %s: 10301 Already subscribed' % self.sym) 271 | return True 272 | elif code == 10302: 273 | LOGGER.info('\nBitfinex - %s: 10302 Unknown channel' % self.sym) 274 | return True 275 | elif code == 10400: 276 | LOGGER.info('\nBitfinex - %s: 10400 Unsubscription failed (generic)' % self.sym) 277 | return True 278 | elif code == 10401: 279 | LOGGER.info('\nBitfinex - %s: 10401 Not subscribed' % self.sym) 280 | return True 281 | -------------------------------------------------------------------------------- /data_recorder/coinbase_connector/README.md: -------------------------------------------------------------------------------- 1 | # Coinbase Pro Connector 2 | As of March 04, 2019. 3 | 4 | ## 1. Overview 5 | The Coinbase connector consists of three classes: 6 | 1. `coinbase_book.py` which is the gdax implementation of `./connector_components/book.py` 7 | 2. `coinbase_orderbook.py` which is the gdax implementation of `./connector_components/orderbook.py` 8 | 3. `coinbase_client.py` which is the gdax implementation of `./connector_components/client.py` 9 | 10 | ## 2. Subscriptions 11 | - WebSocket connections are made asynchronously using the `websockets` module 12 | - Full order book data subscriptions are made by using the `subscribe()` method 13 | from `coinbase_client.py` 14 | - Order book snapshots are requested with a GET call via the `requests` module 15 | - All websocket subscriptions pass incoming messages into a `multiprocessing.Queue()`, 16 | to be processed by a separate thread 17 | 18 | ## 3. Data Consumption Rules 19 | 1. Filter out messages with `type` = `received` to save time 20 | 2.
Normalize incoming data messages from strings to numbers, such as `float` 21 | 3. Pass normalized messages to the `orderbook.new_tick()` method to update the limit order book 22 | 4. If the websocket feed loses connection, try to re-subscribe 23 | 24 | ## 4. Appendix 25 | Link to official Coinbase documentation: https://docs.pro.coinbase.com/ 26 | -------------------------------------------------------------------------------- /data_recorder/coinbase_connector/__init__.py: -------------------------------------------------------------------------------- 1 | from . import * 2 | -------------------------------------------------------------------------------- /data_recorder/coinbase_connector/coinbase_book.py: -------------------------------------------------------------------------------- 1 | from configurations import LOGGER, RECORD_DATA 2 | from data_recorder.connector_components.book import Book 3 | 4 | 5 | class CoinbaseBook(Book): 6 | 7 | def __init__(self, **kwargs): 8 | """ 9 | Coinbase Book constructor. 10 | """ 11 | super(CoinbaseBook, self).__init__(**kwargs) 12 | 13 | def insert_order(self, msg: dict) -> None: 14 | """ 15 | Create new node.
16 | 17 | :param msg: incoming order message 18 | """ 19 | msg_order_id = msg.get('order_id', None) 20 | if msg_order_id not in self.order_map: 21 | order = { 22 | 'order_id': msg_order_id, 23 | 'price': float(msg['price']), 24 | 'size': float(msg.get('size') or msg['remaining_size']), 25 | 'side': msg['side'], 26 | 'time': msg['time'], 27 | 'type': msg['type'], 28 | 'product_id': msg['product_id'] 29 | } 30 | self.order_map[order['order_id']] = order 31 | price = order.get('price', None) 32 | size = order.get('size', None) 33 | 34 | if price not in self.price_dict: 35 | self.create_price(price) 36 | 37 | self.price_dict[price].add_limit(quantity=size, price=price) 38 | self.price_dict[price].add_quantity(quantity=size, price=price) 39 | self.price_dict[price].add_count() 40 | 41 | def match(self, msg: dict) -> None: 42 | """ 43 | Change volume of book. 44 | 45 | :param msg: incoming order message 46 | """ 47 | msg_order_id = msg.get('maker_order_id', None) 48 | if msg_order_id in self.order_map: 49 | old_order = self.order_map[msg_order_id] 50 | order = { 51 | 'order_id': msg_order_id, 52 | 'price': float(msg['price']), 53 | 'size': float(msg['size']), 54 | 'side': msg['side'], 55 | 'time': msg['time'], 56 | 'type': msg['type'], 57 | 'product_id': msg['product_id'] 58 | } 59 | price = order['price'] 60 | if price in self.price_dict: 61 | remove_size = order['size'] 62 | remaining_size = old_order['size'] - remove_size 63 | order['size'] = remaining_size 64 | self.order_map[old_order['order_id']] = order 65 | old_order_price = old_order.get('price', None) 66 | self.price_dict[price].add_market(quantity=remove_size, 67 | price=old_order_price) 68 | self.price_dict[price].remove_quantity(quantity=remove_size, 69 | price=old_order_price) 70 | else: 71 | LOGGER.info('\nmatch: price not in tree already [%s]\n' % msg) 72 | elif RECORD_DATA: 73 | LOGGER.warn('\n%s match: order id cannot be found for %s\n' % (self.sym, msg)) 74 | 75 | def change(self, msg: dict) -> None: 
76 | """ 77 | Update inventory. 78 | 79 | :param msg: incoming order message 80 | """ 81 | if 'price' in msg: 82 | msg_order_id = msg.get('order_id', None) 83 | if msg_order_id in self.order_map: 84 | old_order = self.order_map[msg_order_id] 85 | new_size = float(msg['new_size']) 86 | old_size = old_order['size'] 87 | diff = old_size - new_size 88 | old_order['size'] = new_size 89 | self.order_map[old_order['order_id']] = old_order 90 | old_order_price = old_order.get('price', None) 91 | self.price_dict[old_order_price].remove_quantity(quantity=diff, 92 | price=old_order_price) 93 | elif RECORD_DATA: 94 | LOGGER.info('\n%s change: missing order_ID [%s] from order_map\n' % 95 | (self.sym, msg)) 96 | 97 | def remove_order(self, msg: dict) -> None: 98 | """ 99 | Done messages result in the order being removed from map. 100 | 101 | :param msg: incoming order message 102 | """ 103 | msg_order_id = msg.get('order_id', None) 104 | if msg_order_id in self.order_map: 105 | 106 | old_order = self.order_map[msg_order_id] 107 | price = old_order.get('price', None) 108 | 109 | if price in self.price_dict: 110 | if msg.get('reason', None) == 'canceled': 111 | self.price_dict[price].add_cancel( 112 | quantity=float(msg.get('remaining_size')), price=price) 113 | 114 | self.price_dict[price].remove_quantity( 115 | quantity=old_order['size'], price=price) 116 | self.price_dict[price].remove_count() 117 | 118 | if self.price_dict[price].count == 0: 119 | self.remove_price(price) 120 | 121 | elif RECORD_DATA: 122 | LOGGER.info('%s remove_order: price not in price_map [%s]' % 123 | (msg['product_id'], str(price))) 124 | 125 | del self.order_map[msg_order_id] 126 | -------------------------------------------------------------------------------- /data_recorder/coinbase_connector/coinbase_client.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | from configurations import COINBASE_ENDPOINT, LOGGER 4 | from 
data_recorder.coinbase_connector.coinbase_orderbook import CoinbaseOrderBook 5 | from data_recorder.connector_components.client import Client 6 | 7 | 8 | class CoinbaseClient(Client): 9 | 10 | def __init__(self, **kwargs): 11 | """ 12 | Constructor for Coinbase Client. 13 | """ 14 | super(CoinbaseClient, self).__init__(exchange='coinbase', **kwargs) 15 | self.request = json.dumps(dict(type='subscribe', 16 | product_ids=[self.sym], 17 | channels=['full'])) 18 | self.request_unsubscribe = json.dumps(dict(type='unsubscribe', 19 | product_ids=[self.sym], 20 | channels=['full'])) 21 | self.book = CoinbaseOrderBook(sym=self.sym) 22 | self.trades_request = None 23 | self.ws_endpoint = COINBASE_ENDPOINT 24 | 25 | def run(self): 26 | """ 27 | Handle incoming level 3 data on a separate thread or process. 28 | 29 | Returns 30 | ------- 31 | 32 | """ 33 | super(CoinbaseClient, self).run() 34 | while True: 35 | msg = self.queue.get() 36 | 37 | if self.book.new_tick(msg) is False: 38 | # Coinbase requires a REST call to GET the initial LOB snapshot 39 | self.book.load_book() 40 | self.retry_counter += 1 41 | LOGGER.info('\n[%s - %s] ...going to try and reload the order ' 42 | 'book\n' % (self.exchange.upper(), self.sym)) 43 | continue 44 | -------------------------------------------------------------------------------- /data_recorder/coinbase_connector/coinbase_orderbook.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime as dt 2 | from time import time 3 | 4 | import numpy as np 5 | import requests 6 | 7 | from configurations import COINBASE_BOOK_ENDPOINT, LOGGER, TIMEZONE 8 | from data_recorder.connector_components.orderbook import OrderBook 9 | 10 | 11 | class CoinbaseOrderBook(OrderBook): 12 | 13 | def __init__(self, **kwargs): 14 | """ 15 | Coinbase Order Book constructor. 
16 | 17 | :param sym: Instrument or cryptocurrency pair name 18 | :param db_queue: (optional) queue to connect to the database process 19 | """ 20 | super(CoinbaseOrderBook, self).__init__(exchange='coinbase', **kwargs) 21 | self.sequence = 0 22 | self.diff = 0 23 | 24 | def _get_book(self) -> dict: 25 | """ 26 | Get order book snapshot. 27 | 28 | :return: order book 29 | """ 30 | LOGGER.info('%s get_book request made.' % self.sym) 31 | start_time = time() 32 | 33 | self.clear_book() 34 | path = (COINBASE_BOOK_ENDPOINT % self.sym) 35 | book = requests.get(path, params={'level': 3}).json() 36 | 37 | elapsed = time() - start_time 38 | LOGGER.info('%s get_book request completed in %f seconds.' % (self.sym, elapsed)) 39 | return book 40 | 41 | def load_book(self) -> None: 42 | """ 43 | Load initial limit order book snapshot. 44 | """ 45 | book = self._get_book() 46 | 47 | start_time = time() 48 | 49 | self.sequence = book['sequence'] 50 | now = dt.now(tz=TIMEZONE) 51 | load_time = str(now) 52 | 53 | self.db.new_tick({ 54 | 'type': 'load_book', 55 | 'product_id': self.sym, 56 | 'sequence': self.sequence 57 | }) 58 | 59 | for bid in book['bids']: 60 | msg = { 61 | 'price': float(bid[0]), 62 | 'size': float(bid[1]), 63 | 'order_id': bid[2], 64 | 'side': 'buy', 65 | 'product_id': self.sym, 66 | 'type': 'preload', 67 | 'sequence': self.sequence, 68 | 'time': load_time, 69 | } 70 | self.db.new_tick(msg) 71 | self.bids.insert_order(msg) 72 | 73 | for ask in book['asks']: 74 | msg = { 75 | 'price': float(ask[0]), 76 | 'size': float(ask[1]), 77 | 'order_id': ask[2], 78 | 'side': 'sell', 79 | 'product_id': self.sym, 80 | 'type': 'preload', 81 | 'sequence': self.sequence, 82 | 'time': load_time, 83 | } 84 | self.db.new_tick(msg) 85 | self.asks.insert_order(msg) 86 | 87 | self.db.new_tick({ 88 | 'type': 'book_loaded', 89 | 'product_id': self.sym, 90 | 'sequence': self.sequence 91 | }) 92 | del book 93 | self.bids.warming_up = self.asks.warming_up = False 94 | 95 | elapsed = time() 
- start_time 96 | LOGGER.info('%s: book loaded................in %f seconds' % (self.sym, elapsed)) 97 | 98 | def new_tick(self, msg: dict) -> bool: 99 | """ 100 | Method to process incoming ticks. 101 | 102 | :param msg: incoming tick 103 | :return: False if there is an exception 104 | """ 105 | message_type = msg['type'] 106 | if 'sequence' not in msg: 107 | if message_type == 'subscriptions': 108 | # request an order book snapshot after the 109 | # websocket feed is established 110 | LOGGER.info('Coinbase Subscriptions successful for : %s' % self.sym) 111 | self.load_book() 112 | return True 113 | elif np.isnan(msg['sequence']): 114 | # this situation appears during data replays 115 | # (and not in live data feeds) 116 | LOGGER.warn('\n%s found a nan in the sequence' % self.sym) 117 | return True 118 | 119 | # check the incoming message sequence to verify if there 120 | # is a dropped/missed message. 121 | # If so, request a new orderbook snapshot from Coinbase Pro. 122 | new_sequence = int(msg['sequence']) 123 | self.diff = new_sequence - self.sequence 124 | 125 | if self.diff == 1: 126 | # tick sequences increase by an increment of one 127 | self.sequence = new_sequence 128 | elif message_type in ['load_book', 'book_loaded', 'preload']: 129 | # message types used for data replays 130 | self.sequence = new_sequence 131 | elif self.diff <= 0: 132 | if message_type in ['received', 'open', 'done', 'match', 'change']: 133 | LOGGER.info('%s [%s] has a stale tick: current %i | incoming %i' % ( 134 | self.sym, message_type, self.sequence, new_sequence)) 135 | return True 136 | else: 137 | LOGGER.warn('UNKNOWN-%s %s has a stale tick: current %i | incoming %i' % ( 138 | self.sym, message_type, self.sequence, new_sequence)) 139 | return True 140 | else: # when the tick sequence difference is greater than 1 141 | LOGGER.info('sequence gap: %s missing %i messages. 
new_sequence: %i [%s]\n' % 142 | (self.sym, self.diff, new_sequence, message_type)) 143 | self.sequence = new_sequence 144 | return False 145 | 146 | # persist data to Arctic Tick Store 147 | self.db.new_tick(msg) 148 | self.last_tick_time = msg.get('time', None) 149 | # make sure CONFIGS.RECORDING is false when replaying data 150 | 151 | side = msg['side'] 152 | if message_type == 'received': 153 | return True 154 | 155 | elif message_type == 'open': 156 | if side == 'buy': 157 | self.bids.insert_order(msg) 158 | return True 159 | else: 160 | self.asks.insert_order(msg) 161 | return True 162 | 163 | elif message_type == 'done': 164 | if side == 'buy': 165 | self.bids.remove_order(msg) 166 | return True 167 | else: 168 | self.asks.remove_order(msg) 169 | return True 170 | 171 | elif message_type == 'match': 172 | trade_notional = float(msg['price']) * float(msg['size']) 173 | if side == 'buy': # trades matched on the bids book are considered sells 174 | self.sell_tracker.add(notional=trade_notional) 175 | self.bids.match(msg) 176 | return True 177 | else: # trades matched on the asks book are considered buys 178 | self.buy_tracker.add(notional=trade_notional) 179 | self.asks.match(msg) 180 | return True 181 | 182 | elif message_type == 'change': 183 | if side == 'buy': 184 | self.bids.change(msg) 185 | return True 186 | else: 187 | self.asks.change(msg) 188 | return True 189 | 190 | elif message_type == 'preload': 191 | if side == 'buy': 192 | self.bids.insert_order(msg) 193 | return True 194 | else: 195 | self.asks.insert_order(msg) 196 | return True 197 | 198 | elif message_type == 'load_book': 199 | self.clear_book() 200 | return True 201 | 202 | elif message_type == 'book_loaded': 203 | self.bids.warming_up = self.asks.warming_up = False 204 | LOGGER.info("Book finished loading at {}".format(self.last_tick_time)) 205 | return True 206 | 207 | else: 208 | LOGGER.warn('\n\n\nunhandled message type\n%s\n\n' % str(msg)) 209 | return False 210 | 
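The sequence check in `CoinbaseOrderBook.new_tick()` above can be summarized in a small standalone sketch. The names `SequenceStatus` and `classify_sequence` are hypothetical and are not part of this repository; the sketch only assumes, as the code above does, that consecutive feed messages differ by exactly one.

```python
from enum import Enum


class SequenceStatus(Enum):
    """Hypothetical labels for the three outcomes of the sequence check."""
    IN_ORDER = 'in_order'  # incoming == current + 1: apply the tick
    STALE = 'stale'        # incoming <= current: ignore the tick
    GAP = 'gap'            # incoming > current + 1: messages were missed


def classify_sequence(current: int, incoming: int) -> SequenceStatus:
    """Compare an incoming sequence number with the last one applied."""
    diff = incoming - current
    if diff == 1:
        return SequenceStatus.IN_ORDER
    if diff <= 0:
        return SequenceStatus.STALE
    return SequenceStatus.GAP


if __name__ == '__main__':
    for incoming in (11, 10, 15):
        print(incoming, classify_sequence(current=10, incoming=incoming).value)
```

In the recorder, only the gap case makes `new_tick()` return `False`, which prompts the client to reload the book snapshot; stale ticks are logged and skipped.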
-------------------------------------------------------------------------------- /data_recorder/connector_components/README.md: -------------------------------------------------------------------------------- 1 | # Connector Components 2 | As of September 09, 2019. 3 | 4 | ## 1. Overview 5 | The `connector_components` module contains the base classes for 6 | connecting to crypto exchanges. Each base class is overridden in the 7 | following modules: 8 | - `bitfinex_connector/` 9 | - `coinbase_connector/` 10 | - `bitmex_connector/` 11 | 12 | 13 | ## 2. Classes 14 | 15 | ### 2.1 Book 16 | This class is responsible for maintaining the inventory of all the buy 17 | **or** sell orders by using the `./price_level.py` class. 18 | 19 | ### 2.2 Client 20 | This class is responsible for creating WebSocket connections to an 21 | exchange endpoint. 22 | 23 | ### 2.3 Order Book 24 | This class is responsible for instantiating the `./book.py` class for 25 | both buy and sell orders, thus making an actual order book. 26 | 27 | ### 2.4 Price Level 28 | This class is responsible for keeping track of order inventories at a given 29 | price. This class is instantiated for every price level in the limit 30 | order book. Order flow arrival attributes are reset each time a LOB 31 | snapshot is taken. 32 | 33 | ### 2.5 Trade Tracker 34 | This class is responsible for keeping track of the time and sales 35 | transactional data. Order flow arrival attributes are reset each time a 36 | LOB snapshot is taken. -------------------------------------------------------------------------------- /data_recorder/connector_components/__init__.py: -------------------------------------------------------------------------------- 1 | from .
import * 2 | -------------------------------------------------------------------------------- /data_recorder/connector_components/book.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | 3 | import numpy as np 4 | from sortedcontainers import SortedDict 5 | 6 | from configurations import INCLUDE_ORDERFLOW, MAX_BOOK_ROWS 7 | from data_recorder.connector_components.price_level import PriceLevel 8 | 9 | 10 | class Book(ABC): 11 | CLEAR_MAX_ROWS = MAX_BOOK_ROWS + 45 12 | 13 | def __init__(self, sym: str, side: str): 14 | """ 15 | Book constructor. 16 | 17 | :param sym: currency symbol 18 | :param side: 'bids' or 'asks' 19 | """ 20 | self.price_dict = SortedDict() 21 | self.order_map = dict() 22 | self.side = side 23 | self.sym = sym 24 | self.warming_up = True 25 | # render order book using numpy for faster performance 26 | # LOB statistics 27 | self._distances = np.empty(MAX_BOOK_ROWS, dtype=np.float32) 28 | self._notionals = np.empty(MAX_BOOK_ROWS, dtype=np.float32) 29 | self._cumulative_notionals = np.empty(MAX_BOOK_ROWS, dtype=np.float32) 30 | # order flow arrival statistics 31 | self._cancel_notionals = np.empty(MAX_BOOK_ROWS, dtype=np.float32) 32 | self._limit_notionals = np.empty(MAX_BOOK_ROWS, dtype=np.float32) 33 | self._market_notionals = np.empty(MAX_BOOK_ROWS, dtype=np.float32) 34 | 35 | def __str__(self): 36 | if self.warming_up: 37 | message = 'warming up' 38 | elif self.side == 'asks': 39 | ask_price, ask_price_level = self.get_ask() 40 | message = '{:>8,.2f} x {:>9,.0f} | {:>9,.0f}'.format( 41 | ask_price, 42 | # ask_price_level.quantity, 43 | ask_price_level.notional, 44 | ask_price_level.limit_notional - ask_price_level.cancel_notional + 45 | ask_price_level.market_notional, 46 | ) 47 | else: 48 | bid_price, bid_price_level = self.get_bid() 49 | message = '{:>9,.0f} | {:>9,.0f} x {:>8,.2f}'.format( 50 | bid_price_level.limit_notional - bid_price_level.cancel_notional + 51 | 
bid_price_level.market_notional, 52 | # bid_price_level.quantity, 53 | bid_price_level.notional, 54 | bid_price 55 | ) 56 | return message 57 | 58 | def clear(self) -> None: 59 | """ 60 | Reset price tree and order map. 61 | 62 | :return: void 63 | """ 64 | self.price_dict = SortedDict() 65 | self.order_map = dict() 66 | self.warming_up = True 67 | 68 | def create_price(self, price: float) -> None: 69 | """ 70 | Create new node. 71 | 72 | :param price: price level to create in LOB 73 | :return: 74 | """ 75 | self.price_dict[price] = PriceLevel(price=price, quantity=0.) 76 | 77 | def remove_price(self, price: float) -> None: 78 | """ 79 | Remove node. 80 | 81 | :param price: price level to remove from LOB 82 | :return: 83 | """ 84 | del self.price_dict[price] 85 | 86 | def receive(self, msg) -> None: 87 | """ 88 | add incoming orders to order map. 89 | 90 | :param msg: new message received by exchange 91 | :return: 92 | """ 93 | pass 94 | 95 | @abstractmethod 96 | def insert_order(self, msg: dict) -> None: 97 | """ 98 | Insert order into existing node. 99 | 100 | :param msg: new limit order 101 | :return: 102 | """ 103 | pass 104 | 105 | @abstractmethod 106 | def match(self, msg: dict) -> None: 107 | """ 108 | Change volume of book; used with time and sales data. 109 | 110 | :param msg: buy or sell execution 111 | :return: 112 | """ 113 | pass 114 | 115 | @abstractmethod 116 | def change(self, msg: dict) -> None: 117 | """ 118 | Update inventory. 
119 | 120 | :param msg: update order request 121 | :return: 122 | """ 123 | pass 124 | 125 | @abstractmethod 126 | def remove_order(self, msg: dict) -> None: 127 | """ 128 | Done messages result in the order being removed from map 129 | 130 | :param msg: 131 | :return: 132 | """ 133 | pass 134 | 135 | def get_ask(self) -> (float, PriceLevel): 136 | """ 137 | Best offer 138 | 139 | :return: (float) inside ask, (PriceLevel) ask size and number of orders 140 | """ 141 | if len(self.price_dict) > 0: 142 | return self.price_dict.items()[0] 143 | else: 144 | return 0.0, PriceLevel(price=0., quantity=0.) 145 | 146 | def get_bid(self) -> (float, PriceLevel): 147 | """ 148 | Best bid 149 | 150 | :return: (float) inside bid, (PriceLevel) bid size and number of orders 151 | """ 152 | if len(self.price_dict) > 0: 153 | return self.price_dict.items()[-1] 154 | else: 155 | return 0.0, PriceLevel(price=0., quantity=0.) 156 | 157 | def _add_to_book_trackers(self, 158 | price: float, 159 | midpoint: float, 160 | level: PriceLevel, 161 | cumulative_notional: float, 162 | level_number: int) -> float: 163 | """ 164 | Iterate through LOB and pass cumulative_notional recursively. Implemented in 165 | numpy for speed. 166 | 167 | :param price: raw price of current price-level 'i' in LOB 168 | :param midpoint: midpoint price 169 | :param level: PriceLevel object which stores notionals, quantity, etc. 170 | :param cumulative_notional: cumulative notional value from walking the LOB 171 | :param level_number: level 'i' in LOB 172 | :return: current cumulative notional at level 'i' 173 | """ 174 | # order book stats 175 | self._distances[level_number] = (price / midpoint) - 1. 
176 | self._notionals[level_number] = level.notional 177 | cumulative_notional += level.notional 178 | self._cumulative_notionals[level_number] = cumulative_notional # <- Note: not 179 | # implemented yet 180 | return cumulative_notional 181 | 182 | def _add_to_order_flow_trackers(self, level: PriceLevel, level_number: int) -> None: 183 | """ 184 | Iterate through LOB to capture order arrival statistics. Implemented in numpy 185 | for speed. 186 | 187 | :param level: PriceLevel object which stores notionals, quantity, etc. 188 | :param level_number: level 'i' in LOB 189 | """ 190 | # order flow arrival statistics 191 | self._cancel_notionals[level_number] = level.cancel_notional 192 | self._limit_notionals[level_number] = level.limit_notional 193 | self._market_notionals[level_number] = level.market_notional 194 | 195 | def get_asks_to_list(self, midpoint: float) -> tuple: 196 | """ 197 | Walk the LOB to derive: 198 | 1.) price-level distance to midpoint 199 | 2.) notional value of each price-level 200 | **Optional** 201 | 3.) notional values accumulated since last snapshot for cancel, market, 202 | and limit orders 203 | 204 | :param midpoint: current midpoint 205 | :return: tuple containing derived LOB feature set 206 | """ 207 | cumulative_notional = 0. 208 | book_rows_to_clear = Book.CLEAR_MAX_ROWS if INCLUDE_ORDERFLOW else MAX_BOOK_ROWS 209 | 210 | for i, (price, level) in enumerate(self.price_dict.items()[:book_rows_to_clear]): 211 | if i < MAX_BOOK_ROWS: 212 | # only include price levels that are within the specified range 213 | cumulative_notional = self._add_to_book_trackers( 214 | price=price, midpoint=midpoint, level=level, 215 | cumulative_notional=cumulative_notional, level_number=i 216 | ) 217 | self._add_to_order_flow_trackers(level=level, level_number=i) 218 | # clear the trackers on nearby price levels in case of price jumps, 219 | # but do not clear all the price levels in the LOB to save time. 
220 | level.clear_trackers() # to prevent orders close to the top-n from 221 | 222 | # append all the data points together 223 | book_data = (self._distances, self._notionals,) 224 | 225 | # include order flow arrival statistics 226 | if INCLUDE_ORDERFLOW: 227 | book_data += (self._cancel_notionals, self._limit_notionals, 228 | self._market_notionals,) 229 | 230 | return book_data 231 | 232 | def get_bids_to_list(self, midpoint: float) -> tuple: 233 | """ 234 | Walk the LOB to derive: 235 | 1.) price-level distance to midpoint 236 | 2.) notional value of each price-level 237 | **Optional** 238 | 3.) notional values accumulated since last snapshot for cancel, market, 239 | and limit orders 240 | 241 | Note: currently configured to return all data slices in ascending order 242 | (not mirroring) 243 | 244 | :param midpoint: current midpoint 245 | :return: tuple containing derived LOB feature set 246 | """ 247 | cumulative_notional = 0. 248 | book_rows_to_clear = Book.CLEAR_MAX_ROWS if INCLUDE_ORDERFLOW else MAX_BOOK_ROWS 249 | 250 | for i, (price, level) in enumerate( 251 | reversed(self.price_dict.items()[-book_rows_to_clear:])): 252 | 253 | if i < MAX_BOOK_ROWS: 254 | # only include price levels that are within the specified range 255 | cumulative_notional = self._add_to_book_trackers( 256 | # price, midpoint, level, cumulative_notional, MAX_BOOK_ROWS - i - 1 257 | price=price, midpoint=midpoint, level=level, 258 | cumulative_notional=cumulative_notional, level_number=i 259 | ) 260 | # self._add_to_order_flow_trackers(level, MAX_BOOK_ROWS - i - 1) 261 | self._add_to_order_flow_trackers(level=level, level_number=i) 262 | # clear the trackers on nearby price levels in case of price jumps, 263 | # but do not clear all the price levels in the LOB to save time. 
264 | level.clear_trackers() # to prevent orders close to the top-n from 265 | 266 | # append all the data points together 267 | book_data = (self._distances, self._notionals,) 268 | 269 | # include order flow arrival statistics 270 | if INCLUDE_ORDERFLOW: 271 | book_data += (self._cancel_notionals, self._limit_notionals, 272 | self._market_notionals,) 273 | 274 | return book_data 275 | -------------------------------------------------------------------------------- /data_recorder/connector_components/client.py: -------------------------------------------------------------------------------- 1 | import json 2 | import time 3 | from abc import ABC, abstractmethod 4 | from datetime import datetime as dt 5 | from multiprocessing import Queue 6 | from threading import Thread # , Timer 7 | 8 | import websockets 9 | 10 | from configurations import LOGGER, MAX_RECONNECTION_ATTEMPTS, TIMEZONE # , SNAPSHOT_RATE 11 | 12 | 13 | class Client(Thread, ABC): 14 | 15 | def __init__(self, sym: str, exchange: str): 16 | """ 17 | Client constructor. 18 | 19 | :param sym: currency symbol 20 | :param exchange: 'bitfinex' or 'coinbase' or 'bitmex' 21 | """ 22 | super(Client, self).__init__(name=sym, daemon=True) 23 | self.sym = sym 24 | self.exchange = exchange 25 | self.retry_counter = 0 26 | self.max_retries = MAX_RECONNECTION_ATTEMPTS 27 | self.last_subscribe_time = None 28 | self.last_worker_time = None 29 | self.queue = Queue(maxsize=0) 30 | # Attributes that get overridden in sub-classes 31 | self.ws = None 32 | self.ws_endpoint = None 33 | self.request = self.trades_request = None 34 | self.request_unsubscribe = None 35 | self.book = None 36 | LOGGER.info('%s client instantiated.' % self.exchange.upper()) 37 | 38 | async def subscribe(self) -> None: 39 | """ 40 | Subscribe to full order book. 
41 | """ 42 | try: 43 | self.ws = await websockets.connect(self.ws_endpoint) 44 | 45 | if self.request is not None: 46 | LOGGER.info('Requesting Book: {}'.format(self.request)) 47 | await self.ws.send(self.request) 48 | LOGGER.info('BOOK %s: %s subscription request sent.' % 49 | (self.exchange.upper(), self.sym)) 50 | 51 | if self.trades_request is not None: 52 | LOGGER.info('Requesting Trades: {}'.format(self.trades_request)) 53 | await self.ws.send(self.trades_request) 54 | LOGGER.info('TRADES %s: %s subscription request sent.' % 55 | (self.exchange.upper(), self.sym)) 56 | 57 | self.last_subscribe_time = dt.now(tz=TIMEZONE) 58 | 59 | # Add incoming messages to a queue, which is consumed and processed 60 | # in the run() method. 61 | while True: 62 | self.queue.put(json.loads(await self.ws.recv())) 63 | 64 | except websockets.ConnectionClosed as exception: 65 | LOGGER.warn('%s: subscription exception %s' % (self.exchange, exception)) 66 | self.retry_counter += 1 67 | elapsed = (dt.now(tz=TIMEZONE) - self.last_subscribe_time).seconds 68 | 69 | if elapsed < 10: 70 | sleep_time = max(10 - elapsed, 1) 71 | time.sleep(sleep_time) 72 | LOGGER.info('%s - %s is sleeping %i seconds...' % 73 | (self.exchange, self.sym, sleep_time)) 74 | 75 | if self.retry_counter < self.max_retries: 76 | LOGGER.info('%s: Retrying to connect... attempted #%i' % 77 | (self.exchange, self.retry_counter)) 78 | await self.subscribe() # recursion 79 | else: 80 | LOGGER.warn('%s: %s Ran out of reconnection attempts. ' 81 | 'Have already tried %i times.' % 82 | (self.exchange, self.sym, self.retry_counter)) 83 | 84 | async def unsubscribe(self) -> None: 85 | """ 86 | Unsubscribe limit order book WebSocket from exchange. 87 | """ 88 | LOGGER.info('Client - %s sending unsubscribe request for %s.' % 89 | (self.exchange.upper(), self.sym)) 90 | 91 | await self.ws.send(self.request_unsubscribe) 92 | output = json.loads(await self.ws.recv()) 93 | 94 | LOGGER.info('Client - %s: unsubscribe successful.' 
% (self.exchange.upper())) 95 | LOGGER.info('unsubscribe() -> Output:') 96 | LOGGER.info(output) 97 | 98 | @abstractmethod 99 | def run(self) -> None: 100 | """ 101 | Thread to override in Coinbase or Bitfinex or Bitmex implementation class. 102 | """ 103 | LOGGER.info("run() initiated on : {}".format(self.name)) 104 | self.last_worker_time = dt.now() 105 | # Used for debugging exchanges individually 106 | # Timer(4.0, _timer_worker, args=(self.book, self.last_worker_time,)).start() 107 | 108 | # from data_recorder.connector_components.orderbook import OrderBook 109 | 110 | # Used for debugging exchanges individually 111 | # def _timer_worker(orderbook: OrderBook, last_worker_time: dt) -> None: 112 | # """ 113 | # Thread worker to be invoked every N seconds 114 | # (e.g., configs.SNAPSHOT_RATE) 115 | # 116 | # :param orderbook: OrderBook 117 | # :return: void 118 | # """ 119 | # now = dt.now() 120 | # delta = now - last_worker_time 121 | # print('\n{} - {} with delta {}\n{}'.format(orderbook.sym, now, delta.microseconds, 122 | # orderbook)) 123 | # last_worker_time = now 124 | # 125 | # Timer(SNAPSHOT_RATE, _timer_worker, args=(orderbook, last_worker_time,)).start() 126 | # 127 | # if orderbook.done_warming_up: 128 | # """ 129 | # This is the place to insert a trading model. 130 | # You'll have to create your own. 131 | # 132 | # Example: 133 | # orderbook_data = tuple(coinbaseClient.book, bitfinexClient.book) 134 | # model = agent.dqn.Agent() 135 | # fix_api = SomeFixAPI() 136 | # action = model(orderbook_data) 137 | # if action is buy: 138 | # buy_order = create_order(pair, price, etc.) 
139 | # fix_api.send_order(buy_order) 140 | # 141 | # """ 142 | # _ = orderbook.render_book() 143 | -------------------------------------------------------------------------------- /data_recorder/connector_components/orderbook.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | 3 | import numpy as np 4 | 5 | from configurations import INCLUDE_ORDERFLOW, LOGGER, MAX_BOOK_ROWS 6 | from data_recorder.bitfinex_connector.bitfinex_book import BitfinexBook 7 | from data_recorder.coinbase_connector.coinbase_book import CoinbaseBook 8 | from data_recorder.connector_components.trade_tracker import TradeTracker 9 | from data_recorder.database.database import Database 10 | 11 | BOOK_BY_EXCHANGE = dict(coinbase=CoinbaseBook, bitfinex=BitfinexBook) 12 | 13 | 14 | class OrderBook(ABC): 15 | 16 | def __init__(self, sym: str, exchange: str): 17 | """ 18 | OrderBook constructor. 19 | 20 | :param sym: instrument name 21 | :param exchange: 'coinbase' or 'bitfinex' or 'bitmex' 22 | """ 23 | self.sym = sym 24 | self.db = Database(sym=sym, exchange=exchange) 25 | self.db.init_db_connection() 26 | self.bids = BOOK_BY_EXCHANGE[exchange](sym=sym, side='bids') 27 | self.asks = BOOK_BY_EXCHANGE[exchange](sym=sym, side='asks') 28 | self.exchange = exchange 29 | self.midpoint = float() 30 | self.spread = float() 31 | self.buy_tracker = TradeTracker() 32 | self.sell_tracker = TradeTracker() 33 | self.last_tick_time = None 34 | 35 | def __str__(self): 36 | return '{:>8,.0f} <> {} || {} <> {:>8,.0f}'.format( 37 | self.sell_tracker.notional, self.bids, self.asks, self.buy_tracker.notional) 38 | 39 | @abstractmethod 40 | def new_tick(self, msg: dict) -> bool: 41 | """ 42 | Event handler for incoming tick messages. 
43 | 44 | :param msg: incoming order or trade message 45 | :return: FALSE if reconnection to WebSocket is needed, else TRUE if good 46 | """ 47 | return True 48 | 49 | def clear_trade_trackers(self) -> None: 50 | """ 51 | Reset buy and sell trade trackers; used between LOB snapshots. 52 | 53 | :return: (void) 54 | """ 55 | self.buy_tracker.clear() 56 | self.sell_tracker.clear() 57 | 58 | def clear_book(self) -> None: 59 | """ 60 | Method to reset the limit order book. 61 | 62 | :return: (void) 63 | """ 64 | self.bids.clear() # warming_up flag reset in `Position` class 65 | self.asks.clear() # warming_up flag reset in `Position` class 66 | self.last_tick_time = None 67 | LOGGER.info(f"{self.sym}'s order book cleared.") 68 | 69 | def render_book(self) -> np.ndarray: 70 | """ 71 | Create stationary feature set for limit order book. 72 | 73 | :return: LOB feature set 74 | """ 75 | # get price levels of LOB 76 | bid_price, bid_level = self.bids.get_bid() 77 | ask_price, ask_level = self.asks.get_ask() 78 | 79 | # derive midpoint price and spread from bid and ask data 80 | self.midpoint = (ask_price + bid_price) / 2.0 81 | self.spread = round(ask_price - bid_price, 4) # round to clean float rounding 82 | 83 | # transform raw LOB data into stationary feature set 84 | bid_data = self.bids.get_bids_to_list(midpoint=self.midpoint) 85 | ask_data = self.asks.get_asks_to_list(midpoint=self.midpoint) 86 | 87 | # convert buy and sell trade notional values to an array 88 | buy_trades = np.array(self.buy_tracker.notional) 89 | sell_trades = np.array(self.sell_tracker.notional) 90 | 91 | # reset trackers after each LOB render 92 | self.clear_trade_trackers() 93 | 94 | return np.hstack((self.midpoint, self.spread, buy_trades, sell_trades, 95 | *bid_data, *ask_data)) 96 | 97 | @staticmethod 98 | def render_lob_feature_names(include_orderflow: bool = INCLUDE_ORDERFLOW) -> list: 99 | """ 100 | Get the column names for the LOB render features. 
101 | 102 | :param include_orderflow: if TRUE, order flow imbalance stats are included in set 103 | :return: list containing feature names 104 | """ 105 | feature_names = list() 106 | 107 | feature_names.append('midpoint') 108 | feature_names.append('spread') 109 | feature_names.append('buys') 110 | feature_names.append('sells') 111 | 112 | feature_types = ['distance', 'notional'] 113 | if include_orderflow: 114 | feature_types += ['cancel_notional', 'limit_notional', 'market_notional'] 115 | 116 | for side in ['bids', 'asks']: 117 | for feature in feature_types: 118 | for row in range(MAX_BOOK_ROWS): 119 | feature_names.append(f"{side}_{feature}_{row}") 120 | 121 | LOGGER.info(f"render_lob_feature_names() has {len(feature_names)} features") 122 | 123 | return feature_names 124 | 125 | @property 126 | def best_bid(self) -> float: 127 | """ 128 | Get the best bid. 129 | 130 | :return: float best bid 131 | """ 132 | return self.bids.get_bid() 133 | 134 | @property 135 | def best_ask(self) -> float: 136 | """ 137 | Get the best ask. 138 | 139 | :return: float best ask 140 | """ 141 | return self.asks.get_ask() 142 | 143 | @property 144 | def done_warming_up(self) -> bool: 145 | """ 146 | Flag to indicate if the entire Limit Order Book has been loaded. 147 | 148 | :return: True if loaded / False if still waiting to download 149 | """ 150 | return self.bids.warming_up is False and self.asks.warming_up is False 151 | -------------------------------------------------------------------------------- /data_recorder/connector_components/price_level.py: -------------------------------------------------------------------------------- 1 | class PriceLevel(object): 2 | 3 | def __init__(self, price: float, quantity: float): 4 | """ 5 | PriceLevel constructor.
6 | 7 | :param price: LOB adjusted price level 8 | :param quantity: total quantity available at the price 9 | """ 10 | # Core price level attributes 11 | self._price = price # adjusted price level in LOB 12 | self._quantity = quantity # total order size 13 | self._count = 0 # total number of orders 14 | self._notional = 0. # total notional value of orders at price level 15 | # Trackers for order flow 16 | # Inspired by https://arxiv.org/abs/1907.06230v1 17 | self._limit_count = 0 18 | self._limit_quantity = 0. 19 | self._limit_notional = 0. 20 | self._market_count = 0 21 | self._market_quantity = 0. 22 | self._market_notional = 0. 23 | self._cancel_count = 0 24 | self._cancel_quantity = 0. 25 | self._cancel_notional = 0. 26 | 27 | def __str__(self): 28 | level_info = 'PriceLevel: [price={} | quantity={} | notional={}] \n'.format( 29 | self._price, self._quantity, self.notional) 30 | order_flow_info = ('_limit_count={} | _limit_quantity={} | _' 31 | 'market_count={} | ').format( 32 | self._limit_count, self._limit_quantity, self._market_count) 33 | order_flow_info += ('_market_quantity={} | _cancel_count={} | _' 34 | 'cancel_quantity={}').format( 35 | self._market_quantity, self._cancel_count, self._cancel_quantity) 36 | return level_info + order_flow_info 37 | 38 | @property 39 | def price(self) -> float: 40 | """ 41 | Adjusted price of level in LOB. 42 | 43 | :return: price (possibly rounded price, if enabled) of price level 44 | """ 45 | return self._price 46 | 47 | @property 48 | def quantity(self) -> float: 49 | """ 50 | Total order size. 51 | 52 | :return: number of units at price level 53 | """ 54 | return self._quantity 55 | 56 | @property 57 | def count(self) -> int: 58 | """ 59 | Total number of orders. 60 | 61 | :return: number of orders at price level 62 | """ 63 | return self._count 64 | 65 | @property 66 | def notional(self) -> float: 67 | """ 68 | Total notional value of the price level.
69 | 70 | :return: notional value of price level 71 | """ 72 | return round(self._notional, 2) 73 | 74 | @property 75 | def limit_notional(self) -> float: 76 | """ 77 | Total value of incoming limit orders added at the price level. 78 | 79 | :return: notional value of new limit orders received since last `clear_trackers()` 80 | """ 81 | return round(self._limit_notional, 2) 82 | 83 | @property 84 | def market_notional(self) -> float: 85 | """ 86 | Total value of incoming market orders at the price level. 87 | 88 | :return: notional value of market orders received since last `clear_trackers()` 89 | """ 90 | return round(self._market_notional, 2) 91 | 92 | @property 93 | def cancel_notional(self) -> float: 94 | """ 95 | Total value of incoming cancel orders at the price level. 96 | 97 | :return: notional value of cancel orders received since last `clear_trackers()` 98 | """ 99 | return round(self._cancel_notional, 2) 100 | 101 | def add_quantity(self, quantity=0.5, price=100.) -> None: 102 | """ 103 | Add more orders to a given price level. 104 | 105 | :param quantity: order size 106 | :param price: order price 107 | """ 108 | self._quantity += quantity 109 | self._notional += quantity * price 110 | 111 | def remove_quantity(self, quantity=0.5, price=100.) -> None: 112 | """ 113 | Remove orders from a given price level. 114 | 115 | :param quantity: order size 116 | :param price: order price 117 | """ 118 | self._quantity -= quantity 119 | self._notional -= quantity * price 120 | 121 | def add_count(self) -> None: 122 | """ 123 | Increment the number of orders at the price level. 124 | """ 125 | self._count += 1 126 | 127 | def remove_count(self) -> None: 128 | """ 129 | Decrement the number of orders at the price level. 130 | """ 131 | self._count -= 1 132 | 133 | def clear_trackers(self) -> None: 134 | """ 135 | Reset all trackers back to zero at the start of a new LOB snapshot interval. 136 | """ 137 | self._limit_count = 0 138 | self._limit_quantity = 0.
139 | self._limit_notional = 0. 140 | self._market_count = 0 141 | self._market_quantity = 0. 142 | self._market_notional = 0. 143 | self._cancel_count = 0 144 | self._cancel_quantity = 0. 145 | self._cancel_notional = 0. 146 | 147 | def add_limit(self, quantity: float, price: float) -> None: 148 | """ 149 | Add new incoming limit order to trackers. 150 | 151 | :param quantity: order size 152 | :param price: order price 153 | """ 154 | self._limit_count += 1 155 | self._limit_quantity += quantity 156 | self._limit_notional += quantity * price 157 | 158 | def add_market(self, quantity: float, price: float) -> None: 159 | """ 160 | Add new incoming market order to trackers. 161 | 162 | :param quantity: order size 163 | :param price: order price 164 | """ 165 | self._market_count += 1 166 | self._market_quantity += quantity 167 | self._market_notional += quantity * price 168 | 169 | def add_cancel(self, quantity: float, price: float) -> None: 170 | """ 171 | Add new incoming cancel order to trackers. 172 | 173 | :param quantity: order size 174 | :param price: order price 175 | """ 176 | self._cancel_count += 1 177 | self._cancel_quantity += quantity 178 | self._cancel_notional += quantity * price 179 | 180 | def set_notional(self, notional: float) -> None: 181 | """ 182 | Set the notional value of the price level. 183 | 184 | :param notional: notional value (# of units * price) 185 | """ 186 | self._notional = notional 187 | 188 | def add_limit_notional(self, notional: float) -> None: 189 | """ 190 | Add a limit order's notional value to the cumulative sum of notional values 191 | for all the limit orders received at the price level. 
192 | 193 | :param notional: notional value (# of units * price) 194 | """ 195 | self._limit_notional += notional 196 | 197 | def add_cancel_notional(self, notional: float) -> None: 198 | """ 199 | Add a cancel limit order's notional value to the cumulative sum of notional 200 | values for all the cancelled limit orders received at the price level. 201 | 202 | :param notional: notional value (# of units * price) 203 | """ 204 | self._cancel_notional += notional 205 | -------------------------------------------------------------------------------- /data_recorder/connector_components/trade_tracker.py: -------------------------------------------------------------------------------- 1 | class TradeTracker(object): 2 | 3 | def __init__(self): 4 | """ 5 | Constructor. 6 | """ 7 | self._notional = 0. 8 | self._count = 0 9 | 10 | def __str__(self): 11 | return 'TradeTracker: [notional={} | count={}]'.format( 12 | self._notional, self._count) 13 | 14 | @property 15 | def notional(self) -> float: 16 | """ 17 | Total notional value of transactions since last TradeTracker.clear(). 18 | 19 | Example: 20 | notional = price * quantity 21 | 22 | :return: notional value 23 | """ 24 | return self._notional 25 | 26 | @property 27 | def count(self) -> int: 28 | """ 29 | Total number of transactions since last TradeTracker.clear(). 30 | 31 | :return: count of transactions 32 | """ 33 | return self._count 34 | 35 | def clear(self) -> None: 36 | """ 37 | Reset the trade values for notional and count to zero (intended to be called 38 | every time step). 39 | 40 | :return: (void) 41 | """ 42 | self._notional = 0. 43 | self._count = 0 44 | 45 | def add(self, notional: float) -> None: 46 | """ 47 | Add a trade's notional value to the cumulative sum and counts of transactions 48 | since last TradeTracker.clear(). 
49 | 50 | :param notional: notional value of transaction 51 | :return: (void) 52 | """ 53 | self._notional += notional 54 | self._count += 1 55 | 56 | def remove(self, notional: float) -> None: 57 | """ 58 | Remove a trade's notional value from the cumulative sum and counts of 59 | transactions since last 60 | TradeTracker.clear(). 61 | 62 | :param notional: notional value of transaction 63 | :return: (void) 64 | """ 65 | self._notional -= notional 66 | self._count -= 1 67 | -------------------------------------------------------------------------------- /data_recorder/database/README.md: -------------------------------------------------------------------------------- 1 | # Database 2 | As of December 12, 2019. 3 | 4 | ## 1. Overview 5 | The `database` module contains three files: 6 | - `database.py`: a wrapper class for storing tick data in, and querying it from, the `Arctic Tick Store`. 7 | - `simulator.py`: a class to replay and export recorded tick data. 8 | - `viz.py`: a class to plot order book snapshot data exported by `simulator.py`. 9 | 10 | 11 | ## 2. Classes 12 | 13 | ### 2.1 Database 14 | This is a wrapper class used for storing streaming tick data into the 15 | `Arctic Tick Store`. 16 | 17 | - The `new_tick()` method is used to persist data to Arctic and it is 18 | implemented in both `bitfinex_connector` and `coinbase_connector` 19 | projects. 20 | - The `init_db_connection` method establishes a connection with MongoDB. 21 | - The `get_tick_history` method is used to query Arctic and return its 22 | `cursor` in the form of a `pd.DataFrame`; it is implemented in 23 | `database.py`. 24 | 25 | ### 2.2 Simulator 26 | This is a utility class to replay historical data, and export order book 27 | snapshots to an xz-compressed csv. 28 | 29 | An example of how to export LOB snapshots to a csv is below: 30 | 31 | ``` 32 | # Find XBTUSD ticks from BitMEX between April 06, 2019 and April 07, 2019.
33 | query = { 34 | 'ccy': ['XBTUSD'], 35 | 'start_date': 20190406, 36 | 'end_date': 20190407 37 | } 38 | 39 | 40 | # Or, find LTC-USD ticks from Coinbase Pro AND Bitfinex exchange between April 06, 2019 and April 07, 2019. 41 | query = { 42 | 'ccy': ['LTC-USD', 'tLTCUSD'], 43 | 'start_date': 20190406, 44 | 'end_date': 20190407 45 | } 46 | 47 | 48 | sim = Simulator() 49 | sim.extract_features(query) 50 | 51 | # Done ! 52 | ``` 53 | 54 | ### 2.3 Viz 55 | This is a utility class to plot the features data exported from 56 | `simulator.py` 57 | 58 | Example diagrams using 500-millisecond snapshots of ETH-USD's limit 59 | order book are below: 60 | 61 | `plot_lob_overlay` 62 | ![plot_lob_overlay](../../design_patterns/plot_lob_overlay.png) 63 | 64 | `plot_lob_levels` 65 | ![plot_lob_levels](../../design_patterns/plot_lob_levels.png) 66 | 67 | `plot_transactions` 68 | ![plot_transactions](../../design_patterns/plot_transactions.png) 69 | 70 | `plot_order_arrivals` 71 | ![plot_order_arrivals](../../design_patterns/plot_order_arrivals.png) 72 | 73 | 74 | -------------------------------------------------------------------------------- /data_recorder/database/__init__.py: -------------------------------------------------------------------------------- 1 | from . 
import * 2 | -------------------------------------------------------------------------------- /data_recorder/database/data_exports/demo_LTC-USD_20190926.csv.xz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sadighian/crypto-rl/078081e5715cadeae9c798a3d759c9d59d2041bc/data_recorder/database/data_exports/demo_LTC-USD_20190926.csv.xz -------------------------------------------------------------------------------- /data_recorder/database/database.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime as dt 2 | from typing import Union 3 | 4 | import numpy as np 5 | import pandas as pd 6 | from arctic import Arctic, TICK_STORE 7 | from arctic.date import DateRange 8 | from pymongo.errors import PyMongoError 9 | 10 | from configurations import ( 11 | ARCTIC_NAME, BATCH_SIZE, LOGGER, MONGO_ENDPOINT, RECORD_DATA, TIMEZONE, 12 | ) 13 | 14 | 15 | class Database(object): 16 | 17 | def __init__(self, sym: str, exchange: str, record_data: bool = RECORD_DATA): 18 | """ 19 | Database constructor. 20 | """ 21 | self.counter = 0 22 | self.data = list() 23 | self.tz = TIMEZONE 24 | self.sym = sym 25 | self.exchange = exchange 26 | self.recording = record_data 27 | self.db = self.collection = None 28 | if self.recording: 29 | LOGGER.info('\nDatabase: [%s is recording %s]\n' % (self.exchange, self.sym)) 30 | 31 | def init_db_connection(self) -> None: 32 | """ 33 | Initiate database connection to Arctic. 
34 | 35 | :return: (void) 36 | """ 37 | LOGGER.info("init_db_connection for {}...".format(self.sym)) 38 | try: 39 | self.db = Arctic(MONGO_ENDPOINT) 40 | self.db.initialize_library(ARCTIC_NAME, lib_type=TICK_STORE) 41 | self.collection = self.db[ARCTIC_NAME] 42 | except PyMongoError as e: 43 | LOGGER.warn("Database.PyMongoError() --> {}".format(e)) 44 | 45 | def new_tick(self, msg: dict) -> None: 46 | """ 47 | If RECORD_DATA is TRUE, add streaming ticks to a list 48 | After the list has accumulated BATCH_SIZE ticks, insert batch into 49 | the Arctic Tick Store. 50 | 51 | :param msg: incoming tick 52 | :return: void 53 | """ 54 | 55 | if self.recording is False: 56 | return 57 | 58 | self.counter += 1 59 | msg['index'] = dt.now(tz=self.tz) 60 | msg['system_time'] = str(msg['index']) 61 | self.data.append(msg) 62 | if self.counter % BATCH_SIZE == 0: 63 | self.collection.write(self.sym, self.data) 64 | LOGGER.info('{} added {} msgs to Arctic'.format(self.sym, self.counter)) 65 | self.counter = 0 66 | self.data.clear() 67 | 68 | def _query_arctic(self, 69 | ccy: str, 70 | start_date: int, 71 | end_date: int) -> Union[pd.DataFrame, None]: 72 | """ 73 | Query database and return LOB messages starting from LOB reconstruction. 74 | 75 | :param ccy: currency symbol 76 | :param start_date: YYYYMMDD start date 77 | :param end_date: YYYYMMDD end date 78 | :return: (pd.DataFrame) results found in database 79 | """ 80 | assert self.collection is not None, \ 81 | "Arctic.Collection() must not be null." 
82 | 83 | start_time = dt.now(tz=self.tz) 84 | 85 | try: 86 | LOGGER.info('\nGetting {} data from Arctic Tick Store...'.format(ccy)) 87 | cursor = self.collection.read(symbol=ccy, 88 | date_range=DateRange(start_date, end_date)) 89 | 90 | # filter ticks for the first LOAD_BOOK message 91 | # (starting point for order book reconstruction) 92 | # min_datetime = cursor.loc[cursor.type == 'load_book'].index[0] 93 | dates = np.unique(cursor.loc[cursor.type == 'load_book'].index.date) 94 | start_index = cursor.loc[((cursor.index.date == dates[0]) & 95 | (cursor.type == 'load_book'))].index[-1] 96 | # cursor = cursor.loc[cursor.index >= min_datetime] 97 | cursor = cursor.loc[cursor.index >= start_index] 98 | 99 | elapsed = (dt.now(tz=self.tz) - start_time).seconds 100 | LOGGER.info('Completed querying %i %s records in %i seconds' % 101 | (cursor.shape[0], ccy, elapsed)) 102 | 103 | except Exception as ex: 104 | cursor = None 105 | LOGGER.warning('Database._query_arctic() threw an exception: \n%s' % str(ex)) 106 | 107 | return cursor 108 | 109 | def get_tick_history(self, query: dict) -> Union[pd.DataFrame, None]: 110 | """ 111 | Function to query the Arctic Tick Store and... 112 | 1. Return the specified historical data for a given set of securities 113 | over a specified amount of time 114 | 2. Return the results as a pd.DataFrame, or None if the query fails or 115 | finds nothing 116 | 117 | :param query: (dict) of the query parameters 118 | - ccy: list of symbols 119 | - start_date: int YYYYMMDD start date 120 | - end_date: int YYYYMMDD end date 121 | :return: (pd.DataFrame) recorded ticks, or None if nothing was returned 122 | """ 123 | start_time = dt.now(tz=self.tz) 124 | 125 | assert self.recording is False, "RECORD_DATA must be set to FALSE to replay data" 126 | cursor = self._query_arctic(**query) 127 | if cursor is None: 128 | LOGGER.info('\nNothing returned from Arctic for the query: %s\n...Exiting...'
129 | % str(query)) 130 | return 131 | 132 | elapsed = (dt.now(tz=self.tz) - start_time).seconds 133 | LOGGER.info('***Completed get_tick_history() in %i seconds***' % elapsed) 134 | 135 | return cursor 136 | -------------------------------------------------------------------------------- /data_recorder/database/simulator.py: -------------------------------------------------------------------------------- 1 | import os 2 | from datetime import datetime as dt 3 | from datetime import timedelta 4 | from typing import Type, Union 5 | 6 | import numpy as np 7 | import pandas as pd 8 | from dateutil.parser import parse 9 | 10 | from configurations import DATA_PATH, LOGGER, SNAPSHOT_RATE_IN_MICROSECONDS, TIMEZONE 11 | from data_recorder.bitfinex_connector.bitfinex_orderbook import BitfinexOrderBook 12 | from data_recorder.coinbase_connector.coinbase_orderbook import CoinbaseOrderBook 13 | from data_recorder.database.database import Database 14 | 15 | DATA_EXPORTS_PATH = DATA_PATH 16 | 17 | 18 | def _get_exchange_from_symbol(symbol: str) -> str: 19 | """ 20 | Get exchange name given an instrument name. 21 | 22 | :param symbol: instrument name or currency pair 23 | :return: exchange name 24 | """ 25 | symbol_inventory = dict({ 26 | 'BTC-USD': 'coinbase', 27 | 'ETH-USD': 'coinbase', 28 | 'LTC-USD': 'coinbase', 29 | 'tBTCUSD': 'bitfinex', 30 | 'tETHUSD': 'bitfinex', 31 | 'tLTCUSD': 'bitfinex', 32 | 'XBTUSD': 'bitmex', 33 | 'XETUSD': 'bitmex', 34 | }) 35 | return symbol_inventory[symbol] 36 | 37 | 38 | def _get_orderbook_from_exchange(exchange: str) -> \ 39 | Type[Union[CoinbaseOrderBook, BitfinexOrderBook]]: 40 | """ 41 | Get order book given an exchange name. 
42 | 43 | :param exchange: name of exchange ['bitfinex' or 'coinbase'] 44 | :return: order book for 'exchange' 45 | """ 46 | return dict(coinbase=CoinbaseOrderBook, bitfinex=BitfinexOrderBook)[exchange] 47 | 48 | 49 | def get_orderbook_from_symbol(symbol: str) -> \ 50 | Type[Union[CoinbaseOrderBook, BitfinexOrderBook]]: 51 | """ 52 | Get order book given an instrument name. 53 | 54 | :param symbol: instrument name 55 | :return: order book for 'symbol' 56 | """ 57 | return _get_orderbook_from_exchange(exchange=_get_exchange_from_symbol(symbol=symbol)) 58 | 59 | 60 | class Simulator(object): 61 | 62 | def __init__(self): 63 | """ 64 | Simulator constructor. 65 | """ 66 | self.cwd = os.path.dirname(os.path.realpath(__file__)) 67 | self.db = Database(sym='None', exchange='None', record_data=False) 68 | 69 | def __str__(self): 70 | return 'Simulator: [ db={} ]'.format(self.db) 71 | 72 | @staticmethod 73 | def export_to_csv(data: pd.DataFrame, 74 | filename: str = 'BTC-USD_2019-01-01', 75 | compress: bool = True) -> None: 76 | """ 77 | Export data within a Panda DataFrame to a csv. 78 | 79 | :param data: (panda.DataFrame) historical tick data 80 | :param filename: CCY_YYYY-MM-DD 81 | :param compress: Default True. If True, compress with xz 82 | """ 83 | start_time = dt.now(tz=TIMEZONE) 84 | 85 | sub_folder = os.path.join(DATA_PATH, filename) + '.csv' 86 | 87 | if compress: 88 | sub_folder += '.xz' 89 | data.to_csv(path_or_buf=sub_folder, index=False, compression='xz') 90 | else: 91 | data.to_csv(path_or_buf=sub_folder, index=False) 92 | 93 | elapsed = (dt.now(tz=TIMEZONE) - start_time).seconds 94 | LOGGER.info('Exported %s with %i rows in %i seconds' % 95 | (sub_folder, data.shape[0], elapsed)) 96 | 97 | @staticmethod 98 | def get_ema_labels(features_list: list, ema_list: list, include_system_time: bool): 99 | """ 100 | Get a list of column labels for EMA values in a list. 
101 | """ 102 | assert isinstance(ema_list, list) is True, \ 103 | "Error: EMA_LIST must be a list data type, not {}".format(type(ema_list)) 104 | 105 | ema_labels = list() 106 | 107 | for ema in ema_list: 108 | for col in features_list: 109 | if col == 'system_time': 110 | continue 111 | ema_labels.append('{}_{}'.format(col, ema)) 112 | 113 | if include_system_time: 114 | ema_labels.insert(0, 'system_time') 115 | 116 | return ema_labels 117 | 118 | @staticmethod 119 | def _get_microsecond_delta(new_tick_time: dt, last_snapshot_time: dt) -> int: 120 | """ 121 | Calculate the difference between an incoming tick and the last LOB snapshot. 122 | 123 | Note: the timedelta's `days` component is ignored, so only deltas of less than one day are tracked. 124 | 125 | :param new_tick_time: datetime of incoming tick 126 | :param last_snapshot_time: datetime of last LOB snapshot 127 | :return: (int) microseconds between the two times, or -1 if out of sequence 128 | """ 129 | 130 | if last_snapshot_time > new_tick_time: 131 | return -1 132 | 133 | snapshot_tick_time_delta = new_tick_time - last_snapshot_time 134 | seconds = snapshot_tick_time_delta.seconds * 1000000 135 | microseconds = snapshot_tick_time_delta.microseconds 136 | 137 | return seconds + microseconds 138 | 139 | def get_orderbook_snapshot_history(self, query: dict) -> Union[pd.DataFrame, None]: 140 | """ 141 | Function to replay historical market data and generate the features used for 142 | reinforcement learning & training. 143 | 144 | NOTE: 145 | The query can either be a single Coinbase CCY, or both Coinbase and Bitfinex, 146 | but it cannot be only a Bitfinex CCY. Later releases of this repo will 147 | support Bitfinex only order book reconstruction.
148 | 149 | :param query: (dict) query for finding tick history in Arctic TickStore 150 | :return: (pd.DataFrame) snapshots of limit order books using a 151 | stationary feature set 152 | """ 153 | self.db.init_db_connection() 154 | 155 | tick_history = self.db.get_tick_history(query=query) 156 | if tick_history is None: 157 | LOGGER.warning("Query returned no data: {}".format(query)) 158 | return None 159 | 160 | loop_length = tick_history.shape[0] 161 | 162 | # number of milliseconds between LOB snapshots 163 | snapshot_interval_milliseconds = SNAPSHOT_RATE_IN_MICROSECONDS // 1000 164 | 165 | snapshot_list = list() 166 | last_snapshot_time = None 167 | tick_types_for_warm_up = {'load_book', 'book_loaded', 'preload'} 168 | 169 | instrument_name = query['ccy'][0] 170 | assert isinstance(instrument_name, str), \ 171 | "Error: instrument_name must be a string, not -> {}".format( 172 | type(instrument_name)) 173 | 174 | LOGGER.info('querying {}'.format(instrument_name)) 175 | 176 | order_book = get_orderbook_from_symbol(symbol=instrument_name)( 177 | sym=instrument_name) 178 | 179 | start_time = dt.now(tz=TIMEZONE) 180 | LOGGER.info('Starting get_orderbook_snapshot_history() loop with %i ticks for %s' 181 | % (loop_length, query['ccy'])) 182 | 183 | # loop through all ticks returned from the Arctic Tick Store query.
184 | for count, tx in enumerate(tick_history.itertuples()): 185 | 186 | # periodically print number of steps completed 187 | if count % 250000 == 0: 188 | elapsed = (dt.now(tz=TIMEZONE) - start_time).seconds 189 | LOGGER.info('...completed %i loops in %i seconds' % (count, elapsed)) 190 | 191 | # convert to dictionary for processing 192 | tick = tx._asdict() 193 | 194 | # filter out bad ticks 195 | if 'type' not in tick: 196 | continue 197 | 198 | # flags for an order book reset 199 | if tick['type'] in tick_types_for_warm_up: 200 | order_book.new_tick(msg=tick) 201 | continue 202 | 203 | # check if the LOB is pre-loaded, if not skip message and do NOT process. 204 | if order_book.done_warming_up is False: 205 | LOGGER.info( 206 | "{} order book is not done warming up: {}".format( 207 | instrument_name, tick)) 208 | continue 209 | 210 | # timestamp for incoming tick 211 | new_tick_time = parse(tick.get('system_time')) 212 | 213 | # remove ticks without timestamps (should not exist/happen) 214 | if new_tick_time is None: 215 | LOGGER.info('No tick time: {}'.format(tick)) 216 | continue 217 | 218 | # initialize the LOB snapshot timer 219 | if last_snapshot_time is None: 220 | # process first ticks and check if they're stale ticks; if so, 221 | # skip to the next loop. 222 | order_book.new_tick(tick) 223 | 224 | last_tick_time = order_book.last_tick_time 225 | if last_tick_time is None: 226 | continue 227 | 228 | last_tick_time_dt = parse(last_tick_time) 229 | last_snapshot_time = last_tick_time_dt 230 | LOGGER.info('{} first tick: {} '.format(order_book.sym, new_tick_time)) 231 | # skip to next loop 232 | continue 233 | 234 | # calculate the amount of time between the incoming 235 | # tick and the last LOB snapshot 236 | diff = self._get_microsecond_delta(new_tick_time, last_snapshot_time) 237 | 238 | # update the LOB, but do not take a LOB snapshot if the tick time is 239 | # out of sequence.
This occurs when pre-loading a LOB with stale tick 240 | # times in general. 241 | if diff == -1: 242 | order_book.new_tick(msg=tick) 243 | continue 244 | 245 | # derive the number of LOB snapshot insertions for the data buffer. 246 | multiple = diff // SNAPSHOT_RATE_IN_MICROSECONDS # 1000000 is 1 second 247 | 248 | # skip the snapshot if there are no insertions to make 249 | if multiple <= 0: 250 | order_book.new_tick(msg=tick) 251 | continue 252 | 253 | order_book_snapshot = order_book.render_book() 254 | for i in range(multiple): 255 | last_snapshot_time += timedelta( 256 | milliseconds=snapshot_interval_milliseconds) 257 | snapshot_list.append(np.hstack((last_snapshot_time, order_book_snapshot))) 258 | 259 | # update order book with most recent tick now, so the snapshots 260 | # are up to date for the next iteration of the loop. 261 | order_book.new_tick(msg=tick) 262 | continue 263 | 264 | elapsed = max((dt.now(tz=TIMEZONE) - start_time).seconds, 1) 265 | LOGGER.info('Completed get_orderbook_snapshot_history() with %i ticks in %i seconds ' 266 | 'at %i ticks/second' 267 | % (loop_length, elapsed, loop_length // elapsed)) 268 | 269 | orderbook_snapshot_history = pd.DataFrame( 270 | data=snapshot_list, 271 | columns=['system_time'] + order_book.render_lob_feature_names() 272 | ) 273 | 274 | # remove NAs from data set (and log the amount) 275 | before_shape = orderbook_snapshot_history.shape[0] 276 | orderbook_snapshot_history = orderbook_snapshot_history.dropna(axis=0) 277 | difference_in_records = orderbook_snapshot_history.shape[0] - before_shape 278 | LOGGER.info("{} {} rows due to NA values".format( 279 | 'Dropping' if difference_in_records <= 0 else 'Adding', 280 | abs(difference_in_records)) 281 | ) 282 | 283 | return orderbook_snapshot_history 284 | 285 | def extract_features(self, query: dict) -> None: 286 | """ 287 | Create and export limit order book data to csv.
This function 288 | exports multiple days of data and ensures each day starts and 289 | ends exactly on time. 290 | 291 | :param query: (dict) ccy=sym, daterange=(YYYYMMDD,YYYYMMDD) 292 | :return: void 293 | """ 294 | start_time = dt.now(tz=TIMEZONE) 295 | 296 | order_book_data = self.get_orderbook_snapshot_history(query=query) 297 | if order_book_data is not None: 298 | dates = order_book_data['system_time'].dt.date.unique() 299 | LOGGER.info('dates: {}'.format(dates)) 300 | for date in dates[:]: 301 | # for date in dates[1:]: 302 | tmp = order_book_data.loc[order_book_data['system_time'].dt.date == date] 303 | self.export_to_csv( 304 | tmp, filename='{}_{}'.format(query['ccy'][0], date), compress=True) 305 | 306 | elapsed = (dt.now(tz=TIMEZONE) - start_time).seconds 307 | LOGGER.info('***\nSimulator.extract_features() executed in %i seconds\n***' 308 | % elapsed) 309 | -------------------------------------------------------------------------------- /data_recorder/database/viz.py: -------------------------------------------------------------------------------- 1 | import matplotlib.cm as cm 2 | import numpy as np 3 | import pandas as pd 4 | from matplotlib import pyplot as plt 5 | 6 | 7 | def plot_lob_overlay(data: pd.DataFrame, window=1, levels=range(15)) -> None: 8 | """ 9 | Plot limit order book midpoint prices on the first axis, and the price levels 10 | within the LOB on a second axis, centered around the midpoint. 
11 | 12 | :param data: LOB snapshot export in form of a DataFrame 13 | :param window: rolling look-back period to smooth LOB price levels 14 | :param levels: a list of levels to render in the plot 15 | :return: (void) 16 | """ 17 | 18 | def ma(a, n=window): 19 | if isinstance(a, list) or isinstance(a, np.ndarray): 20 | a = pd.DataFrame(a) 21 | return a.rolling(n).mean().values 22 | 23 | midpoint_prices = data['midpoint'].values 24 | 25 | colors = cm.rainbow(np.linspace(0, 1, len(levels))) 26 | 27 | fig, ax1 = plt.subplots(figsize=(16, 6)) 28 | ax1.set_title("Midpoint Prices vs. Stationary Limit Order Book Levels") 29 | 30 | ax1.plot(midpoint_prices, label='midpoint', color='b') 31 | 32 | ax2 = ax1.twinx() 33 | 34 | for c, level in zip(colors, levels): 35 | ax2.plot(ma(data['bid_distance_{}'.format(level)].values, n=window), 36 | linestyle='--', label='Bid-{}'.format(level), color=c, alpha=0.5) 37 | ax2.plot(ma(data['ask_distance_{}'.format(level)].values, n=window), 38 | linestyle='--', label='Ask-{}'.format(level), color=c, alpha=0.5) 39 | 40 | ax2.axhline(0, color='y', linestyle='--') 41 | ax2.set_ylabel('Level Differences', color='y') 42 | ax2.tick_params('y', colors='y') 43 | 44 | ax1.set_xlabel("Time step (each tick is 1-second)") 45 | ax1.set_ylabel("Price (USD)") 46 | fig.legend() 47 | plt.show() 48 | 49 | 50 | def _get_transaction_plot_values(data: pd.DataFrame) -> ( 51 | np.ndarray, np.ndarray, pd.DataFrame): 52 | """ 53 | Helper function to prepare transaction data for plotting. 54 | 55 | :param data: LOB snapshot export in form of a DataFrame 56 | :return: bid, ask, and transaction data for plotting 57 | """ 58 | nbbo = data[['midpoint', 'bid_distance_0', 'ask_distance_0']].values 59 | bids = nbbo[:, 0] * (nbbo[:, 1] + 1.) 60 | asks = nbbo[:, 0] * (nbbo[:, 2] + 1.) 61 | 62 | transactions = data[['buys', 'sells']].copy() 63 | transactions /= transactions 64 | transactions = transactions.fillna(0.) 
65 | 66 | transactions['buys'] = asks * transactions['buys'].values 67 | transactions['sells'] = bids * transactions['sells'].values 68 | 69 | transactions.loc[ 70 | (transactions['buys'] == 0.), ['buys']] = np.nan 71 | transactions.loc[ 72 | (transactions['sells'] == 0.), ['sells']] = np.nan 73 | 74 | return bids, asks, transactions 75 | 76 | 77 | def plot_transactions(data: pd.DataFrame) -> None: 78 | """ 79 | Plot midpoint prices with buy and sell transactions dotted on the same plotting axis. 80 | 81 | :param data: LOB snapshot export in form of a DataFrame 82 | :return: (void) 83 | """ 84 | bids, asks, transactions = _get_transaction_plot_values(data) 85 | 86 | fig, ax1 = plt.subplots(figsize=(16, 6)) 87 | 88 | ax1.plot(bids, label='bids', color='g', alpha=0.5) 89 | ax1.plot(asks, label='asks', color='r', alpha=0.5) 90 | 91 | ax1.set_xlabel("Time step (each tick is 1-second)") 92 | ax1.set_ylabel("Price (USD)") 93 | ax1.set_title("Bid vs. Ask spread with Buy and Sell executions") 94 | 95 | x = list(range(transactions.shape[0])) 96 | ax1.scatter(x, transactions['buys'], label='buys', c='g') 97 | ax1.scatter(x, transactions['sells'], label='sells', c='r') 98 | 99 | fig.legend() 100 | plt.show() 101 | 102 | 103 | def plot_lob_levels(data: pd.DataFrame, window: int = 1, levels: list = range(1, 15), 104 | include_transactions: bool = True) -> None: 105 | """ 106 | Plot limit order book midpoint prices on the first axis, and the percentage 107 | distances for each price level within the LOB on a second axis, centered around the 108 | midpoint. 
109 | 110 | :param data: LOB snapshot export in form of a DataFrame 111 | :param window: rolling look-back period to smooth LOB price levels 112 | :param levels: a list of levels to render in the plot 113 | :param include_transactions: if TRUE, plot transactions 114 | :return: (void) 115 | """ 116 | 117 | def ma(a, n=window): 118 | if isinstance(a, list) or isinstance(a, np.ndarray): 119 | a = pd.DataFrame(a) 120 | return a.rolling(n).mean().values 121 | 122 | midpoint_prices = data['midpoint'].values 123 | bids, asks, transactions = _get_transaction_plot_values(data) 124 | colors = cm.rainbow(np.linspace(0, 1, len(levels))) 125 | 126 | fig, ax1 = plt.subplots(figsize=(16, 6)) 127 | ax1.set_title("Bid-Ask Prices vs. Fixed Limit Order Book Levels") 128 | 129 | ax1.plot(bids, label='bids', color='g', alpha=0.5) 130 | ax1.plot(asks, label='asks', color='r', alpha=0.5) 131 | 132 | for i, (c, level) in enumerate(zip(colors, levels)): 133 | bid_level_data = data['bid_distance_{}'.format(level)].values + 1 134 | ax1.plot(ma(bid_level_data * midpoint_prices, window), linestyle='--', 135 | label='Bid-{}'.format(level), color=c, alpha=0.5) 136 | ask_level_data = data['ask_distance_{}'.format(level)].values + 1 137 | ax1.plot(ma(ask_level_data * midpoint_prices, window), linestyle='--', 138 | label='Ask-{}'.format(level), color=c) 139 | 140 | if include_transactions: 141 | x = list(range(transactions.shape[0])) 142 | ax1.scatter(x, transactions['buys'], label='buys', c='g') 143 | ax1.scatter(x, transactions['sells'], label='sells', c='r') 144 | 145 | ax1.set_xlabel("Time step (each tick is 1-second)") 146 | ax1.set_ylabel("Price (USD)") 147 | fig.legend() 148 | plt.show() 149 | 150 | 151 | def plot_order_arrivals(data: pd.DataFrame, level: int = 0) -> None: 152 | """ 153 | Plot bid and ask prices with executions on the first axis, and order flow imbalance (OFI) on the second axis.
154 | 155 | :param data: LOB snapshot export in form of a DataFrame 156 | :param level: price level number to render in the plot 157 | :return: (void) 158 | """ 159 | fig, ax1 = plt.subplots(figsize=(16, 6)) 160 | 161 | bids, asks, transactions = _get_transaction_plot_values(data) 162 | ax1.plot(bids, label='bids', color='g', alpha=0.5) 163 | ax1.plot(asks, label='asks', color='r', alpha=0.5) 164 | x_axis = list(range(transactions.shape[0])) 165 | ax1.scatter(x_axis, transactions['buys'], label='buys', c='g') 166 | ax1.scatter(x_axis, transactions['sells'], label='sells', c='r') 167 | 168 | ax2 = ax1.twinx() 169 | cancel_arrivals = data['bid_cancel_notional_{}'.format(level)] - \ 170 | data['ask_cancel_notional_{}'.format(level)] 171 | limit_arrivals = data['bid_limit_notional_{}'.format(level)] - \ 172 | data['ask_limit_notional_{}'.format(level)] 173 | market_arrivals = data['bid_market_notional_{}'.format(level)] - \ 174 | data['ask_market_notional_{}'.format(level)] 175 | 176 | ofi = limit_arrivals - cancel_arrivals - market_arrivals 177 | ax2.bar(x_axis, ofi, label='ofi #{}'.format(level), color='orange', alpha=0.6) 178 | ax2.axhline(0., color='black', alpha=0.2, linestyle='-.') 179 | 180 | ax1.set_xlabel("Time step (each tick is 1-second)") 181 | ax1.set_ylabel("Price (USD)") 182 | ax2.set_ylabel("Notional Value (USD)") 183 | ax1.set_title("Midpoint price vs. Order Arrival Notional Values") 184 | 185 | fig.legend() 186 | plt.show() 187 | -------------------------------------------------------------------------------- /data_recorder/tests/test_bitfinex_client.py: -------------------------------------------------------------------------------- 1 | import asyncio 2 | 3 | from data_recorder.bitfinex_connector.bitfinex_client import BitfinexClient 4 | 5 | if __name__ == "__main__": 6 | """ 7 | This __main__ function is used for testing the 8 | BitfinexClient class in isolation.
9 | """ 10 | symbols = ['tBTCUSD'] # 'tETHUSD', 'tLTCUSD'] 11 | print('Initializing...%s' % symbols) 12 | loop = asyncio.get_event_loop() 13 | p = dict() 14 | 15 | for sym in symbols: 16 | p[sym] = BitfinexClient(sym=sym) 17 | p[sym].start() 18 | print('Started thread for %s' % sym) 19 | 20 | tasks = asyncio.gather(*[(p[sym].subscribe()) for sym in symbols]) 21 | print('Gathered %i tasks' % len(symbols)) 22 | 23 | try: 24 | loop.run_until_complete(tasks) 25 | for sym in symbols: 26 | p[sym].join() 27 | print('Closing [%s]' % p[sym].name) 28 | print('loop closed.') 29 | 30 | except KeyboardInterrupt as e: 31 | print("Caught keyboard interrupt. Canceling tasks...") 32 | tasks.cancel() 33 | loop.close() 34 | for sym in symbols: 35 | p[sym].join() 36 | print('Closing [%s]' % p[sym].name) 37 | 38 | finally: 39 | loop.close() 40 | print('\nFinally done.') 41 | -------------------------------------------------------------------------------- /data_recorder/tests/test_coinbase_client.py: -------------------------------------------------------------------------------- 1 | import asyncio 2 | 3 | from data_recorder.coinbase_connector.coinbase_client import CoinbaseClient 4 | 5 | if __name__ == "__main__": 6 | """ 7 | This __main__ function is used for testing the 8 | CoinbaseClient class in isolation. 9 | """ 10 | 11 | loop = asyncio.get_event_loop() 12 | symbols = ['BTC-USD'] # 'LTC-USD', 'ETH-USD'] 13 | p = dict() 14 | 15 | print('Initializing...%s' % symbols) 16 | for sym in symbols: 17 | p[sym] = CoinbaseClient(sym=sym) 18 | p[sym].start() 19 | 20 | tasks = asyncio.gather(*[(p[sym].subscribe()) for sym in symbols]) 21 | print('Gathered %i tasks' % len(symbols)) 22 | 23 | try: 24 | loop.run_until_complete(tasks) 25 | print('TASK are complete for {}'.format(symbols)) 26 | loop.close() 27 | for sym in symbols: 28 | p[sym].join() 29 | print('Closing [%s]' % p[sym].name) 30 | print('loop closed.') 31 | 32 | except KeyboardInterrupt as e: 33 | print("Caught keyboard interrupt. 
Canceling tasks...") 34 | tasks.cancel() 35 | loop.close() 36 | for sym in symbols: 37 | p[sym].join() 38 | print('Closing [%s]' % p[sym].name) 39 | 40 | finally: 41 | loop.close() 42 | print('\nFinally done.') 43 | -------------------------------------------------------------------------------- /data_recorder/tests/test_simulator.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime as dt 2 | 3 | from configurations import TIMEZONE 4 | from data_recorder.database.simulator import Simulator 5 | 6 | 7 | def test_get_tick_history() -> None: 8 | """ 9 | Test case to query Arctic TickStore 10 | """ 11 | start_time = dt.now(tz=TIMEZONE) 12 | 13 | sim = Simulator() 14 | query = { 15 | 'ccy': ['BTC-USD'], 16 | 'start_date': 20181231, 17 | 'end_date': 20190102 18 | } 19 | tick_history = sim.db.get_tick_history(query=query) 20 | print('\n{}\n'.format(tick_history)) 21 | 22 | elapsed = (dt.now(tz=TIMEZONE) - start_time).seconds 23 | print('Completed %s in %i seconds' % (__name__, elapsed)) 24 | print('DONE. EXITING %s' % __name__) 25 | 26 | 27 | def test_get_orderbook_snapshot_history() -> None: 28 | """ 29 | Test case to export testing/training data for reinforcement learning 30 | """ 31 | start_time = dt.now(tz=TIMEZONE) 32 | 33 | sim = Simulator() 34 | query = { 35 | 'ccy': ['LTC-USD'], 36 | 'start_date': 20190926, 37 | 'end_date': 20190928 38 | } 39 | orderbook_snapshot_history = sim.get_orderbook_snapshot_history(query=query) 40 | if orderbook_snapshot_history is None: 41 | print('Exiting: orderbook_snapshot_history is NONE') 42 | return 43 | 44 | filename = 'test_' + '{}_{}'.format(query['ccy'][0], query['start_date']) 45 | sim.export_to_csv(data=orderbook_snapshot_history, 46 | filename=filename, compress=False) 47 | 48 | elapsed = (dt.now(tz=TIMEZONE) - start_time).seconds 49 | print('Completed %s in %i seconds' % (__name__, elapsed)) 50 | print('DONE. 
EXITING %s' % __name__) 51 | 52 | 53 | def test_extract_features() -> None: 54 | """ 55 | Test case to export *multiple* testing/training data sets for reinforcement learning 56 | """ 57 | start_time = dt.now(tz=TIMEZONE) 58 | 59 | sim = Simulator() 60 | 61 | for ccy in ['ETH-USD']: 62 | # for ccy, ccy2 in [('LTC-USD', 'tLTCUSD')]: 63 | query = { 64 | 'ccy': [ccy], # ccy2], # parameter must be a list 65 | 'start_date': 20191208, # parameter format for dates 66 | 'end_date': 20191209, # parameter format for dates 67 | } 68 | sim.extract_features(query) 69 | 70 | elapsed = (dt.now(tz=TIMEZONE) - start_time).seconds 71 | print('Completed %s in %i seconds' % (__name__, elapsed)) 72 | print('DONE. EXITING %s' % __name__) 73 | 74 | 75 | if __name__ == '__main__': 76 | """ 77 | Entry point of tests application 78 | """ 79 | # test_get_tick_history() 80 | test_get_orderbook_snapshot_history() 81 | # test_extract_features() 82 | -------------------------------------------------------------------------------- /design_patterns/design-pattern-high-level.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sadighian/crypto-rl/078081e5715cadeae9c798a3d759c9d59d2041bc/design_patterns/design-pattern-high-level.PNG -------------------------------------------------------------------------------- /design_patterns/design-pattern.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sadighian/crypto-rl/078081e5715cadeae9c798a3d759c9d59d2041bc/design_patterns/design-pattern.png -------------------------------------------------------------------------------- /design_patterns/plot_lob_levels.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sadighian/crypto-rl/078081e5715cadeae9c798a3d759c9d59d2041bc/design_patterns/plot_lob_levels.png 
-------------------------------------------------------------------------------- /design_patterns/plot_lob_overlay.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sadighian/crypto-rl/078081e5715cadeae9c798a3d759c9d59d2041bc/design_patterns/plot_lob_overlay.png -------------------------------------------------------------------------------- /design_patterns/plot_order_arrivals.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sadighian/crypto-rl/078081e5715cadeae9c798a3d759c9d59d2041bc/design_patterns/plot_order_arrivals.png -------------------------------------------------------------------------------- /design_patterns/plot_transactions.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sadighian/crypto-rl/078081e5715cadeae9c798a3d759c9d59d2041bc/design_patterns/plot_transactions.png -------------------------------------------------------------------------------- /experiment.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | from agent.dqn import Agent 4 | from configurations import LOGGER 5 | 6 | parser = argparse.ArgumentParser() 7 | parser.add_argument('--window_size', 8 | default=100, 9 | help="Number of lags to include in the observation", 10 | type=int) 11 | parser.add_argument('--max_position', 12 | default=5, 13 | help="Maximum number of positions that are " + 14 | "able to be held in a broker's inventory", 15 | type=int) 16 | parser.add_argument('--fitting_file', 17 | default='demo_LTC-USD_20190926.csv.xz', 18 | help="Data set for fitting the z-score scaler (previous day)", 19 | type=str) 20 | parser.add_argument('--testing_file', 21 | default='demo_LTC-USD_20190926.csv.xz', 22 | help="Data set for training the agent (current day)", 23 | type=str) 24 | parser.add_argument('--symbol', 25 
| default='LTC-USD', 26 | help="Name of currency pair or instrument", 27 | type=str) 28 | parser.add_argument('--id', 29 | # default='market-maker-v0', 30 | default='trend-following-v0', 31 | help="Environment ID; Either 'trend-following-v0' or " 32 | "'market-maker-v0'", 33 | type=str) 34 | parser.add_argument('--number_of_training_steps', 35 | default=1e5, 36 | help="Number of steps to train the agent " 37 | "(does not include action repeats)", 38 | type=int) 39 | parser.add_argument('--gamma', 40 | default=0.99, 41 | help="Discount for future rewards", 42 | type=float) 43 | parser.add_argument('--seed', 44 | default=1, 45 | help="Random number seed for data set", 46 | type=int) 47 | parser.add_argument('--action_repeats', 48 | default=5, 49 | help="Number of steps to pass on between actions", 50 | type=int) 51 | parser.add_argument('--load_weights', 52 | default=False, 53 | help="Load saved load_weights if TRUE, otherwise start from scratch", 54 | type=bool) 55 | parser.add_argument('--visualize', 56 | default=False, 57 | help="Render midpoint on a screen", 58 | type=bool) 59 | parser.add_argument('--training', 60 | default=True, 61 | help="Training or testing mode. 
" + 62 | "If TRUE, then agent starts learning, " + 63 | "If FALSE, then agent is tested", 64 | type=bool) 65 | parser.add_argument('--reward_type', 66 | default='default', 67 | choices=['default', 68 | 'default_with_fills', 69 | 'realized_pnl', 70 | 'differential_sharpe_ratio', 71 | 'asymmetrical', 72 | 'trade_completion'], 73 | help=""" 74 | reward_type: method for calculating the environment's reward: 75 | 1) 'default' --> inventory count * change in midpoint price returns 76 | 2) 'default_with_fills' --> inventory count * change in midpoint 77 | price returns + closed trade PnL 78 | 3) 'realized_pnl' --> change in realized pnl between time steps 79 | 4) 'differential_sharpe_ratio' --> 80 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.7210&rep=rep1 81 | &type=pdf 82 | 5) 'asymmetrical' --> extended version of *default* and enhanced 83 | with a reward for being filled above or below midpoint, 84 | and returns only negative rewards for Unrealized PnL to discourage 85 | long-term speculation. 86 | 6) 'trade_completion' --> reward is generated per trade's round trip 87 | """, 88 | type=str) 89 | parser.add_argument('--nn_type', 90 | default='cnn', 91 | help="Type of neural network to use: 'cnn' or 'mlp' ", 92 | type=str) 93 | parser.add_argument('--dueling_network', 94 | default=True, 95 | help="If TRUE, use Dueling architecture in DQN", 96 | type=bool) 97 | parser.add_argument('--double_dqn', 98 | default=True, 99 | help="If TRUE, use double DQN for Q-value estimation", 100 | type=bool) 101 | args = vars(parser.parse_args()) 102 | 103 | 104 | def main(kwargs): 105 | LOGGER.info(f'Experiment creating agent with kwargs: {kwargs}') 106 | agent = Agent(**kwargs) 107 | LOGGER.info(f'Agent created. 
{agent}') 108 | agent.start() 109 | 110 | 111 | if __name__ == '__main__': 112 | main(kwargs=args) 113 | -------------------------------------------------------------------------------- /gym_trading/README.md: -------------------------------------------------------------------------------- 1 | # GYM_TRADING 2 | As of December 12, 2019. 3 | 4 | ## Overview 5 | This package is my implementation of an HFT environment extending a 6 | POMDP framework from OpenAI. 7 | 8 | ## 1. Envs 9 | Module containing various environment implementations. 10 | - `trend_following.py` implementation where the agent uses **market orders** 11 | to trade cryptocurrencies 12 | - `market_maker.py` implementation where the agent uses **limit orders** 13 | to trade cryptocurrencies 14 | 15 | ## 2. Utils 16 | Module containing utility classes for the `gym_trading` module. 17 | - `broker.py`, `position.py`, and `order.py` manage orders & 18 | executions, position inventories, and PnL calculations for `envs` 19 | - `render_env.py` renders midpoint price data as the agent steps 20 | through the environment 21 | - `plot_history.py` renders environment observations and PnL 22 | - `reward.py` contains the reward functions 23 | - `statistic.py` contains trackers for risk, rewards, etc. 24 | 25 | ## 3. Tests 26 | Module containing test cases for `gym_trading`'s modules.
27 | -------------------------------------------------------------------------------- /gym_trading/__init__.py: -------------------------------------------------------------------------------- 1 | from gym.envs.registration import register 2 | 3 | from gym_trading.envs.market_maker import MarketMaker 4 | from gym_trading.envs.trend_following import TrendFollowing 5 | 6 | register( 7 | id=TrendFollowing.id, 8 | entry_point='gym_trading.envs:TrendFollowing', 9 | max_episode_steps=1000000, 10 | nondeterministic=False 11 | ) 12 | 13 | register( 14 | id=MarketMaker.id, 15 | entry_point='gym_trading.envs:MarketMaker', 16 | max_episode_steps=1000000, 17 | nondeterministic=False 18 | ) 19 | -------------------------------------------------------------------------------- /gym_trading/envs/README.md: -------------------------------------------------------------------------------- 1 | # Environments 2 | As of December 12, 2019. 3 | 4 | ## 1. Overview 5 | Each Python file extends OpenAI's Gym module, 6 | with all mandatory abstract methods implemented. 7 | 8 | ## 2.
Environments 9 | 10 | ### 2.1 base_environment.py 11 | - Base class environment to serve as a platform for extending 12 | - Includes the functionality shared by all environments: (1) data 13 | loading and pre-processing, (2) broker to act as counter-party, and (3) 14 | other attributes 15 | - `rewards` can be derived via several different approaches: 16 | 1) 'default' --> inventory count * change in midpoint price returns 17 | 2) 'default_with_fills' --> inventory count * change in midpoint price returns + closed trade 18 | PnL 19 | 3) 'realized_pnl' --> change in realized pnl between time steps 20 | 4) 'differential_sharpe_ratio' --> 21 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.7210&rep=rep1&type=pdf 22 | 5) 'asymmetrical' --> extended version of *default* and enhanced with a 23 | reward for being filled above or below midpoint, and returns only 24 | negative rewards for Unrealized PnL to discourage long-term 25 | speculation. 26 | 6) 'trade_completion' --> reward is generated per trade's round trip 27 | 28 | - `observation space` is normalized via z-score; values beyond +/-10 are clipped.
29 | - The position management and PnL calculator are handled by the 30 | `../utils/broker.py` class in FIFO order 31 | - The historical data is loaded using `../gym_trading/utils/data_pipeline.py` 32 | class 33 | 34 | ### 2.2 trend_following.py 35 | - This environment is designed for MARKET orders only with the objective 36 | of identifying a "price jump" or directional movement 37 | - Rewards in this environment are realized PnL in FIFO order 38 | (i.e., current midpoint) 39 | - The `../agent/dqn.py` Agent uses this class 40 | 41 | ### 2.3 market_maker.py 42 | - This environment is designed for LIMIT orders only with the objective 43 | of profiting from market making 44 | - Rewards in this environment are realized PnL in FIFO order 45 | - If there are partial executions, the average execution price is used 46 | to determine PnL 47 | - The `../agent/dqn.py` Agent uses this class -------------------------------------------------------------------------------- /gym_trading/envs/__init__.py: -------------------------------------------------------------------------------- 1 | from gym_trading.envs.market_maker import MarketMaker 2 | from gym_trading.envs.trend_following import TrendFollowing 3 | 4 | 5 | def test_env_loop(env) -> bool: 6 | """ 7 | Evaluate a RL agent 8 | """ 9 | total_reward = 0.0 10 | reward_list = [] 11 | actions_tracker = dict() 12 | 13 | i = 0 14 | done = False 15 | env.reset() 16 | while not done: 17 | i += 1 18 | 19 | action = env.action_space.sample() 20 | 21 | state, reward, done, _ = env.step(action) 22 | total_reward += reward 23 | reward_list.append(reward) 24 | 25 | if action in actions_tracker: 26 | actions_tracker[action] += 1 27 | else: 28 | actions_tracker[action] = 1 29 | 30 | if done: 31 | print(f"Max reward: {max(reward_list)}\nMin reward: {min(reward_list)}") 32 | print(f"Agent completed {env.broker.total_trade_count} trades") 33 | # Visualize results 34 | env.plot_observation_history() 35 |
env.plot_trade_history() 36 | break 37 | 38 | print(f"Total reward: {total_reward}") 39 | for action, count in actions_tracker.items(): 40 | print(f"Action #{action} =\t{count}") 41 | 42 | return done 43 | -------------------------------------------------------------------------------- /gym_trading/envs/market_maker.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from gym import spaces 3 | from typing import Tuple 4 | 5 | from configurations import ENCOURAGEMENT 6 | from gym_trading.envs.base_environment import BaseEnvironment 7 | from gym_trading.utils.order import LimitOrder 8 | 9 | 10 | class MarketMaker(BaseEnvironment): 11 | id = 'market-maker-v0' 12 | description = "Environment where limit orders are tethered to LOB price levels" 13 | 14 | def __init__(self, **kwargs): 15 | """ 16 | Environment designed for automated market making. 17 | 18 | :param kwargs: refer to BaseEnvironment.py 19 | """ 20 | super().__init__(**kwargs) 21 | 22 | # Environment attributes to override in sub-class 23 | self.actions = np.eye(17, dtype=np.float32) 24 | 25 | self.action_space = spaces.Discrete(len(self.actions)) 26 | self.observation = self.reset() # Reset to load observation.shape 27 | self.observation_space = spaces.Box(low=-10., high=10., 28 | shape=self.observation.shape, 29 | dtype=np.float32) 30 | 31 | # Add the remaining labels for the observation space 32 | self.viz.observation_labels += ['Long Dist', 'Short Dist', 33 | 'Bid Completion Ratio', 'Ask Completion Ratio'] 34 | self.viz.observation_labels += [f'Action #{a}' for a in range(len(self.actions))] 35 | self.viz.observation_labels += ['Reward'] 36 | 37 | print('{} {} #{} instantiated\nobservation_space: {}'.format( 38 | MarketMaker.id, self.symbol, self._seed, self.observation_space.shape), 39 | 'reward_type = {}'.format(self.reward_type.upper()), 'max_steps = {}'.format( 40 | self.max_steps)) 41 | 42 | def __str__(self): 43 | return '{} | 
{}-{}'.format(MarketMaker.id, self.symbol, self._seed) 44 | 45 | def map_action_to_broker(self, action: int) -> Tuple[float, float]: 46 | """ 47 | Create or adjust orders per a specified action and adjust for penalties. 48 | 49 | :param action: (int) current step's action 50 | :return: (Tuple[float, float]) action penalty and realized PnL 51 | """ 52 | action_penalty = pnl = 0.0 53 | 54 | if action == 0: # do nothing 55 | action_penalty += ENCOURAGEMENT 56 | 57 | elif action == 1: 58 | action_penalty += self._create_order_at_level(level=0, side='long') 59 | action_penalty += self._create_order_at_level(level=4, side='short') 60 | 61 | elif action == 2: 62 | action_penalty += self._create_order_at_level(level=0, side='long') 63 | action_penalty += self._create_order_at_level(level=9, side='short') 64 | 65 | elif action == 3: 66 | action_penalty += self._create_order_at_level(level=0, side='long') 67 | action_penalty += self._create_order_at_level(level=14, side='short') 68 | 69 | elif action == 4: 70 | action_penalty += self._create_order_at_level(level=4, side='long') 71 | action_penalty += self._create_order_at_level(level=0, side='short') 72 | 73 | elif action == 5: 74 | action_penalty += self._create_order_at_level(level=4, side='long') 75 | action_penalty += self._create_order_at_level(level=4, side='short') 76 | 77 | elif action == 6: 78 | action_penalty += self._create_order_at_level(level=4, side='long') 79 | action_penalty += self._create_order_at_level(level=9, side='short') 80 | 81 | elif action == 7: 82 | action_penalty += self._create_order_at_level(level=4, side='long') 83 | action_penalty += self._create_order_at_level(level=14, side='short') 84 | 85 | elif action == 8: 86 | action_penalty += self._create_order_at_level(level=9, side='long') 87 | action_penalty += self._create_order_at_level(level=0, side='short') 88 | 89 | elif action == 9: 90 | action_penalty += self._create_order_at_level(level=9, side='long') 91 | action_penalty += self._create_order_at_level(level=4, side='short') 92
| 93 | elif action == 10: 94 | action_penalty += self._create_order_at_level(level=9, side='long') 95 | action_penalty += self._create_order_at_level(level=9, side='short') 96 | 97 | elif action == 11: 98 | action_penalty += self._create_order_at_level(level=9, side='long') 99 | action_penalty += self._create_order_at_level(level=14, side='short') 100 | 101 | elif action == 12: 102 | action_penalty += self._create_order_at_level(level=14, side='long') 103 | action_penalty += self._create_order_at_level(level=0, side='short') 104 | 105 | elif action == 13: 106 | action_penalty += self._create_order_at_level(level=14, side='long') 107 | action_penalty += self._create_order_at_level(level=4, side='short') 108 | 109 | elif action == 14: 110 | action_penalty += self._create_order_at_level(level=14, side='long') 111 | action_penalty += self._create_order_at_level(level=9, side='short') 112 | 113 | elif action == 15: 114 | action_penalty += self._create_order_at_level(level=14, side='long') 115 | action_penalty += self._create_order_at_level(level=14, side='short') 116 | 117 | elif action == 16: 118 | pnl += self.broker.flatten_inventory(self.best_bid, self.best_ask) 119 | 120 | else: 121 | raise ValueError("Unknown action: {}. Valid actions are 0 through 16.".format(action)) 122 | 123 | return action_penalty, pnl 124 | 125 | def _create_position_features(self) -> np.ndarray: 126 | """ 127 | Create an array with features related to the agent's inventory. 
128 | 129 | :return: (np.array) normalized position features 130 | """ 131 | return np.array((self.broker.net_inventory_count / self.max_position, 132 | self.broker.realized_pnl * self.broker.pct_scale, 133 | self.broker.get_unrealized_pnl(self.best_bid, self.best_ask) 134 | * self.broker.pct_scale, 135 | self.broker.get_long_order_distance_to_midpoint( 136 | midpoint=self.midpoint) * self.broker.pct_scale, 137 | self.broker.get_short_order_distance_to_midpoint( 138 | midpoint=self.midpoint) * self.broker.pct_scale, 139 | *self.broker.get_queues_ahead_features()), dtype=np.float32) 140 | 141 | def _create_order_at_level(self, level: int, side: str) -> float: 142 | """ 143 | Create a new order at a specified LOB level. 144 | 145 | :param level: (int) level in the limit order book 146 | :param side: (str) direction of trade e.g., 'long' or 'short' 147 | :return: (float) reward with penalties added 148 | """ 149 | reward = 0.0 150 | if side == 'long': 151 | notional_index = self.notional_bid_index 152 | price_index = self.best_bid_index 153 | elif side == 'short': 154 | notional_index = self.notional_ask_index 155 | price_index = self.best_ask_index 156 | else: 157 | raise ValueError("Unknown order side: {}".format(side)) 158 | # get price data from numpy array 159 | price_level_price = self._get_book_data(index=price_index + level) 160 | # convert the relative distance from the midpoint into an absolute price 161 | price_level_price = round(self.midpoint * (price_level_price + 1.), 2) 162 | price_level_queue = self._get_book_data(index=notional_index + level) 163 | # create a new order 164 | order = LimitOrder(ccy=self.symbol, 165 | side=side, 166 | price=price_level_price, 167 | step=self.local_step_number, 168 | queue_ahead=price_level_queue) 169 | # add a penalty or encouragement, depending on whether the order is accepted 170 | if self.broker.add(order=order) is False: 171 | reward -= ENCOURAGEMENT 172 | else: 173 | reward += ENCOURAGEMENT 174 | return reward 175 | 
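Note on `map_action_to_broker` in `market_maker.py`: actions 1 through 15 enumerate (long level, short level) pairs drawn from LOB levels 0, 4, 9 and 14. The (0, 0) pair is skipped because its slot belongs to the do-nothing action, and action 16 flattens inventory. The sketch below expresses the same mapping as a lookup table. It is only an illustration: `LEVELS`, `LEVEL_PAIRS` and `action_to_levels` are hypothetical names that do not exist in the repository.

```python
from itertools import product

# Hypothetical names for illustration; not part of the repository.
LEVELS = (0, 4, 9, 14)
# Cartesian product in row-major order: (0, 0), (0, 4), ..., (14, 14).
LEVEL_PAIRS = list(product(LEVELS, LEVELS))


def action_to_levels(action):
    """Return (long_level, short_level) for quoting actions 1..15.

    Returns None for action 0 (do nothing) and action 16 (flatten inventory),
    which place no limit orders.
    """
    if 1 <= action <= 15:
        return LEVEL_PAIRS[action]
    if action in (0, 16):
        return None
    raise ValueError("Unknown action: {}".format(action))


if __name__ == "__main__":
    for a in range(17):
        print(a, action_to_levels(a))
```

Under this reading, the if/elif chain and the table agree for every action, so the table can serve as a quick reference when inspecting agent behaviour.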
-------------------------------------------------------------------------------- /gym_trading/envs/trend_following.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from gym import spaces 3 | from typing import Tuple 4 | 5 | from configurations import ENCOURAGEMENT, MARKET_ORDER_FEE 6 | from gym_trading.envs.base_environment import BaseEnvironment 7 | from gym_trading.utils.order import MarketOrder 8 | 9 | 10 | class TrendFollowing(BaseEnvironment): 11 | id = 'trend-following-v0' 12 | description = "Environment where agent can select market orders only" 13 | 14 | def __init__(self, **kwargs): 15 | """ 16 | Environment designed to trade price jumps using market orders. 17 | 18 | :param kwargs: refer to BaseEnvironment.py 19 | """ 20 | super().__init__(**kwargs) 21 | 22 | # Environment attributes to override in sub-class 23 | self.actions = np.eye(3, dtype=np.float32) 24 | 25 | self.action_space = spaces.Discrete(len(self.actions)) 26 | self.reset() # Reset to load observation.shape 27 | self.observation_space = spaces.Box(low=-10., high=10., 28 | shape=self.observation.shape, 29 | dtype=np.float32) 30 | 31 | # Add the remaining labels for the observation space 32 | self.viz.observation_labels += [f'Action #{a}' for a in range(len(self.actions))] 33 | self.viz.observation_labels += ['Reward'] 34 | 35 | print('{} {} #{} instantiated\nobservation_space: {}'.format( 36 | TrendFollowing.id, self.symbol, self._seed, self.observation_space.shape), 37 | 'reward_type = {}'.format(self.reward_type.upper()), 'max_steps = {}'.format( 38 | self.max_steps)) 39 | 40 | def __str__(self): 41 | return '{} | {}-{}'.format(TrendFollowing.id, self.symbol, self._seed) 42 | 43 | def map_action_to_broker(self, action: int) -> Tuple[float, float]: 44 | """ 45 | Create or adjust orders per a specified action and adjust for penalties. 
46 | 47 | :param action: (int) current step's action 48 | :return: (float) reward 49 | """ 50 | action_penalty_reward = pnl = 0.0 51 | 52 | if action == 0: # do nothing 53 | action_penalty_reward += ENCOURAGEMENT 54 | 55 | elif action == 1: # buy 56 | # Deduct transaction costs 57 | if self.broker.transaction_fee: 58 | pnl -= MARKET_ORDER_FEE 59 | 60 | if self.broker.short_inventory_count > 0: 61 | # Net out existing position 62 | order = MarketOrder(ccy=self.symbol, side='short', price=self.best_ask, 63 | step=self.local_step_number) 64 | pnl += self.broker.remove(order=order) 65 | 66 | elif self.broker.long_inventory_count >= 0: 67 | order = MarketOrder(ccy=self.symbol, side='long', price=self.best_ask, 68 | step=self.local_step_number) 69 | if self.broker.add(order=order) is False: 70 | action_penalty_reward -= ENCOURAGEMENT 71 | 72 | else: 73 | raise ValueError(('gym_trading.get_reward() Error for action #{} - ' 74 | 'unable to place an order with broker').format(action)) 75 | 76 | elif action == 2: # sell 77 | # Deduct transaction costs 78 | if self.broker.transaction_fee: 79 | pnl -= MARKET_ORDER_FEE 80 | 81 | if self.broker.long_inventory_count > 0: 82 | # Net out existing position 83 | order = MarketOrder(ccy=self.symbol, side='long', price=self.best_bid, 84 | step=self.local_step_number) 85 | pnl += self.broker.remove(order=order) 86 | 87 | elif self.broker.short_inventory_count >= 0: 88 | order = MarketOrder(ccy=self.symbol, side='short', price=self.best_bid, 89 | step=self.local_step_number) 90 | if self.broker.add(order=order) is False: 91 | action_penalty_reward -= ENCOURAGEMENT 92 | 93 | else: 94 | raise ValueError(('gym_trading.get_reward() Error for action #{} - ' 95 | 'unable to place an order with broker').format(action)) 96 | 97 | else: 98 | raise ValueError(('Unknown action to take in get_reward(): ' 99 | 'action={} | midpoint={}').format(action, self.midpoint)) 100 | 101 | return action_penalty_reward, pnl 102 | 103 | def 
_create_position_features(self) -> np.ndarray: 104 | """ 105 | Create an array with features related to the agent's inventory. 106 | 107 | :return: (np.array) normalized position features 108 | """ 109 | return np.array((self.broker.net_inventory_count / self.max_position, 110 | self.broker.realized_pnl * self.broker.pct_scale, 111 | self.broker.get_unrealized_pnl(self.best_bid, self.best_ask) 112 | * self.broker.pct_scale), 113 | dtype=np.float32) 114 | -------------------------------------------------------------------------------- /gym_trading/tests/test_market_maker.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | import gym 4 | 5 | import gym_trading 6 | from gym_trading.utils.decorator import print_time 7 | 8 | 9 | class MarketMakerTestCases(unittest.TestCase): 10 | 11 | @print_time 12 | def test_time_event_env(self): 13 | config = dict( 14 | id=gym_trading.envs.MarketMaker.id, 15 | symbol='LTC-USD', 16 | fitting_file='demo_LTC-USD_20190926.csv.xz', 17 | testing_file='demo_LTC-USD_20190926.csv.xz', 18 | max_position=10, 19 | window_size=5, 20 | seed=1, 21 | action_repeats=5, 22 | training=False, 23 | format_3d=True, 24 | reward_type='default', 25 | ema_alpha=None, 26 | ) 27 | print(f"**********\n{config}\n**********") 28 | 29 | env = gym.make(**config) 30 | done = gym_trading.envs.test_env_loop(env=env) 31 | _ = env.reset() 32 | self.assertEqual(True, done) 33 | 34 | 35 | if __name__ == '__main__': 36 | unittest.main() 37 | -------------------------------------------------------------------------------- /gym_trading/tests/test_trend_following.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | import gym 4 | 5 | import gym_trading 6 | from gym_trading.utils.decorator import print_time 7 | 8 | 9 | class TrendFollowingTestCases(unittest.TestCase): 10 | 11 | @print_time 12 | def test_time_event_env(self): 13 | config = dict( 14 | 
id=gym_trading.envs.TrendFollowing.id, 15 | symbol='LTC-USD', 16 | fitting_file='demo_LTC-USD_20190926.csv.xz', 17 | testing_file='demo_LTC-USD_20190926.csv.xz', 18 | max_position=10, 19 | window_size=5, 20 | seed=1, 21 | action_repeats=5, 22 | training=False, 23 | format_3d=True, 24 | reward_type='default', 25 | ema_alpha=None, 26 | ) 27 | print(f"**********\n{config}\n**********") 28 | 29 | env = gym.make(**config) 30 | done = gym_trading.envs.test_env_loop(env=env) 31 | _ = env.reset() 32 | self.assertEqual(True, done) 33 | 34 | 35 | if __name__ == '__main__': 36 | unittest.main() 37 | -------------------------------------------------------------------------------- /gym_trading/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from gym_trading.utils.broker import Broker 2 | from gym_trading.utils.data_pipeline import DataPipeline 3 | from gym_trading.utils.order import LimitOrder, MarketOrder 4 | from gym_trading.utils.plot_history import Visualize 5 | from gym_trading.utils.reward import ( 6 | asymmetrical, default, default_with_fills, 7 | differential_sharpe_ratio, realized_pnl, trade_completion, 8 | ) 9 | from gym_trading.utils.statistic import ExperimentStatistics, TradeStatistics 10 | -------------------------------------------------------------------------------- /gym_trading/utils/data_pipeline.py: -------------------------------------------------------------------------------- 1 | import os 2 | from datetime import datetime as dt 3 | 4 | import numpy as np 5 | import pandas as pd 6 | from sklearn.preprocessing import StandardScaler 7 | 8 | from configurations import DATA_PATH, EMA_ALPHA, LOGGER, MAX_BOOK_ROWS, TIMEZONE 9 | from indicators import apply_ema_all_data, load_ema, reset_ema 10 | 11 | 12 | class DataPipeline(object): 13 | 14 | def __init__(self, alpha: float or list or None = EMA_ALPHA): 15 | """ 16 | Data Pipeline constructor. 
17 | """ 18 | self.alpha = alpha 19 | self.ema = load_ema(alpha=alpha) 20 | self._scaler = StandardScaler() 21 | 22 | def reset(self) -> None: 23 | """ 24 | Reset data pipeline. 25 | """ 26 | self._scaler = StandardScaler() 27 | self.ema = reset_ema(ema=self.ema) 28 | 29 | @staticmethod 30 | def import_csv(filename: str) -> pd.DataFrame: 31 | """ 32 | Import a historical tick file created from the export_to_csv() function. 33 | 34 | :param filename: Full file path including filename 35 | :return: (pd.DataFrame) historical limit order book data 36 | """ 37 | start_time = dt.now(tz=TIMEZONE) 38 | 39 | if 'xz' in filename: 40 | data = pd.read_csv(filepath_or_buffer=filename, index_col=0, 41 | compression='xz', engine='c') 42 | elif 'csv' in filename: 43 | data = pd.read_csv(filepath_or_buffer=filename, index_col=0, engine='c') 44 | else: 45 | LOGGER.warning('Error: file must be a csv or xz') 46 | data = None 47 | 48 | elapsed = (dt.now(tz=TIMEZONE) - start_time).seconds 49 | LOGGER.info('Imported %s from a csv in %i seconds' % (filename[-25:], elapsed)) 50 | return data 51 | 52 | def fit_scaler(self, orderbook_snapshot_history: pd.DataFrame) -> None: 53 | """ 54 | Fit the scaler to limit order book data for the neural network. 55 | 56 | :param orderbook_snapshot_history: Limit order book data 57 | from the previous day 58 | :return: (void) 59 | """ 60 | self._scaler.fit(orderbook_snapshot_history) 61 | 62 | def scale_data(self, data: pd.DataFrame) -> np.ndarray: 63 | """ 64 | Standardize data. 
65 | 66 | :param data: (np.array) all data in environment 67 | :return: (np.array) normalized observation space 68 | """ 69 | return self._scaler.transform(data) 70 | 71 | @staticmethod 72 | def _midpoint_diff(data: pd.DataFrame) -> pd.DataFrame: 73 | """ 74 | Take log difference of midpoint prices 75 | log(price t) - log(price t-1) 76 | 77 | :param data: (pd.DataFrame) raw data from LOB snapshots 78 | :return: (pd.DataFrame) with midpoint prices normalized 79 | """ 80 | data['midpoint'] = np.log(data['midpoint'].values) 81 | data['midpoint'] = (data['midpoint'] - data['midpoint'].shift(1) 82 | ).fillna(method='bfill') 83 | return data 84 | 85 | @staticmethod 86 | def _decompose_order_flow_information(data: pd.DataFrame) -> pd.DataFrame: 87 | """ 88 | Transform raw [market, limit, cancel] notional values into a single OFI. 89 | 90 | :param data: snapshot data imported from `self.import_csv` 91 | :return: LOB data with OFI 92 | """ 93 | # Derive column names for filtering OFI data 94 | event_columns = dict() 95 | for event_type in ['market_notional', 'limit_notional', 'cancel_notional']: 96 | event_columns[event_type] = [col for col in data.columns.tolist() if 97 | event_type in col] 98 | 99 | # Derive the number of rows that have been rendered in the LOB 100 | number_of_levels = len(event_columns['market_notional']) // 2 101 | 102 | # Calculate OFI = LIMIT - MARKET - CANCEL 103 | ofi_data = data[event_columns['limit_notional']].values - \ 104 | data[event_columns['market_notional']].values - \ 105 | data[event_columns['cancel_notional']].values 106 | 107 | # Convert numpy to DataFrame 108 | ofi_data = pd.DataFrame(data=ofi_data, 109 | columns=[f'ofi_bid_{i}' for i in range(number_of_levels)] + 110 | [f'ofi_ask_{i}' for i in range(number_of_levels)], 111 | index=data.index) 112 | 113 | # Merge with original data set 114 | data = pd.concat((data, ofi_data), axis=1) 115 | 116 | # Drop MARKET, LIMIT, and CANCEL columns from original data set 117 | for event_type in 
{'market_notional', 'limit_notional', 'cancel_notional'}: 118 | data = data.drop(event_columns[event_type], axis=1) 119 | 120 | return data 121 | 122 | @staticmethod 123 | def get_imbalance_labels() -> list: 124 | """ 125 | Get a list of column labels for notional order imbalances. 126 | """ 127 | imbalance_labels = [f'notional_imbalance_{row}' for row in range(MAX_BOOK_ROWS)] 128 | imbalance_labels += ['notional_imbalance_mean'] 129 | return imbalance_labels 130 | 131 | @staticmethod 132 | def _get_notional_imbalance(data: pd.DataFrame) -> pd.DataFrame: 133 | """ 134 | Calculate order imbalances per price level, their mean & standard deviation. 135 | 136 | Order Imbalances are calculated by: 137 | = (bid_quantity - ask_quantity) / (bid_quantity + ask_quantity) 138 | 139 | ...thus scale from [-1, 1]. 140 | 141 | :param data: raw/un-normalized LOB snapshot data 142 | :return: (pd.DataFrame) order imbalances at N-levels, the mean & std imbalance 143 | """ 144 | # Create the column names for making a data frame (also used for debugging) 145 | bid_notional_columns, ask_notional_columns, imbalance_columns = [], [], [] 146 | for i in range(MAX_BOOK_ROWS): 147 | bid_notional_columns.append(f'bids_notional_{i}') 148 | ask_notional_columns.append(f'asks_notional_{i}') 149 | imbalance_columns.append(f'notional_imbalance_{i}') 150 | # Acquire bid and ask notional data 151 | # Reverse the bids to ascending order, so that they align with the asks 152 | bid_notional = data[bid_notional_columns].to_numpy(dtype=np.float32) # [::-1] 153 | ask_notional = data[ask_notional_columns].to_numpy(dtype=np.float32) 154 | 155 | # Transform to cumulative imbalances 156 | bid_notional = np.cumsum(bid_notional, axis=1) 157 | ask_notional = np.cumsum(ask_notional, axis=1) 158 | 159 | # Calculate the order imbalance 160 | imbalances = ((bid_notional - ask_notional) + 1e-5) / \ 161 | ((bid_notional + ask_notional) + 1e-5) 162 | imbalances = pd.DataFrame(imbalances, columns=imbalance_columns, 163 | 
index=data.index).fillna(0.) 164 | # Add meta data to features (mean) 165 | imbalances['notional_imbalance_mean'] = imbalances[imbalance_columns].mean(axis=1) 166 | return imbalances 167 | 168 | def load_environment_data(self, fitting_file: str, testing_file: str, 169 | include_imbalances: bool = True, as_pandas: bool = False) \ 170 | -> (pd.DataFrame, pd.DataFrame, pd.DataFrame): 171 | """ 172 | Import and scale environment data set with prior day's data. 173 | 174 | Midpoint gets log-normalized: 175 | log(price t) - log(price t-1) 176 | 177 | :param fitting_file: prior trading day 178 | :param testing_file: current trading day 179 | :param include_imbalances: if TRUE, include LOB imbalances 180 | :param as_pandas: if TRUE, return data as DataFrame, otherwise np.array 181 | :return: (pd.DataFrame or np.array) scaled environment data 182 | """ 183 | # Import data used to fit scaler 184 | fitting_data_filepath = os.path.join(DATA_PATH, fitting_file) 185 | fitting_data = self.import_csv(filename=fitting_data_filepath) 186 | 187 | # Derive OFI statistics 188 | fitting_data = self._decompose_order_flow_information(data=fitting_data) 189 | # Take the log difference of midpoint prices 190 | fitting_data = self._midpoint_diff(data=fitting_data) # normalize midpoint 191 | # If applicable, smooth data set with EMA(s) 192 | fitting_data = apply_ema_all_data(ema=self.ema, data=fitting_data) 193 | # Fit the scaler 194 | self.fit_scaler(fitting_data) 195 | # Delete data from memory 196 | del fitting_data 197 | 198 | # Import data to normalize and use in environment 199 | data_used_in_environment = os.path.join(DATA_PATH, testing_file) 200 | data = self.import_csv(filename=data_used_in_environment) 201 | 202 | # Raw midpoint prices for back-testing environment 203 | midpoint_prices = data['midpoint'] 204 | 205 | # Copy of raw LOB snapshots for normalization 206 | normalized_data = self._midpoint_diff(data.copy(deep=True)) 207 | 208 | # Preserve the raw data and drop unnecessary 
columns 209 | data = data.drop([col for col in data.columns.tolist() 210 | if col in ['market', 'limit', 'cancel']], axis=1) 211 | 212 | # Derive OFI statistics 213 | normalized_data = self._decompose_order_flow_information(data=normalized_data) 214 | 215 | normalized_data = apply_ema_all_data(ema=self.ema, data=normalized_data) 216 | 217 | # Get column names for putting the numpy values into a data frame 218 | column_names = normalized_data.columns.tolist() 219 | # Scale data with fitting data set 220 | normalized_data = self.scale_data(normalized_data) 221 | # Remove outliers 222 | normalized_data = np.clip(normalized_data, -10., 10.) 223 | # Put data in a data frame 224 | normalized_data = pd.DataFrame(normalized_data, 225 | columns=column_names, 226 | index=midpoint_prices.index) 227 | 228 | if include_imbalances: 229 | LOGGER.info('Adding order imbalances...') 230 | # Note: since order imbalance data is scaled [-1, 1], we do not apply 231 | # z-score to the imbalance data 232 | imbalance_data = self._get_notional_imbalance(data=data) 233 | self.ema = reset_ema(self.ema) 234 | imbalance_data = apply_ema_all_data(ema=self.ema, data=imbalance_data) 235 | normalized_data = pd.concat((normalized_data, imbalance_data), axis=1) 236 | 237 | if as_pandas is False: 238 | midpoint_prices = midpoint_prices.to_numpy(dtype=np.float64) 239 | data = data.to_numpy(dtype=np.float32) 240 | normalized_data = normalized_data.to_numpy(dtype=np.float32) 241 | 242 | return midpoint_prices, data, normalized_data 243 | -------------------------------------------------------------------------------- /gym_trading/utils/decorator.py: -------------------------------------------------------------------------------- 1 | import time 2 | from functools import wraps 3 | 4 | 5 | def debugging(func): 6 | """ 7 | Decorator for debugging inputs into a function. 
8 | :param func: function 9 | :return: function 10 | """ 11 | 12 | @wraps(func) 13 | def f(*args, **kwargs): 14 | if kwargs != {}: 15 | _kwargs = '\n'.join(["--{}\t=\t{}".format(k, v) for k, v in kwargs.items()]) 16 | else: 17 | _kwargs = 'None' 18 | print('-' * 50) 19 | print("{}(\nkwargs::\n {}\n)".format(func.__name__, _kwargs)) 20 | print('-' * 50) 21 | return func(*args, **kwargs) 22 | 23 | return f 24 | 25 | 26 | def print_time(func): 27 | """ 28 | Decorator for timing function execution duration. 29 | 30 | :param func: function 31 | :return: wrapped function 32 | """ 33 | 34 | @wraps(func) 35 | def f(*args, **kwargs): 36 | start_time = time.time() 37 | output = func(*args, **kwargs) 38 | elapsed = time.time() - start_time 39 | print("{} completed in {:.4f} seconds.".format(func.__name__, elapsed)) 40 | return output 41 | 42 | return f 43 | -------------------------------------------------------------------------------- /gym_trading/utils/order.py: -------------------------------------------------------------------------------- 1 | # order.py 2 | # 3 | # Market and Limit order implementations for {broker/position}.py 4 | # 5 | # 6 | from abc import ABC 7 | 8 | from configurations import LIMIT_ORDER_FEE, LOGGER, MARKET_ORDER_FEE 9 | 10 | 11 | class OrderMetrics(object): 12 | 13 | def __init__(self): 14 | """ 15 | Class for capturing order / position metrics 16 | """ 17 | self.drawdown_max = 0.0 18 | self.upside_max = 0.0 19 | self.steps_in_position = 0 20 | 21 | def __str__(self): 22 | return ('OrderMetrics: [ drawdown_max={} | upside_max={} | ' 23 | 'steps_in_position={} ]').format(self.drawdown_max, self.upside_max, 24 | self.steps_in_position) 25 | 26 | 27 | class Order(ABC): 28 | DEFAULT_SIZE = 1000. 
29 | _id = 0 30 | LIMIT_ORDER_FEE = LIMIT_ORDER_FEE * 2 31 | MARKET_ORDER_FEE = MARKET_ORDER_FEE * 2 32 | 33 | def __init__(self, price: float, step: int, average_execution_price: float, 34 | order_type='limit', ccy='BTC-USD', side='long', ): 35 | """ 36 | 37 | :param price: 38 | :param step: 39 | :param average_execution_price: 40 | :param order_type: 41 | :param ccy: 42 | :param side: 43 | """ 44 | self.order_type = order_type 45 | self.ccy = ccy 46 | self.side = side 47 | self.price = price 48 | self.step = step 49 | self.average_execution_price = average_execution_price 50 | self.metrics = OrderMetrics() 51 | self.executed = 0. 52 | self.queue_ahead = 0. 53 | self.executions = dict() 54 | Order._id += 1 55 | self.id = Order._id 56 | 57 | def __str__(self): 58 | return ' {} #{} | {} | {:.3f} | {} | {} | {}'.format( 59 | self.ccy, self.id, self.side, self.price, self.step, self.metrics, 60 | self.queue_ahead) 61 | 62 | @property 63 | def is_filled(self) -> bool: 64 | """ 65 | If TRUE, the entire order has been executed. 66 | 67 | :return: (bool) TRUE if the order is completely filled 68 | """ 69 | return self.executed >= Order.DEFAULT_SIZE 70 | 71 | def update_metrics(self, price: float, step: int) -> None: 72 | """ 73 | Update specific position metrics per each order. 
74 | 75 | :param price: (float) current midpoint price 76 | :param step: (int) current time step 77 | :return: (void) 78 | """ 79 | self.metrics.steps_in_position = step - self.step 80 | if self.is_filled: 81 | if self.side == 'long': 82 | unrealized_pnl = (price - self.average_execution_price) / \ 83 | self.average_execution_price 84 | elif self.side == 'short': 85 | unrealized_pnl = (self.average_execution_price - price) / \ 86 | self.average_execution_price 87 | else: 88 | unrealized_pnl = 0.0 89 | LOGGER.warning('alert: unknown order.step() side %s' % self.side) 90 | 91 | if unrealized_pnl < self.metrics.drawdown_max: 92 | self.metrics.drawdown_max = unrealized_pnl 93 | 94 | if unrealized_pnl > self.metrics.upside_max: 95 | self.metrics.upside_max = unrealized_pnl 96 | 97 | 98 | class MarketOrder(Order): 99 | def __init__(self, ccy='BTC-USD', side='long', price=0.0, step=-1): 100 | super(MarketOrder, self).__init__(price=price, 101 | step=step, 102 | average_execution_price=-1, 103 | order_type='market', 104 | ccy=ccy, 105 | side=side) 106 | 107 | def __str__(self): 108 | return "[MarketOrder] " + super(MarketOrder, self).__str__() 109 | 110 | 111 | class LimitOrder(Order): 112 | 113 | def __init__(self, ccy='BTC-USD', side='long', price=0.0, step=-1, queue_ahead=100.): 114 | super(LimitOrder, self).__init__(price=price, 115 | step=step, 116 | average_execution_price=-1., 117 | order_type='limit', 118 | ccy=ccy, 119 | side=side) 120 | self.queue_ahead = queue_ahead 121 | # print('LimitOrder_{}: [price={} | side={} | step={} | queue={}]'.format( 122 | # self.ccy, self.price, self.side, self.step, self.queue_ahead 123 | # )) 124 | 125 | def __str__(self): 126 | return "[LimitOrder] " + super(LimitOrder, self).__str__() 127 | 128 | def reduce_queue_ahead(self, executed_volume=100.) -> None: 129 | """ 130 | Subtract transactions from the queue ahead of the agent's open order in the 131 | LOB. 
This attribute is used to inform the agent how much notional volume is 132 | ahead of its open order. 133 | 134 | :param executed_volume: (float) notional volume of recent transaction 135 | :return: (void) 136 | """ 137 | self.queue_ahead -= executed_volume 138 | if self.queue_ahead < 0.: 139 | splash = 0. - self.queue_ahead 140 | self.queue_ahead = 0. 141 | self.process_executions(volume=splash) 142 | 143 | def process_executions(self, volume=100.) -> None: 144 | """ 145 | Apply transactions to the agent's open order (e.g., partial fills). 146 | 147 | :param volume: (float) notional volume of recent transaction 148 | :return: (void) 149 | """ 150 | self.executed += volume 151 | overflow = 0. 152 | if self.is_filled: 153 | overflow = self.executed - Order.DEFAULT_SIZE 154 | self.executed -= overflow 155 | 156 | _price = float(self.price) 157 | if _price in self.executions: 158 | self.executions[_price] += volume - overflow 159 | else: 160 | self.executions[_price] = volume - overflow 161 | 162 | def get_average_execution_price(self) -> float: 163 | """ 164 | Average execution price of an order. 165 | 166 | Note: agents can update a given order many times, thus a single order can have 167 | partial fills at many different prices. 168 | 169 | :return: (float) average execution price 170 | """ 171 | self.average_execution_price = sum( 172 | [notional_volume * price for price, notional_volume in 173 | self.executions.items()]) / self.DEFAULT_SIZE 174 | return round(self.average_execution_price, 2) 175 | 176 | @property 177 | def is_first_in_queue(self) -> bool: 178 | """ 179 | Determine if current order is first in line to be executed. 180 | 181 | :return: True if the order is the first in the queue 182 | """ 183 | return self.queue_ahead <= 0. 
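Note on the queue logic in `order.py`: `reduce_queue_ahead` first consumes the notional volume resting ahead of the order, and any excess ("splash") is passed to `process_executions`, which fills the order up to `Order.DEFAULT_SIZE` and discards the overflow. The standalone sketch below mirrors only that behaviour so it can be run without the rest of the package. `QueuedOrder` is a hypothetical stand-in, not the repository's `LimitOrder`, and the prices and volumes are made-up illustration values.

```python
class QueuedOrder:
    """Hypothetical stand-in mirroring only the queue/fill logic of LimitOrder."""

    SIZE = 1000.0  # plays the role of Order.DEFAULT_SIZE

    def __init__(self, price, queue_ahead):
        self.price = float(price)
        self.queue_ahead = float(queue_ahead)
        self.executed = 0.0
        self.executions = {}  # price -> notional volume filled at that price

    def reduce_queue_ahead(self, executed_volume):
        # Trades at this price level first consume the queue ahead of the order.
        self.queue_ahead -= executed_volume
        if self.queue_ahead < 0.0:
            splash = -self.queue_ahead  # volume left over once the queue is empty
            self.queue_ahead = 0.0
            self.process_executions(splash)

    def process_executions(self, volume):
        # Fill at most the remaining order size; overflow is discarded.
        fill = min(volume, self.SIZE - self.executed)
        self.executed += fill
        self.executions[self.price] = self.executions.get(self.price, 0.0) + fill

    @property
    def is_filled(self):
        return self.executed >= self.SIZE


if __name__ == "__main__":
    order = QueuedOrder(price=50.0, queue_ahead=100.0)
    order.reduce_queue_ahead(60.0)   # queue shrinks to 40, nothing filled yet
    order.reduce_queue_ahead(70.0)   # queue empties, 30 spills into the order
    print(order.queue_ahead, order.executed)  # 0.0 30.0
    order.process_executions(2000.0)          # capped at the order size
    print(order.executed, order.is_filled)    # 1000.0 True
```

Using `min` here is equivalent to the original overflow arithmetic as long as `executed` never exceeds the order size before the call, which the cap itself guarantees.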
184 | -------------------------------------------------------------------------------- /gym_trading/utils/plot_history.py: -------------------------------------------------------------------------------- 1 | import matplotlib.cm as cm 2 | import matplotlib.pyplot as plt 3 | import numpy as np 4 | import pandas as pd 5 | 6 | SMALL_SIZE = 12 7 | MEDIUM_SIZE = 14 8 | BIGGER_SIZE = 16 9 | plt.rc('font', size=SMALL_SIZE) # controls default text sizes 10 | plt.rc('axes', titlesize=SMALL_SIZE) # font size of the axes title 11 | plt.rc('axes', labelsize=MEDIUM_SIZE) # font size of the x and y labels 12 | plt.rc('xtick', labelsize=SMALL_SIZE) # font size of the tick labels 13 | plt.rc('ytick', labelsize=SMALL_SIZE) # font size of the tick labels 14 | plt.rc('legend', fontsize=SMALL_SIZE) # legend font size 15 | plt.rc('figure', titlesize=BIGGER_SIZE) # font size of the figure title 16 | 17 | 18 | def plot_observation_space(observation: np.ndarray, 19 | labels: str, 20 | save_filename: str or None = None) -> None: 21 | """ 22 | Represent all the observation spaces seen by the agent as one image. 23 | """ 24 | fig, ax = plt.subplots(figsize=(16, 10)) 25 | im = ax.imshow(observation, 26 | interpolation='none', 27 | cmap=cm.get_cmap('seismic'), 28 | origin='lower', 29 | aspect='auto', 30 | vmax=observation.max(), 31 | vmin=observation.min()) 32 | plt.xticks(range(len(labels)), labels, rotation='vertical') 33 | plt.tight_layout() 34 | 35 | if save_filename is None: 36 | plt.show() 37 | else: 38 | plt.savefig(f"{save_filename}_OBS.png") 39 | plt.close(fig) 40 | 41 | 42 | class Visualize(object): 43 | 44 | def __init__(self, 45 | columns: list or None, 46 | store_historical_observations: bool = True): 47 | """ 48 | Helper class to store episode performance. 
49 | 50 | :param columns: Column names (or labels) for rendering data 51 | :param store_historical_observations: if TRUE, store observation 52 | space for rendering as an image at the end of an episode 53 | """ 54 | self._data = list() 55 | self._columns = columns 56 | 57 | # Observation space for rendering 58 | self._store_historical_observations = store_historical_observations 59 | self._historical_observations = list() 60 | self.observation_labels = None 61 | 62 | def add_observation(self, obs: np.ndarray) -> None: 63 | """ 64 | Append current time step of observation to list for rendering 65 | observation space at the end of an episode. 66 | 67 | :param obs: Current time step observation from the environment 68 | """ 69 | if self._store_historical_observations: 70 | self._historical_observations.append(obs) 71 | 72 | def add(self, *args): 73 | """ 74 | Add time step to visualizer. 75 | 76 | :param args: midpoint, buy trades, sell trades 77 | :return: (void) 78 | """ 79 | self._data.append(args) 80 | 81 | def to_df(self) -> pd.DataFrame: 82 | """ 83 | Get episode history of prices and agent transactions in the form of a DataFrame. 84 | 85 | :return: DataFrame with episode history of prices and agent transactions 86 | """ 87 | return pd.DataFrame(data=self._data, columns=self._columns) 88 | 89 | def reset(self) -> None: 90 | """ 91 | Reset data for new episode. 
92 | """ 93 | self._data.clear() 94 | self._historical_observations.clear() 95 | 96 | def plot_episode_history(self, history: pd.DataFrame or None = None, 97 | save_filename: str or None = None) -> None: 98 | """ 99 | Plot this entire history of an episode including: 100 | 1) Midpoint prices with trade executions 101 | 2) Inventory count at every step 102 | 3) Realized PnL at every step 103 | 104 | :param history: data from past episode 105 | :param save_filename: Filename to save image as 106 | """ 107 | if isinstance(history, pd.DataFrame): 108 | data = history 109 | else: 110 | data = self.to_df() 111 | 112 | midpoints = data['midpoint'].values 113 | long_fills = data.loc[data['buys'] > 0., 'buys'].index.values 114 | short_fills = data.loc[data['sells'] > 0., 'sells'].index.values 115 | inventory = data['inventory'].values 116 | pnl = data['realized_pnl'].values 117 | 118 | heights = [6, 2, 2] 119 | widths = [14] 120 | gs_kw = dict(width_ratios=widths, height_ratios=heights) 121 | fig, axs = plt.subplots(nrows=len(heights), ncols=len(widths), 122 | sharex=True, 123 | figsize=(widths[0], int(sum(heights))), 124 | gridspec_kw=gs_kw) 125 | 126 | axs[0].plot(midpoints, label='midpoints', color='blue', alpha=0.6) 127 | axs[0].set_ylabel('Midpoint Price (USD)', color='black') 128 | 129 | # Redundant labeling for all computer compatibility 130 | axs[0].set_facecolor("w") 131 | axs[0].tick_params(axis='x', colors='black') 132 | axs[0].tick_params(axis='y', colors='black') 133 | axs[0].spines['top'].set_visible(True) 134 | axs[0].spines['right'].set_visible(True) 135 | axs[0].spines['bottom'].set_visible(True) 136 | axs[0].spines['left'].set_visible(True) 137 | axs[0].spines['top'].set_color("black") 138 | axs[0].spines['right'].set_color("black") 139 | axs[0].spines['bottom'].set_color("black") 140 | axs[0].spines['left'].set_color("black") 141 | axs[0].grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.5) 142 | 143 | axs[0].scatter(x=long_fills, 
y=midpoints[long_fills], label='buys', alpha=0.7, 144 | color='green', marker="^") 145 | 146 | axs[0].scatter(x=short_fills, y=midpoints[short_fills], label='sells', alpha=0.7, 147 | color='red', marker="v") 148 | 149 | axs[1].plot(inventory, label='inventory', color='orange') 150 | axs[1].axhline(0., color='grey') 151 | axs[1].set_ylabel('Inventory Count', color='black') 152 | axs[1].set_facecolor("w") 153 | axs[1].tick_params(axis='x', colors='black') 154 | axs[1].tick_params(axis='y', colors='black') 155 | axs[1].spines['top'].set_visible(True) 156 | axs[1].spines['right'].set_visible(True) 157 | axs[1].spines['bottom'].set_visible(True) 158 | axs[1].spines['left'].set_visible(True) 159 | axs[1].spines['top'].set_color("black") 160 | axs[1].spines['right'].set_color("black") 161 | axs[1].spines['bottom'].set_color("black") 162 | axs[1].spines['left'].set_color("black") 163 | axs[1].grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.5) 164 | 165 | axs[2].plot(pnl, label='Realized PnL', color='purple') 166 | axs[2].axhline(0., color='grey') 167 | axs[2].set_ylabel("PnL (%)", color='black') 168 | axs[2].set_xlabel('Number of steps (1 second each step)', color='black') 169 | # Redundant labeling for all computer compatibility 170 | axs[2].set_facecolor("w") 171 | axs[2].tick_params(axis='x', colors='black') 172 | axs[2].tick_params(axis='y', colors='black') 173 | axs[2].spines['top'].set_visible(True) 174 | axs[2].spines['right'].set_visible(True) 175 | axs[2].spines['bottom'].set_visible(True) 176 | axs[2].spines['left'].set_visible(True) 177 | axs[2].spines['top'].set_color("black") 178 | axs[2].spines['right'].set_color("black") 179 | axs[2].spines['bottom'].set_color("black") 180 | axs[2].spines['left'].set_color("black") 181 | axs[2].grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.5) 182 | plt.tight_layout() 183 | 184 | if save_filename is None: 185 | plt.show() 186 | else: 187 | plt.savefig(f"{save_filename}.png") 188 | plt.close(fig) 189 | 
190 | def plot_obs(self, save_filename: str or None = None) -> None: 191 | """ 192 | Represent all the observation spaces seen by the agent as one image. 193 | """ 194 | observations = np.asarray(self._historical_observations, dtype=np.float32) 195 | plot_observation_space(observation=observations, 196 | labels=self.observation_labels, 197 | save_filename=save_filename) 198 | -------------------------------------------------------------------------------- /gym_trading/utils/render_env.py: -------------------------------------------------------------------------------- 1 | import matplotlib.pyplot as plt 2 | import numpy as np 3 | 4 | 5 | class TradingGraph: 6 | """ 7 | A stock trading visualization using matplotlib 8 | made to render OpenAI gym environments 9 | """ 10 | plt.style.use('dark_background') 11 | 12 | def __init__(self, sym=None): 13 | # attributes for rendering 14 | self.sym = sym 15 | self.line1 = [] 16 | self.screen_size = 1000 17 | self.y_vec = None 18 | self.x_vec = np.linspace(0, self.screen_size * 10, 19 | self.screen_size + 1)[0:-1] 20 | 21 | def reset_render_data(self, y_vec): 22 | self.y_vec = y_vec 23 | self.line1 = [] 24 | 25 | def render(self, midpoint=100., mode='human'): 26 | if mode == 'human': 27 | self.line1 = self.live_plotter(self.x_vec, 28 | self.y_vec, 29 | self.line1, 30 | identifier=self.sym) 31 | self.y_vec = np.append(self.y_vec[1:], midpoint) 32 | 33 | @staticmethod 34 | def live_plotter(x_vec, y1_data, line1, identifier='Add Symbol Name', 35 | pause_time=0.00001): 36 | if not line1: 37 | # this is the call to matplotlib that allows dynamic plotting 38 | plt.ion() 39 | fig = plt.figure(figsize=(20, 12)) 40 | ax = fig.add_subplot(111) 41 | # create a variable for the line so we can later update it 42 | line1, = ax.plot(x_vec, y1_data, '-', label='midpoint', alpha=0.8) 43 | # update plot label/title 44 | plt.ylabel('Price') 45 | plt.legend() 46 | plt.title('Title: {}'.format(identifier)) 47 | plt.show(block=False) 48 | 49 | # 
after the figure, axis, and line are created, we only need to update the 50 | # y-data 51 | line1.set_ydata(y1_data) 52 | 53 | # adjust limits if new data goes beyond bounds 54 | if np.min(y1_data) <= line1.axes.get_ylim()[0] or \ 55 | np.max(y1_data) >= line1.axes.get_ylim()[1]: 56 | plt.ylim(np.min(y1_data), np.max(y1_data)) 57 | 58 | # this pauses the data so the figure/axis can catch up 59 | # - the amount of pause can be altered above 60 | plt.pause(pause_time) 61 | 62 | # return line so we can update it again in the next iteration 63 | return line1 64 | 65 | @staticmethod 66 | def close(): 67 | plt.close() 68 | -------------------------------------------------------------------------------- /gym_trading/utils/reward.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def default(inventory_count: int, midpoint_change: float) -> float: 5 | """ 6 | Default reward type for environments, which is derived from PnL and order quantity. 7 | 8 | The inputs are as follows: 9 | (1) Change in exposure value between time steps, in dollar terms; and, 10 | (2) Realized PnL from an open order being filled between time steps, 11 | in dollar terms. 12 | 13 | :param inventory_count: Number of open positions 14 | :param midpoint_change: percentage change in midpoint price 15 | :return: reward 16 | """ 17 | reward = inventory_count * midpoint_change 18 | return reward 19 | 20 | 21 | def default_with_fills(inventory_count: int, midpoint_change: float, step_pnl: float) -> float: 22 | """ 23 | Same as Default reward type for environments, but includes PnL from closing positions. 24 | 25 | The inputs are as follows: 26 | (1) Change in exposure value between time steps, in dollar terms; and, 27 | (2) Realized PnL from an open order being filled between time steps, 28 | in dollar terms. 
29 | 30 | :param inventory_count: Number of open positions 31 | :param midpoint_change: percentage change in midpoint price 32 | :param step_pnl: limit order pnl 33 | :return: reward 34 | """ 35 | reward = (inventory_count * midpoint_change) + step_pnl 36 | return reward 37 | 38 | 39 | def realized_pnl(current_pnl: float, last_pnl: float) -> float: 40 | """ 41 | Only provide reward signal when a trade is closed (round-trip). 42 | 43 | :param current_pnl: Realized PnL at current time step 44 | :param last_pnl: Realized PnL at former time step 45 | :return: reward 46 | """ 47 | reward = current_pnl - last_pnl 48 | return reward 49 | 50 | 51 | def differential_sharpe_ratio(R_t: float, A_tm1: float, B_tm1: float, 52 | eta: float = 0.01) -> (float, float, float): 53 | """ 54 | Method to calculate Differential Sharpe Ratio online. 55 | 56 | Source 1: http://www.cs.cmu.edu/afs/cs/project/link-3/lafferty/www/ml-stat-www/moody.pdf 57 | Source 2: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.87.8437&rep=rep1&type 58 | =pdf 59 | 60 | :param R_t: reward from current time step (midpoint price change a.k.a. 'price returns') 61 | :param A_tm1: A from previous time step 62 | :param B_tm1: B from previous time step 63 | :param eta: adaptation rate of the moving estimates (corresponds to 1 - alpha in indicators/ema.py) 64 | :return: (tuple) reward, A_t, and B_t 65 | """ 66 | if R_t == 0.: 67 | return 0., A_tm1, B_tm1 68 | 69 | reward = 0. 
70 | 71 | A_delta = R_t - A_tm1 72 | B_delta = R_t ** 2 - B_tm1 73 | 74 | A_t = A_tm1 + eta * A_delta 75 | B_t = B_tm1 + eta * B_delta 76 | 77 | nominator = B_tm1 * A_delta - (0.5 * A_tm1 * B_delta) 78 | denominator = (B_tm1 - A_tm1 ** 2) ** 1.5 79 | 80 | if np.isnan(nominator): 81 | return reward, A_t, B_t 82 | elif nominator == 0.: 83 | return reward, A_t, B_t 84 | elif denominator == 0.: 85 | return reward, A_t, B_t 86 | 87 | # scale down the feedback signal by 1/100th to avoid large spikes 88 | reward = (nominator / denominator) * 0.01 89 | 90 | return reward, A_t, B_t 91 | 92 | 93 | def asymmetrical(inventory_count: int, midpoint_change: float, half_spread_pct: float, 94 | long_filled: bool, short_filled: bool, step_pnl: float, 95 | dampening: float = 0.6) -> float: 96 | """ 97 | Asymmetrical reward type for environments, which is derived from percentage 98 | changes and notional values. 99 | 100 | The inputs are as follows: 101 | (1) Change in exposure value between time steps, in percentage terms; and, 102 | (2) Realized PnL from a open order being filled between time steps, 103 | in percentage. 104 | 105 | :param inventory_count: Number of open positions 106 | :param midpoint_change: Percentage change of midpoint between steps 107 | :param half_spread_pct: Percentage distance from bid/ask to midpoint 108 | :param long_filled: TRUE if long order is filled within same time step 109 | :param short_filled: TRUE if short order is filled within same time step 110 | :param step_pnl: limit order pnl and any penalties for bad actions 111 | :param dampening: discount factor towards pnl change between time steps 112 | :return: (float) reward 113 | """ 114 | exposure_change = inventory_count * midpoint_change 115 | fill_reward = 0. 
116 | 117 | if long_filled: 118 | fill_reward += half_spread_pct 119 | if short_filled: 120 | fill_reward += half_spread_pct 121 | 122 | reward = fill_reward + min(0., exposure_change * dampening) 123 | 124 | if long_filled or short_filled: 125 | reward += step_pnl 126 | 127 | return reward 128 | 129 | 130 | def trade_completion(step_pnl: float, market_order_fee: float, 131 | profit_ratio: float = 2.) -> float: 132 | """ 133 | Alternate approach for reward calculation which places greater importance on 134 | trades that have returned at least a 2:1 (by default) profit-to-loss ratio after 135 | transaction fees. 136 | 137 | :param step_pnl: limit order pnl and any penalties for bad actions 138 | :param market_order_fee: transaction fee for market orders 139 | :param profit_ratio: minimum profit-to-risk ratio to earn '1' point (e.g., 2x) 140 | :return: reward 141 | """ 142 | reward = 0.0 143 | 144 | if step_pnl > market_order_fee * profit_ratio: # e.g., 2:1 profit to loss ratio 145 | reward += 1.0 146 | elif step_pnl > 0.0: # Not a 2:1 PL ratio, but still a positive return 147 | reward += step_pnl 148 | elif step_pnl < -market_order_fee: # Loss is more than the transaction fee 149 | reward -= 1.0 150 | else: # Zero or negative PnL, with a loss no larger than the transaction fee 151 | reward += step_pnl 152 | 153 | return reward 154 | -------------------------------------------------------------------------------- /gym_trading/utils/statistic.py: -------------------------------------------------------------------------------- 1 | class TradeStatistics(object): 2 | """ 3 | Class for storing order metrics performed by the agent. 4 | """ 5 | 6 | def __init__(self): 7 | """ 8 | Instantiate the class. 
9 | """ 10 | self.orders_executed = 0 11 | self.orders_placed = 0 12 | self.orders_updated = 0 13 | self.market_orders = 0 14 | 15 | def __str__(self): 16 | return ('TradeStatistics:\n' 17 | 'orders_executed = \t{}\n' 18 | 'orders_placed = \t{}\n' 19 | 'orders_updated = \t{}\n' 20 | 'market_orders = \t{}').format( 21 | self.orders_executed, 22 | self.orders_placed, 23 | self.orders_updated, 24 | self.market_orders 25 | ) 26 | 27 | def reset(self) -> None: 28 | """ 29 | Reset all trackers. 30 | 31 | :return: (void) 32 | """ 33 | self.orders_executed = 0 34 | self.orders_placed = 0 35 | self.orders_updated = 0 36 | self.market_orders = 0 37 | 38 | 39 | class ExperimentStatistics(object): 40 | 41 | def __init__(self): 42 | """ 43 | Instantiate the class. 44 | """ 45 | self.reward = 0. 46 | self.number_of_episodes = 0 47 | 48 | def __str__(self): 49 | return 'ExperimentStatistics:\nreward\t=\t{:.4f}'.format(self.reward) + \ 50 | '\nNumber of Episodes\t=\t{:.4f}'.format(self.number_of_episodes) 51 | 52 | def reset(self) -> None: 53 | """ 54 | Reset all trackers. 55 | 56 | :return: (void) 57 | """ 58 | self.reward = 0. 
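A minimal usage sketch for the two trackers in `statistic.py`, under stated assumptions: the classes below are trimmed stand-ins that mirror the attributes listed above, and `run_episodes` is a hypothetical driver loop that is not part of the repository. It illustrates one detail of the listing: `ExperimentStatistics.reset()` clears `reward` but leaves `number_of_episodes` untouched, so the episode count accumulates across resets.

```python
class TradeStatistics:
    """Trimmed stand-in for the class listed above (same four counters)."""

    def __init__(self):
        self.reset()

    def reset(self) -> None:
        self.orders_executed = 0
        self.orders_placed = 0
        self.orders_updated = 0
        self.market_orders = 0


class ExperimentStatistics:
    """Trimmed stand-in: reset() clears the reward but keeps the episode count."""

    def __init__(self):
        self.reward = 0.
        self.number_of_episodes = 0

    def reset(self) -> None:
        self.reward = 0.


def run_episodes(n_episodes: int, steps: int):
    # Hypothetical driver loop (an assumption for illustration only).
    trades, experiment = TradeStatistics(), ExperimentStatistics()
    for _ in range(n_episodes):
        # Per-episode trackers start from zero at the top of each episode
        trades.reset()
        experiment.reset()
        for step in range(steps):
            trades.orders_placed += 1
            if step % 2 == 0:  # pretend every other order gets filled
                trades.orders_executed += 1
                experiment.reward += 0.5
        experiment.number_of_episodes += 1
    return trades, experiment


trades, experiment = run_episodes(n_episodes=3, steps=4)
print(trades.orders_placed, trades.orders_executed)
print(experiment.reward, experiment.number_of_episodes)
```

If the episode count should also restart, it would have to be zeroed explicitly, since `reset()` as listed does not do so.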
59 | -------------------------------------------------------------------------------- /indicators/__init__.py: -------------------------------------------------------------------------------- 1 | from indicators.ema import ExponentialMovingAverage, apply_ema_all_data, load_ema, reset_ema 2 | from indicators.indicator import IndicatorManager 3 | from indicators.rsi import RSI 4 | from indicators.tns import TnS 5 | -------------------------------------------------------------------------------- /indicators/ema.py: -------------------------------------------------------------------------------- 1 | from typing import List, Union 2 | 3 | import numpy as np 4 | import pandas as pd 5 | 6 | from configurations import LOGGER 7 | 8 | 9 | class ExponentialMovingAverage(object): 10 | __slots__ = ['alpha', '_value'] 11 | 12 | def __init__(self, alpha: float): 13 | """ 14 | Calculate Exponential moving average in O(1) time. 15 | 16 | :param alpha: decay factor, usually between 0.9 and 0.9999 17 | """ 18 | self.alpha = alpha 19 | self._value = None 20 | 21 | def __str__(self): 22 | return f'ExponentialMovingAverage: [ alpha={self.alpha} | value={self._value} ]' 23 | 24 | def step(self, value: float) -> None: 25 | """ 26 | Update EMA at every time step. 27 | 28 | :param value: price at current time step 29 | :return: (void) 30 | """ 31 | if self._value is None: 32 | self._value = value 33 | return 34 | 35 | self._value = (1. - self.alpha) * value + self.alpha * self._value 36 | 37 | @property 38 | def value(self) -> float: 39 | """ 40 | EMA value of data. 41 | 42 | :return: (float) EMA smoothed value 43 | """ 44 | return self._value 45 | 46 | def reset(self) -> None: 47 | """ 48 | Reset EMA. 49 | 50 | :return: (void) 51 | """ 52 | self._value = None 53 | 54 | 55 | def load_ema(alpha: Union[List[float], float, None]) -> \ 56 | Union[List[ExponentialMovingAverage], ExponentialMovingAverage, None]: 57 | """ 58 | Set exponential moving average smoother. 
59 | 60 | :param alpha: decay rate for EMA 61 | :return: (var) EMA 62 | """ 63 | if alpha is None: 64 | # print("EMA smoothing DISABLED") 65 | return None 66 | elif isinstance(alpha, float): 67 | LOGGER.info(f"EMA smoothing ENABLED: {alpha}") 68 | return ExponentialMovingAverage(alpha=alpha) 69 | elif isinstance(alpha, list): 70 | LOGGER.info(f"EMA smoothing ENABLED: {alpha}") 71 | return [ExponentialMovingAverage(alpha=a) for a in alpha] 72 | else: 73 | raise ValueError(f"_load_ema() --> unknown alpha type: {type(alpha)}") 74 | 75 | 76 | def apply_ema_all_data( 77 | ema: Union[List[ExponentialMovingAverage], ExponentialMovingAverage, None], 78 | data: pd.DataFrame) -> pd.DataFrame: 79 | """ 80 | Apply exponential moving average to entire data set in a single batch. 81 | 82 | :param ema: EMA handler; if None, no EMA is applied 83 | :param data: data set to smooth 84 | :return: (np.array) smoothed data set, if ema is provided 85 | """ 86 | if ema is None: 87 | return data 88 | 89 | smoothed_data = [] 90 | labels = data.columns.tolist() 91 | 92 | if isinstance(ema, ExponentialMovingAverage): 93 | LOGGER.info("Applying EMA to data...") 94 | for row in data.values: 95 | ema.step(value=row) 96 | smoothed_data.append(ema.value) 97 | smoothed_data = np.asarray(smoothed_data, dtype=np.float32) 98 | return pd.DataFrame(smoothed_data, columns=labels, index=data.index) 99 | elif isinstance(ema, list): 100 | LOGGER.info("Applying list of EMAs to data...") 101 | labels = [f'{label}_{e.alpha}' for e in ema for label in labels] 102 | for row in data.values: 103 | tmp_row = [] 104 | for e in ema: 105 | e.step(value=row) 106 | tmp_row.append(e.value) 107 | smoothed_data.append(tmp_row) 108 | smoothed_data = np.asarray(smoothed_data, dtype=np.float32).reshape( 109 | data.shape[0], -1) 110 | return pd.DataFrame(smoothed_data, columns=labels, index=data.index) 111 | else: 112 | raise ValueError(f"_apply_ema() --> unknown ema type: {type(ema)}") 113 | 114 | 115 | def reset_ema(ema: 
Union[List[ExponentialMovingAverage], ExponentialMovingAverage, None]) -> \ 116 | Union[List[ExponentialMovingAverage], ExponentialMovingAverage, None]: 117 | """ 118 | Reset the EMA smoother. 119 | 120 | :param ema: 121 | :return: 122 | """ 123 | if ema is None: 124 | pass 125 | elif isinstance(ema, ExponentialMovingAverage): 126 | ema.reset() 127 | LOGGER.info("Reset EMA data.") 128 | elif isinstance(ema, list): 129 | for e in ema: 130 | e.reset() 131 | LOGGER.info("Reset EMA data.") 132 | return ema 133 | -------------------------------------------------------------------------------- /indicators/indicator.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | from collections import deque 3 | from typing import List, Tuple, Union 4 | 5 | from configurations import INDICATOR_WINDOW 6 | from indicators.ema import ExponentialMovingAverage, load_ema 7 | 8 | 9 | class Indicator(ABC): 10 | 11 | def __init__(self, label: str, 12 | window: Union[int, None] = INDICATOR_WINDOW[0], 13 | alpha: Union[List[float], float, None] = None): 14 | """ 15 | Indicator constructor. 16 | 17 | :param window: (int) rolling window used for indicators 18 | :param alpha: (float) decay rate for EMA; if NONE, raw values returned 19 | """ 20 | self._label = f"{label}_{window}" 21 | self.window = window 22 | if self.window is not None: 23 | self.all_history_queue = deque(maxlen=self.window + 1) 24 | else: 25 | self.all_history_queue = deque(maxlen=2) 26 | self.ema = load_ema(alpha=alpha) 27 | self._value = 0. 28 | 29 | def __str__(self): 30 | return f'Indicator.base() [ window={self.window}, ' \ 31 | f'all_history_queue={self.all_history_queue}, ema={self.ema} ]' 32 | 33 | @abstractmethod 34 | def reset(self) -> None: 35 | """ 36 | Clear values in indicator cache. 37 | 38 | :return: (void) 39 | """ 40 | self._value = 0. 
41 | self.all_history_queue.clear() 42 | 43 | @abstractmethod 44 | def step(self, **kwargs) -> None: 45 | """ 46 | Update indicator with steps from the environment. 47 | 48 | :param kwargs: data values passed to indicators 49 | :return: (void) 50 | """ 51 | if self.ema is None: 52 | pass 53 | elif isinstance(self.ema, ExponentialMovingAverage): 54 | self.ema.step(**kwargs) 55 | elif isinstance(self.ema, list): 56 | for ema in self.ema: 57 | ema.step(**kwargs) 58 | else: 59 | pass 60 | 61 | @abstractmethod 62 | def calculate(self, *args, **kwargs) -> float: 63 | """ 64 | Calculate indicator value. 65 | 66 | :return: (float) value of indicator 67 | """ 68 | pass 69 | 70 | @property 71 | def value(self) -> Union[List[float], float]: 72 | """ 73 | Get indicator value for the current time step. 74 | 75 | :return: (float, or list of floats when a list of EMAs is set) 76 | """ 77 | if self.ema is None: 78 | return self._value 79 | elif isinstance(self.ema, ExponentialMovingAverage): 80 | return self.ema.value 81 | elif isinstance(self.ema, list): 82 | return [ema.value for ema in self.ema] 83 | else: 84 | return 0. 85 | 86 | @property 87 | def label(self) -> Union[List[str], str]: 88 | """ 89 | Get indicator label(s), suffixed with the EMA alpha when smoothing is enabled. 90 | 91 | :return: (str, or list of str when a list of EMAs is set) 92 | """ 93 | if self.ema is None: 94 | return self._label 95 | elif isinstance(self.ema, ExponentialMovingAverage): 96 | return f"{self._label}_{self.ema.alpha}" 97 | elif isinstance(self.ema, list): 98 | return [f"{self._label}_{ema.alpha}" for ema in self.ema] 99 | else: 100 | raise ValueError(f"Error: EMA provided not valid --> {self.ema}") 101 | 102 | @property 103 | def raw_value(self) -> float: 104 | """ 105 | Guaranteed raw value, even if EMA is enabled. 106 | 107 | :return: (float) raw indicator value 108 | """ 109 | return self._value 110 | 111 | @staticmethod 112 | def safe_divide(nom: float, denom: float) -> float: 113 | """ 114 | Safely perform divisions without throwing a 'divide by zero' exception. 
115 | 116 | :param nom: nominator 117 | :param denom: denominator 118 | :return: value 119 | """ 120 | if denom == 0.: 121 | return 0. 122 | elif nom == 0.: 123 | return 0. 124 | else: 125 | return nom / denom 126 | 127 | 128 | class IndicatorManager(object): 129 | __slots__ = ['indicators'] 130 | 131 | def __init__(self): 132 | """ 133 | Wrapper class to manage multiple indicators at the same time 134 | (e.g., window size stacking) 135 | 136 | # :param smooth_values: if TRUE, values returned are EMA smoothed, otherwise raw 137 | # values indicator values 138 | """ 139 | self.indicators = list() 140 | 141 | def get_labels(self) -> list: 142 | """ 143 | Get labels for each indicator being managed. 144 | 145 | :return: List of label names 146 | """ 147 | # return [label[0] for label in self.indicators] 148 | labels = [] 149 | for label, indicator in self.indicators: 150 | indicator_label = indicator.label 151 | if isinstance(indicator_label, list): 152 | labels.extend(indicator_label) 153 | else: 154 | labels.append(indicator_label) 155 | return labels 156 | 157 | def add(self, name_and_indicator: Tuple[str, Union[Indicator, ExponentialMovingAverage]]) \ 158 | -> None: 159 | """ 160 | Add indicator to the list to be managed. 161 | 162 | :param name_and_indicator: tuple(name, indicator) 163 | :return: (void) 164 | """ 165 | self.indicators.append(name_and_indicator) 166 | 167 | def delete(self, index: Union[int, None]) -> None: 168 | """ 169 | Delete an indicator from the manager. 170 | 171 | :param index: index to delete (int or str) 172 | :return: (void) 173 | """ 174 | if isinstance(index, int): 175 | del self.indicators[index] 176 | else: 177 | self.indicators.remove(index) 178 | 179 | def pop(self, index: Union[int, None]) -> Union[float, None]: 180 | """ 181 | Pop indicator from manager. 
182 | 183 | :param index: (int) index of indicator to pop 184 | :return: (name, indicator) 185 | """ 186 | if index is not None: 187 | return self.indicators.pop(index) 188 | else: 189 | return self.indicators.pop() 190 | 191 | def step(self, **kwargs) -> None: 192 | """ 193 | Update indicator with new step through environment. 194 | 195 | :param kwargs: Data passed to indicator for the update 196 | :return: 197 | """ 198 | for (name, indicator) in self.indicators: 199 | indicator.step(**kwargs) 200 | 201 | def reset(self) -> None: 202 | """ 203 | Reset all indicators being managed. 204 | 205 | :return: (void) 206 | """ 207 | for (name, indicator) in self.indicators: 208 | indicator.reset() 209 | 210 | def get_value(self) -> List[float]: 211 | """ 212 | Get all indicator values in the manager's inventory. 213 | 214 | :return: (list of floats) Indicator values for current time step 215 | """ 216 | values = [] 217 | for name, indicator in self.indicators: 218 | indicator_value = indicator.value 219 | if isinstance(indicator_value, list): 220 | values.extend(indicator_value) 221 | else: 222 | values.append(indicator_value) 223 | return values 224 | -------------------------------------------------------------------------------- /indicators/rsi.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | from indicators.indicator import Indicator 4 | 5 | 6 | class RSI(Indicator): 7 | """ 8 | Price change momentum indicator. Note: Scaled to [-1, 1] and not [0, 100]. 9 | """ 10 | 11 | def __init__(self, **kwargs): 12 | super().__init__(label='rsi', **kwargs) 13 | self.last_price = None 14 | self.ups = self.downs = 0. 15 | 16 | def __str__(self): 17 | return f"RSI: [ last_price = {self.last_price} | " \ 18 | f"ups = {self.ups} | downs = {self.downs} ]" 19 | 20 | def reset(self) -> None: 21 | """ 22 | Reset the indicator. 23 | 24 | :return: 25 | """ 26 | self.last_price = None 27 | self.ups = self.downs = 0. 
28 | super().reset() 29 | 30 | def step(self, price: float) -> None: 31 | """ 32 | Update indicator value incrementally. 33 | 34 | :param price: midpoint price 35 | :return: 36 | """ 37 | if self.last_price is None: 38 | self.last_price = price 39 | return 40 | 41 | if np.isnan(price): 42 | print(f'Error: RSI.step() -> price is {price}') 43 | return 44 | 45 | if price == 0.: 46 | price_pct_change = 0. 47 | elif self.last_price == 0.: 48 | price_pct_change = 0. 49 | else: 50 | price_pct_change = round((price / self.last_price) - 1., 6) 51 | 52 | if np.isinf(price_pct_change): 53 | price_pct_change = 0. 54 | 55 | self.last_price = price 56 | 57 | if price_pct_change > 0.: 58 | self.ups += price_pct_change 59 | else: 60 | self.downs += price_pct_change 61 | 62 | self.all_history_queue.append(price_pct_change) 63 | 64 | # only pop off items if queue is done warming up 65 | if len(self.all_history_queue) <= self.window: 66 | return 67 | 68 | price_to_remove = self.all_history_queue.popleft() 69 | 70 | if price_to_remove > 0.: 71 | self.ups -= price_to_remove 72 | else: 73 | self.downs -= price_to_remove 74 | 75 | # Save current time step value for EMA, in case smoothing is enabled 76 | self._value = self.calculate() 77 | super().step(value=self._value) 78 | 79 | def calculate(self) -> float: 80 | """ 81 | Calculate price momentum imbalance. 
82 | 83 | :return: imbalance in range of [-1, 1] 84 | """ 85 | mean_downs = abs(self.safe_divide(nom=self.downs, denom=self.window)) 86 | mean_ups = self.safe_divide(nom=self.ups, denom=self.window) 87 | gain = mean_ups - mean_downs 88 | loss = mean_ups + mean_downs 89 | return self.safe_divide(nom=gain, denom=loss) 90 | -------------------------------------------------------------------------------- /indicators/tests/test_indicators.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | import numpy as np 4 | 5 | from gym_trading.utils.decorator import print_time 6 | from indicators.indicator import IndicatorManager 7 | from indicators.rsi import RSI 8 | from indicators.tns import TnS 9 | 10 | 11 | class IndicatorTestCases(unittest.TestCase): 12 | 13 | @print_time 14 | def test_rsi_up(self): 15 | indicator = RSI(window=10) 16 | prices = np.linspace(1, 5, 50) 17 | indicator.step(price=0) 18 | for price in prices: 19 | indicator.step(price=price) 20 | indicator_value = indicator.value 21 | self.assertEqual(float(1), indicator_value, 22 | msg='indicator_value is {} and should be {}'.format( 23 | indicator_value, float(1))) 24 | 25 | @print_time 26 | def test_rsi_down(self): 27 | indicator = RSI(window=10) 28 | prices = np.linspace(1, 5, 50)[::-1] 29 | indicator.step(price=0) 30 | for price in prices: 31 | indicator.step(price=price) 32 | indicator_value = indicator.value 33 | self.assertEqual(float(-1), indicator_value, 34 | msg='indicator_value is {} and should be {}'.format( 35 | indicator_value, float(-1))) 36 | 37 | @print_time 38 | def test_tns_up(self): 39 | indicator = TnS(window=10) 40 | buys = [10] * 3 + [0] * 7 41 | sells = [0] * 10 42 | indicator.step(buys=0, sells=0) 43 | for buy, sell in zip(buys, sells): 44 | indicator.step(buys=buy, sells=sell) 45 | indicator_value = indicator.value 46 | self.assertEqual(float(1), indicator_value, 47 | msg='indicator_value is {} and should be {}'.format( 48 | 
indicator_value, float(1))) 49 | 50 | @print_time 51 | def test_tns_down(self): 52 | indicator = TnS(window=10) 53 | buys = [0] * 10 54 | sells = [0] * 7 + [10] * 3 55 | indicator.step(buys=0, sells=0) 56 | for buy, sell in zip(buys, sells): 57 | indicator.step(buys=buy, sells=sell) 58 | indicator_value = indicator.value 59 | self.assertEqual(float(-1), indicator_value, 60 | msg='indicator_value is {} and should be {}'.format( 61 | indicator_value, float(-1))) 62 | 63 | @print_time 64 | def test_indicator_manager(self): 65 | im = IndicatorManager() 66 | for i in range(2, 5): 67 | name = 'tns_{}'.format(i) 68 | print("adding {}".format(name)) 69 | im.add((name, TnS(window=i))) 70 | 71 | buys = [0] * 10 72 | sells = [0] * 7 + [10] * 3 73 | im.step(buys=0, sells=0) 74 | for buy, sell in zip(buys, sells): 75 | im.step(buys=buy, sells=sell) 76 | indicator_values = im.get_value() 77 | self.assertEqual([float(-1)] * 3, indicator_values, 78 | msg='indicator_value is {} and should be {}'.format( 79 | indicator_values, float(-1))) 80 | 81 | @print_time 82 | def test_exponential_moving_average(self): 83 | indicator_ema = RSI(window=10, alpha=0.99) 84 | indicator = RSI(window=10, alpha=None) 85 | prices = np.concatenate((np.linspace(1, 5, 20)[::-1], np.linspace(1, 5, 20)), 86 | axis=0) 87 | 88 | indicator.step(price=0) 89 | for price in prices: 90 | indicator.step(price=price) 91 | indicator_ema.step(price=price) 92 | indicator_value = indicator.value 93 | indicator_ema_value = indicator_ema.value 94 | print("indicator_value: {:.6f} | ema: {:.6f}".format(indicator_value, 95 | indicator_ema_value)) 96 | 97 | self.assertNotAlmostEqual(indicator_value, indicator_ema_value, 98 | msg='indicator_value is {} and should be {}'.format( 99 | indicator_value, indicator_ema_value)) 100 | 101 | self.assertNotAlmostEqual(1., indicator_ema_value, 102 | msg='indicator_ema_value is {} and should be {}'.format( 103 | indicator_ema_value, 1.)) 104 | 105 | self.assertAlmostEqual(1., 
indicator_value, 106 | msg='indicator_value is {} and should be {}'.format( 107 | indicator_value, 1.)) 108 | 109 | @print_time 110 | def test_manager_ema(self): 111 | manager = IndicatorManager() 112 | alpha = [0.99, 0.999, 0.9999] 113 | windows = [5, 15] 114 | 115 | for window in windows: 116 | manager.add((f'RSI_{window}', RSI(window=window, alpha=alpha))) 117 | 118 | data_set = np.cumsum(np.random.rand(1000) - 0.45) * 10. 119 | for i, data in enumerate(data_set): 120 | manager.step(price=data) 121 | if i < max(windows) + 1: 122 | continue 123 | tmp = np.asarray(manager.get_value()) 124 | print(f"tmp.shape -> {tmp.shape}") 125 | self.assertFalse(np.isnan(tmp).any(), msg=f'ERROR: NAN number in tmp: {tmp}') 126 | 127 | for i, window in enumerate(windows): 128 | print(f"window[{window}]\t= {tmp[i]}") 129 | 130 | print("Done.") 131 | 132 | 133 | if __name__ == '__main__': 134 | unittest.main() 135 | -------------------------------------------------------------------------------- /indicators/tns.py: -------------------------------------------------------------------------------- 1 | from indicators.indicator import Indicator 2 | 3 | 4 | class TnS(Indicator): 5 | """ 6 | Time and sales [trade flow] imbalance indicator 7 | """ 8 | 9 | def __init__(self, **kwargs): 10 | super().__init__(label='tns', **kwargs) 11 | self.ups = self.downs = 0. 12 | 13 | def __str__(self): 14 | return f"TNS: ups={self.ups} | downs={self.downs}" 15 | 16 | def reset(self) -> None: 17 | """ 18 | Reset indicator. 19 | """ 20 | self.ups = self.downs = 0. 21 | super().reset() 22 | 23 | def step(self, buys: float, sells: float) -> None: 24 | """ 25 | Update indicator with new transaction data. 
26 | 27 | :param buys: buy transactions 28 | :param sells: sell transactions 29 | """ 30 | self.ups += abs(buys) 31 | self.downs += abs(sells) 32 | self.all_history_queue.append((buys, sells)) 33 | 34 | # only pop off items if queue is done warming up 35 | if len(self.all_history_queue) <= self.window: 36 | return 37 | 38 | buys_, sells_ = self.all_history_queue.popleft() 39 | self.ups -= abs(buys_) 40 | self.downs -= abs(sells_) 41 | 42 | # Save current time step value for EMA, in case smoothing is enabled 43 | self._value = self.calculate() 44 | super().step(value=self._value) 45 | 46 | def calculate(self) -> float: 47 | """ 48 | Calculate trade flow imbalance. 49 | 50 | :return: imbalance in range of [-1, 1] 51 | """ 52 | gain = round(self.ups - self.downs, 6) 53 | loss = round(self.ups + self.downs, 6) 54 | return self.safe_divide(nom=gain, denom=loss) 55 | -------------------------------------------------------------------------------- /recorder.py: -------------------------------------------------------------------------------- 1 | import asyncio 2 | import time 3 | from datetime import datetime as dt 4 | from multiprocessing import Process 5 | from threading import Timer 6 | 7 | from configurations import BASKET, LOGGER, SNAPSHOT_RATE 8 | from data_recorder.bitfinex_connector.bitfinex_client import BitfinexClient 9 | from data_recorder.coinbase_connector.coinbase_client import CoinbaseClient 10 | 11 | 12 | class Recorder(Process): 13 | 14 | def __init__(self, symbols): 15 | """ 16 | Constructor of Recorder. 17 | 18 | :param symbols: basket of securities to record... 
            Example: symbols = ('BTC-USD', 'tBTCUSD')
        """
        super(Recorder, self).__init__()
        self.symbols = symbols
        self.timer_frequency = SNAPSHOT_RATE
        self.workers = dict()
        self.current_time = dt.now()
        self.daemon = False

    def run(self) -> None:
        """
        New process created to instantiate limit order books for
            (1) Coinbase Pro, and
            (2) Bitfinex.

        Connections to each exchange are made asynchronously using asyncio.

        :return: void
        """
        coinbase, bitfinex = self.symbols

        self.workers[coinbase] = CoinbaseClient(sym=coinbase)
        self.workers[bitfinex] = BitfinexClient(sym=bitfinex)

        self.workers[coinbase].start()
        self.workers[bitfinex].start()

        Timer(5.0, self.timer_worker,
              args=(self.workers[coinbase], self.workers[bitfinex],)).start()

        tasks = asyncio.gather(*[self.workers[sym].subscribe()
                                 for sym in self.workers.keys()])
        loop = asyncio.get_event_loop()
        LOGGER.info(f'Recorder: Gathered {len(self.workers.keys())} tasks')

        try:
            loop.run_until_complete(tasks)
            loop.close()
            for sym in self.workers.keys():
                self.workers[sym].join()
            LOGGER.info(f'Recorder: loop closed for {coinbase} and {bitfinex}.')

        except KeyboardInterrupt as e:
            LOGGER.info(f"Recorder: Caught keyboard interrupt. \n{e}")
            tasks.cancel()
            loop.close()
            for sym in self.workers.keys():
                self.workers[sym].join()

        finally:
            loop.close()
            LOGGER.info(f'Recorder: Finally done for {coinbase} and {bitfinex}.')

    def timer_worker(self,
                     coinbaseClient: CoinbaseClient,
                     bitfinexClient: BitfinexClient) -> None:
        """
        Thread worker to be invoked every N seconds (e.g., configurations.SNAPSHOT_RATE)

        :param coinbaseClient: CoinbaseClient
        :param bitfinexClient: BitfinexClient
        :return: void
        """
        Timer(self.timer_frequency, self.timer_worker,
              args=(coinbaseClient, bitfinexClient,)).start()
        self.current_time = dt.now()

        if coinbaseClient.book.done_warming_up and \
                bitfinexClient.book.done_warming_up:
            """
            This is the place to insert a trading model.
            You'll have to create your own.

            Example:
                orderbook_data = (coinbaseClient.book, bitfinexClient.book)
                model = agent.dqn.Agent()
                fix_api = SomeFixAPI()
                action = model(orderbook_data)
                if action is buy:
                    buy_order = create_order(pair, price, etc.)
                    fix_api.send_order(buy_order)

            """
            LOGGER.info(f'{coinbaseClient.sym} >> {coinbaseClient.book}')
            # The `render_book()` method returns a numpy array of the LOB's current state,
            # as well as resets the Order Flow Imbalance trackers.
            # The LOB snapshot is in a tabular format with columns as defined in
            # `render_lob_feature_names()`
            _ = coinbaseClient.book.render_book()
            _ = bitfinexClient.book.render_book()
        elif coinbaseClient.book.done_warming_up and not bitfinexClient.book.done_warming_up:
            LOGGER.info(f'Bitfinex - {bitfinexClient.sym} is warming up')
            _ = coinbaseClient.book.render_book()
        elif bitfinexClient.book.done_warming_up and not coinbaseClient.book.done_warming_up:
            LOGGER.info(f'Coinbase - {coinbaseClient.sym} is warming up')
            _ = bitfinexClient.book.render_book()
        else:
            LOGGER.info('Both Coinbase and Bitfinex are still warming up...')


def main():
    LOGGER.info(f'Starting recorder with basket = {BASKET}')
    for coinbase, bitfinex in BASKET:
        Recorder((coinbase, bitfinex)).start()
        LOGGER.info(f'Process started up for {coinbase}')
        time.sleep(9)


if __name__ == "__main__":
    """
    Entry point of application
    """
    main()

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
matplotlib
scikit-learn
requests
numpy
pandas
gym
sortedcontainers
websockets
h5py
more-itertools
pymongo
pytest
python-dateutil
pytz
scipy
tzlocal

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
import os

from setuptools import setup

cwd = os.path.dirname(os.path.realpath(__file__))

# Read one requirement per line, skipping blank lines
requirements_path = os.path.join(cwd, 'requirements.txt')
with open(requirements_path) as f:
    dependencies = [line.strip() for line in f if line.strip()]

# Resolve README.md relative to this file rather than the working directory
with open(os.path.join(cwd, 'README.md'), 'r') as f:
    long_description = f.read()

setup(name='crypto_rl',
      version='0.2.3',
      description='Cryptocurrency LOB trading environment in gym format.',
      long_description=long_description,
      author='Jonathan Sadighian',
      url='https://github.com/sadighian/crypto-rl',
      install_requires=dependencies,
      packages=['agent', 'data_recorder', 'gym_trading', 'indicators'])

--------------------------------------------------------------------------------
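The rolling trade flow imbalance that `TnS` in `/indicators/tns.py` computes can be sketched as a standalone function, which may help when checking the indicator by hand. This is a minimal illustration under stated assumptions, not the repository's implementation: `rolling_trade_flow_imbalance` and the local `safe_divide` are hypothetical names, the zero-denominator behavior of the real `Indicator.safe_divide` is assumed to be "return 0", and unlike `TnS.step` this sketch also reports a value while the window is still warming up.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple


def safe_divide(nom: float, denom: float) -> float:
    """Hypothetical stand-in: return nom / denom, or 0.0 if denom is zero."""
    return nom / denom if denom != 0. else 0.


def rolling_trade_flow_imbalance(trades: Iterable[Tuple[float, float]],
                                 window: int) -> List[float]:
    """
    Compute (buys - sells) / (buys + sells) over the last `window` steps.

    :param trades: iterable of (buys, sells) volumes, one pair per time step
    :param window: number of most recent steps to include
    :return: one imbalance value per step, each in the range [-1, 1]
    """
    history: Deque[Tuple[float, float]] = deque()
    ups = downs = 0.
    values: List[float] = []

    for buys, sells in trades:
        # Add the newest observation to the running totals
        ups += abs(buys)
        downs += abs(sells)
        history.append((buys, sells))

        # Once the window is exceeded, remove the oldest observation
        if len(history) > window:
            old_buys, old_sells = history.popleft()
            ups -= abs(old_buys)
            downs -= abs(old_sells)

        values.append(safe_divide(ups - downs, ups + downs))

    return values


if __name__ == '__main__':
    print(rolling_trade_flow_imbalance([(1., 0.), (0., 1.), (3., 1.)], window=2))
```

Keeping running totals and subtracting the observation that leaves the window makes each update O(1), which is the same bookkeeping idea `TnS.step` uses with `all_history_queue`.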