Advanced Deep Reinforcement Learning System for Trade Execution: Part III: DQN Implementation
Photo by Steve Johnson on Unsplash.com
In the previous part of this series we discussed feature engineering for building a Deep Reinforcement Learning (DRL) system for trade execution. In this part, we are going to implement a DRL variant that uses Deep Q-Learning.
Suggested Reads:
Advanced Deep Reinforcement Learning System for Trade Execution: Part I: Foundation Concepts
Advanced Deep Reinforcement Learning System for Trade Execution: Part II: Feature Engineering
This story is solely for general information purposes, and should not be relied upon for trading recommendations or financial advice. Source code and information is provided for educational purposes only, and should not be relied upon to make an investment decision. Please review my full cautionary guidance before continuing.
What’s a Deep Q-Network?
The core idea of a Deep Q-Network (DQN) is to solve reinforcement learning problems by learning the optimal action-value function (the Q-function), which estimates the expected cumulative reward for taking a given action in a given state.
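Formally, the optimal Q-function satisfies the Bellman optimality equation, which the DQN's network is trained to approximate (here s is the current state, a the action, r the immediate reward, s' the next state, and γ the discount factor):

```latex
Q^{*}(s, a) = \mathbb{E}\left[\, r + \gamma \max_{a'} Q^{*}(s', a') \,\middle|\, s, a \right]
```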
The DQN uses a single neural network (NN) that takes the environment’s state (e.g. market data, balance and shares held) as input and outputs a Q-value for each possible action (e.g. buy, hold or sell). The DQN agent then selects the action with the highest Q-value.
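As a minimal sketch in PyTorch, the network below maps a state vector to one Q-value per action, and the agent picks the argmax. The state dimension, layer sizes and the buy/hold/sell action encoding are illustrative assumptions, not taken from the series:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps an environment state vector to one Q-value per action."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),  # one Q-value per action
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Greedy action selection: pick the action with the highest Q-value.
q_net = QNetwork(state_dim=10, n_actions=3)
state = torch.randn(1, 10)           # placeholder state (market features, balance, shares)
action = q_net(state).argmax(dim=1)  # e.g. 0 = buy, 1 = hold, 2 = sell (assumed encoding)
```

The key design point is that one forward pass scores all actions at once, so action selection is a single argmax rather than one network evaluation per action.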
The selected action is then applied to the environment: a buy or sell order, for example, is executed against the trading account balance at the current step. The new environment state is then observed and passed back to the DQN agent.
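This agent-environment loop can be sketched as follows. The `ToyTradingEnv` class here is a hypothetical stand-in (random states and rewards) so the loop is runnable; it is not the environment built in this series:

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyTradingEnv:
    """Hypothetical stand-in for a trading environment (illustration only)."""
    def __init__(self, state_dim: int = 10):
        self.state_dim = state_dim

    def reset(self):
        return rng.normal(size=self.state_dim)

    def step(self, action):
        # Apply the order (buy/hold/sell) to the account, advance one step,
        # then return the new state, the reward (e.g. P&L change) and a done flag.
        next_state = rng.normal(size=self.state_dim)
        reward = float(rng.normal())
        done = rng.random() < 0.01
        return next_state, reward, done

# The loop: select an action, apply it, observe the new state, repeat.
env = ToyTradingEnv()
state = env.reset()
for t in range(100):
    action = int(rng.integers(0, 3))   # stand-in for argmax over Q-values
    next_state, reward, done = env.step(action)
    state = next_state
    if done:
        state = env.reset()
```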
For the purpose of this tutorial we will not yet implement an Actor-Critic approach. In our DQN, the agent will both select actions (Actor role) and evaluate them (Critic role). We will implement the Actor-Critic concept in a future post.
The DQN also uses several features, such as Experience Replay and an Exploration vs. Exploitation strategy, which will be explained in the next sections.
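To preview both ideas, here is a minimal sketch of a replay buffer and an ε-greedy action selector. The capacity, batch size and ε value are illustrative assumptions:

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions and samples random minibatches for training,
    which breaks the correlation between consecutive market steps."""
    def __init__(self, capacity: int = 10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

def epsilon_greedy(q_values, epsilon: float, n_actions: int = 3) -> int:
    """Explore with probability epsilon, otherwise exploit the best Q-value."""
    if random.random() < epsilon:
        return random.randrange(n_actions)                       # explore
    return max(range(n_actions), key=lambda a: q_values[a])      # exploit

# Example usage with dummy transitions
buf = ReplayBuffer()
for i in range(64):
    buf.push([i], 0, 0.0, [i + 1], False)
batch = buf.sample(32)
action = epsilon_greedy([0.1, 0.5, -0.2], epsilon=0.1)
```

In practice ε typically starts near 1 (mostly random exploration) and decays over training toward a small value, so the agent gradually shifts from exploring the market environment to exploiting what it has learned.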