Denoising Stock Price Data with Discrete Wavelet Transform (Python Tutorial)
Data sourced with Yahoo Finance
While reading the article “Stock Forecasting using M-Band Wavelet-Based SVR and RNN-LSTMs Models” by Nguyen, Rahimyar and Wang, I learned something so obvious in hindsight that it made me wonder why I hadn’t used this approach before. Starting on page 5 is a set of tables that compare result metrics for different ML approaches to stock price forecasting, with and without denoised data. The bottom line is this: in those tables, denoised data has a lower error rate across the different forecasting models than noisy raw data.
So I decided to learn their approach of using the Discrete Wavelet Transform (DWT) to denoise stock price data, so that I can use it in the future to prepare data for ML price prediction. That’s what this post is about.
This story is solely for general information purposes, and should not be relied upon for trading recommendations or financial advice. Source code and information is provided for educational purposes only, and should not be relied upon to make an investment decision. Please review my full cautionary guidance before continuing.
What is Discrete Wavelet Transform?
A deep dive into Wavelet Transform theory would exceed the scope of this post, but here is a high-level overview:
Discrete Wavelet Transform (DWT) is an approach used in mathematics, statistics, and signal processing (and trading!).
A “Wavelet” is a mathematical function: a small wave of limited duration. Scaled and shifted copies of it are used to represent a signal - or in our case a price curve - and to decompose that signal into components with different frequency bands.
The original function used in the analysis is called the “Mother Wavelet” (ψ), whereas the function used to create an approximation of the signal is called the “Scaling Function” (φ).
Here are the high-level steps that are taken during the DWT process:
The stock price signal is passed through a series of filters to decompose it into its approximation (A) and detail (D) components.
The approximation component represents the low-frequency part of the signal. It is obtained by passing the signal through a low-pass filter and then down-sampling the result by a factor of 2 (i.e., keeping every second sample). This component captures the general trend or smooth variations in the data, and it can be further decomposed into next-level (coarser) approximation and detail components.
The detail component represents the high-frequency part of the signal. It is obtained by passing the signal through a high-pass filter and then down-sampling the result by a factor of 2. This component captures the rapid changes, edges, or transient features in the signal.
In a Multi-Resolution Analysis, the decomposition is applied repeatedly to the approximation coefficients. At each level of decomposition, a clearer view of the underlying trend (approximation coefficients) and of the noise and short-term fluctuations (detail coefficients) should become apparent.
When using Thresholding, a thresholding value is applied to the detail coefficients. Thresholding sets the small detail coefficients (presumed to be noise) to zero, keeping only the significant detail coefficients.
In the Reconstruction phase, the thresholded detail coefficients are combined with the unmodified approximation coefficients to reconstruct a denoised stock price “signal”. This is done by using an inverse DWT process.
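To make the steps above concrete, here is a minimal hand-rolled sketch that uses the Haar wavelet, the simplest wavelet there is, and only the Python standard library. It is meant as an illustration of decomposition, thresholding and reconstruction, not as the method from the paper. The function names, the sample prices and the threshold value of 0.25 are all made up for this example.

```python
import math

SCALE = 1 / math.sqrt(2)  # Haar filter coefficient

def haar_dwt(signal):
    """One DWT level: filter neighbouring samples, keep one value per pair."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * SCALE for a, b in pairs]  # low-pass + down-sample
    detail = [(a - b) * SCALE for a, b in pairs]  # high-pass + down-sample
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild two samples from each (A, D) pair."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) * SCALE, (a - d) * SCALE])
    return signal

def hard_threshold(coeffs, threshold):
    """Set coefficients smaller than the threshold (presumed noise) to zero."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

# Hypothetical closing prices (even length so the samples pair up cleanly)
prices = [100.0, 100.4, 101.0, 100.7, 102.0, 102.3, 103.0, 102.6]

# Decomposition into approximation (A) and detail (D) coefficients
approx, detail = haar_dwt(prices)

# Multi-resolution analysis: decompose the approximation once more
approx2, detail2 = haar_dwt(approx)

# Thresholding of the level-1 detail coefficients (0.25 is an arbitrary value)
detail_thresholded = hard_threshold(detail, threshold=0.25)

# Reconstruction from the original A and the thresholded D
denoised = haar_idwt(approx, detail_thresholded)
print(denoised)
```

Wherever a detail coefficient was set to zero, the two neighbouring prices collapse to their average, which is the smoothing effect we are after. Longer wavelets such as ‘db6’ spread this smoothing over more samples.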
To learn more about DWT, check out this post by sciencedirect.com.
What is M-Band Wavelet Transform?
The article I referenced in the introduction mentioned an M-Band Wavelet Transform (Multi-band). Unlike the standard DWT which uses 2 bands, the M-Band Wavelet Transform decomposes the signal into M different bands at each level.
The M-Band Transform is a more sophisticated and more fine-grained denoising method, but I haven’t found a Python library that implements it. If anyone knows of one, please leave a comment. During my testing I found that conventional DWT provided pretty good results when using the ‘db6’ wavelet in combination with a low threshold scale.
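As a rough sketch of what “‘db6’ with a low threshold scale” could look like, the snippet below uses the PyWavelets library (introduced in the next section) on synthetic data. The helper name denoise_prices, the threshold_scale parameter and the rule “threshold = scale × largest detail coefficient” are my assumptions for illustration. They are one common way to set a threshold, not necessarily the one used in the paper.

```python
import numpy as np
import pywt

def denoise_prices(prices, wavelet="db6", threshold_scale=0.1):
    """Multilevel DWT denoising sketch (threshold rule is an assumption)."""
    prices = np.asarray(prices, dtype=float)

    # coeffs[0] holds the coarsest approximation, coeffs[1:] the details
    coeffs = pywt.wavedec(prices, wavelet)

    cleaned = [coeffs[0]]  # keep the approximation untouched
    for detail in coeffs[1:]:
        # Assumed rule: a fraction of the largest detail magnitude per level
        threshold = threshold_scale * np.max(np.abs(detail))
        cleaned.append(pywt.threshold(detail, threshold, mode="soft"))

    # Inverse transform; trim in case padding added an extra sample
    rebuilt = pywt.waverec(cleaned, wavelet)
    return rebuilt[: len(prices)]

# Synthetic stand-in for a price series: a random walk plus noise
rng = np.random.default_rng(42)
trend = 100.0 + np.cumsum(rng.normal(0.0, 0.5, size=256))
noisy_prices = trend + rng.normal(0.0, 1.0, size=256)

denoised_prices = denoise_prices(noisy_prices)
```

A larger threshold_scale removes more of the short-term fluctuations, at the risk of also flattening real price moves, which is why a low value is a sensible starting point.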
What is PyWavelets?
PyWavelets is an open source wavelet transform library for Python. PyWavelets provides a large number of built-in wavelet filters and supports various types of wavelet transforms.
The library offers the following features:
Multilevel Decomposition
2D Transform Support
Signal Extension Modes
Inverse Transform
Support for Real and Complex Data
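The features listed above can be tried out in a few lines. This is only a quick tour on a made-up test signal. The wavelets (‘db2’, ‘haar’) and the decomposition level are arbitrary choices for the example.

```python
import numpy as np
import pywt

# Built-in wavelet families and signal extension modes
print(pywt.families())
print(pywt.wavelist("db")[:6])
print(pywt.Modes.modes)

# Made-up 1D test signal
signal = np.linspace(0.0, 1.0, 64) ** 2

# Multilevel decomposition and the matching inverse transform
coeffs = pywt.wavedec(signal, "db2", mode="symmetric", level=3)
restored = pywt.waverec(coeffs, "db2", mode="symmetric")

# 2D transform: approximation plus horizontal/vertical/diagonal details
image = np.outer(signal, signal)
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")

# Complex input yields complex coefficients
complex_signal = signal + 1j * signal[::-1]
cA_c, cD_c = pywt.dwt(complex_signal, "haar")
```

Without any thresholding in between, the inverse transform returns the original signal up to floating-point error, which is a handy sanity check before modifying any coefficients.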
For more information about PyWavelets, check out the documentation page.
Here is the link to the GitHub page.
In this tutorial we are going to use PyWavelets DWT for denoising stock price data.