Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture


Abstract

We introduce the Momentum Transformer, an attention-based deep learning architecture which outperforms benchmark momentum and mean-reversion trading strategies. Unlike state-of-the-art Long Short-Term Memory (LSTM) architectures, which are sequential in nature, the attention mechanism provides our architecture with a direct connection to all previous time-steps. Our architecture enables us to learn longer-term dependencies, improves performance when considering returns net of transaction costs and naturally adapts to new market regimes, such as during the SARS-CoV-2 crisis. The Momentum Transformer is inherently interpretable, providing us with greater insights into our deep learning momentum trading strategy, including how it blends different classical strategies and the past time-steps which are of the greatest significance to the model.
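
To make the "direct connection to all previous time-steps" concrete, here is a minimal sketch of causally masked scaled dot-product attention weights in plain Python. It is an illustration only, not the Momentum Transformer implementation: the function name `causal_attention_weights` and the toy `features` values are hypothetical, and the sketch omits the learned query/key projections, multiple heads, and every other component of the full architecture.

```python
import math

def causal_attention_weights(queries, keys):
    """Scaled dot-product attention weights with a causal mask.

    queries, keys: lists of equal-length float vectors, one per time-step.
    Returns a T x T matrix where row t holds the weights that step t places
    on steps 0..t; weights on future steps are exactly zero.
    """
    dim = len(queries[0])
    weights = []
    for t, q in enumerate(queries):
        # Score only the current and earlier steps (the causal mask).
        scores = [
            sum(a * b for a, b in zip(q, keys[s])) / math.sqrt(dim)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible steps.
        peak = max(scores)
        exps = [math.exp(x - peak) for x in scores]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (len(keys) - t - 1)
        weights.append(row)
    return weights

# Toy inputs (assumed values): four time-steps with two features each.
features = [[0.1, 0.3], [0.0, -0.2], [0.5, 0.1], [-0.4, 0.2]]
weights = causal_attention_weights(features, features)
```

Each row of `weights` assigns non-zero weight to every earlier step in a single operation, whereas a recurrent model such as an LSTM must carry that information forward through its state one step at a time. Inspecting rows of this kind is also one way to see which past time-steps a model attends to.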

Kieran Wood
DPhil in Machine Learning

My research interests include deep learning for time-series forecasting, momentum trading and Bayesian deep learning.