Correlated Attention in Transformers for Multivariate Time Series

November 20, 2023 · Declared Dead · 🏛 IEEE International Conference on Acoustics, Speech, and Signal Processing

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Quang Minh Nguyen, Lam M. Nguyen, Subhro Das arXiv ID 2311.11959 Category cs.LG: Machine Learning Cross-listed cs.AI Citations 6 Venue IEEE International Conference on Acoustics, Speech, and Signal Processing Last Checked 4 months ago

Abstract

Multivariate time series (MTS) analysis prevails in real-world applications such as finance, climate science and healthcare. The various self-attention mechanisms, the backbone of the state-of-the-art Transformer-based models, efficiently discover the temporal dependencies, yet cannot well capture the intricate cross-correlation between different features of MTS data, which inherently stems from complex dynamical systems in practice. To this end, we propose a novel correlated attention mechanism, which not only efficiently captures feature-wise dependencies, but can also be seamlessly integrated within the encoder blocks of existing well-known Transformers to gain efficiency improvement. In particular, correlated attention operates across feature channels to compute cross-covariance matrices between queries and keys with different lag values, and selectively aggregate representations at the sub-series level. This architecture facilitates automated discovery and representation learning of not only instantaneous but also lagged cross-correlations, while inherently capturing time series auto-correlation. When combined with prevalent Transformer baselines, correlated attention mechanism constitutes a better alternative for encoder-only architectures, which are suitable for a wide range of tasks including imputation, anomaly detection and classification. Extensive experiments on the aforementioned tasks consistently underscore the advantages of correlated attention mechanism in enhancing base Transformer models, and demonstrate our state-of-the-art results in imputation, anomaly detection and classification.