Tensor-Efficient High-Dimensional Q-learning
Abstract: High-dimensional reinforcement learning (RL) faces challenges with complex calculations and low sample efficiency in large state-action spaces. Q-learning algorithms struggle particularly with the curse of dimensionality, where the number of state-action pairs grows exponentially with problem size. While neural network-based approaches like Deep Q-Networks have shown success, they do not explicitly exploit problem structure. Many high-dimensional control tasks exhibit low-rank structure in their value functions, and tensor-based methods using low-rank decomposition offer parameter-efficient representations. However, existing tensor-based Q-learning methods focus on representation fidelity without leveraging this structure for exploration. We propose Tensor-Efficient Q-Learning (TEQL), which represents the Q-function as a low-rank CP tensor over discretized state-action spaces and exploits the tensor structure for uncertainty-aware exploration. TEQL incorporates Error-Uncertainty Guided Exploration (EUGE), which combines tensor approximation error with visit counts to guide action selection, along with frequency-aware regularization to stabilize updates. Under matched parameter budgets, experiments on classic control tasks demonstrate that TEQL outperforms both matrix-based low-rank methods and deep RL baselines in sample efficiency, making it suitable for resource-constrained applications where sampling costs are high.

Comments: 61 pages, 7 figures. v2 updated to include additional experimental results and refined proofs
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)
Cite as: arXiv:2511.03595 [cs.LG] (or arXiv:2511.03595v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2511.03595

Submission history
From: Junyi Wu
[v1] Wed, 5 Nov 2025 16:16:31 UTC (787 KB)
[v2] Mon, 6 Apr 2026 23:25:08 UTC (2,048 KB)
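The abstract's core idea, representing Q(s, a) as a low-rank CP (CANDECOMP/PARAFAC) tensor over a discretized state-action grid and adding an uncertainty bonus for exploration, can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's algorithm: the class name, factor initialization, and the simple count-based bonus (a UCB-style stand-in for the paper's EUGE, whose exact error-plus-count formula the abstract does not give) are all hypothetical.

```python
import numpy as np

class CPQFunction:
    """Hypothetical CP-format Q-function over a discretized state-action grid."""

    def __init__(self, dims, rank, seed=0):
        rng = np.random.default_rng(seed)
        # One factor matrix per tensor mode (each discretized state
        # dimension plus the action dimension); parameters grow as
        # rank * sum(dims) instead of prod(dims) for a full Q-table.
        self.factors = [0.1 * rng.standard_normal((d, rank)) for d in dims]
        self.counts = np.zeros(dims)  # visit counts per state-action cell

    def q(self, idx):
        # CP form: Q(idx) = sum_r prod_d U_d[idx_d, r]
        prod = np.ones(self.factors[0].shape[1])
        for U, i in zip(self.factors, idx):
            prod = prod * U[i]
        return float(prod.sum())

    def bonus(self, idx, c=1.0):
        # Count-based exploration bonus; the actual EUGE rule also
        # incorporates tensor approximation error (assumption here).
        return c / np.sqrt(1.0 + self.counts[idx])

    def select_action(self, state_idx, n_actions):
        # Greedy action over Q-value plus exploration bonus.
        scores = [self.q(state_idx + (a,)) + self.bonus(state_idx + (a,))
                  for a in range(n_actions)]
        return int(np.argmax(scores))

# Two 10-bin state dimensions, 4 actions, CP rank 3:
# (10 + 10 + 4) * 3 = 72 parameters versus 400 for a full table.
qf = CPQFunction(dims=(10, 10, 4), rank=3)
a = qf.select_action((2, 5), n_actions=4)
```

The parameter-budget comparison in the final comment is why such decompositions are called parameter-efficient: the cost scales linearly in the number of discretization bins per dimension rather than multiplicatively.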