Accelerated Gradient Temporal Difference Learning

November 28, 2016 Β· Declared Dead Β· πŸ› AAAI Conference on Artificial Intelligence

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Yangchen Pan, Adam White, Martha White arXiv ID 1611.09328 Category cs.AI: Artificial Intelligence Cross-listed cs.LG, stat.ML Citations 29 Venue AAAI Conference on Artificial Intelligence Last Checked 4 months ago
Abstract
The family of temporal difference (TD) methods span a spectrum from computationally frugal linear methods like TD(Ξ») to data efficient least squares methods. Least square methods make the best use of available data directly computing the TD solution and thus do not require tuning a typically highly sensitive learning rate parameter, but require quadratic computation and storage. Recent algorithmic developments have yielded several sub-quadratic methods that use an approximation to the least squares TD solution, but incur bias. In this paper, we propose a new family of accelerated gradient TD (ATD) methods that (1) provide similar data efficiency benefits to least-squares methods, at a fraction of the computation and storage (2) significantly reduce parameter sensitivity compared to linear TD methods, and (3) are asymptotically unbiased. We illustrate these claims with a proof of convergence in expectation and experiments on several benchmark domains and a large-scale industrial energy allocation domain.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Artificial Intelligence

Died the same way β€” πŸ‘» Ghosted