Max Planck Society - eDoc Server

http://edoc.mpg.de

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Authors: Morimura, T.; Uchibe, E.; Yoshimoto, J.; Peters, J.; Doya, K.
Date of Publication (YYYY-MM-DD): 2010-02
Title of Journal: Neural Computation
Volume: 22
Issue / Number: 2
Start Page: 342
End Page: 376
Document Type: Article
ID: 548419.0