Morimura, T., E. Uchibe, J. Yoshimoto, J. Peters and K. Doya: Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning. In: Neural Computation 22, 2, 342-376 (2010).
url: http://www.mitpressjournals.org/doi/pdf/10.1162/neco.2009.12-08-922
localid: 5904
http://edoc.mpg.de