Morimura, T., E. Uchibe, J. Yoshimoto, J. Peters and K. Doya: Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning. In: Neural Computation 22, 2, 342-376 (2010).
url: http://www.mitpressjournals.org/doi/pdf/10.1162/neco.2009.12-08-922
localid: 5904
Peters, J. and S. Schaal: Policy Learning for Motor Skills. In: Neural Information Processing: 14th International Conference ICONIP 2007 (2008) 233-242.
url: http://conf.lsse.kyutech.ac.jp/~iconip2007/iconip2007main.html
localid: 4869