Reinforcement Learning

1. Noisy networks for exploration

2. WU-UCT (Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search)
