Rollout
RolloutSimulator
RolloutSimulator is the simplest MDP or POMDP simulator. When simulate is called, it simulates a single trajectory of the process and returns the discounted reward.
rs = RolloutSimulator()
mdp = GridWorld()
policy = RandomPolicy(mdp)
r = simulate(rs, mdp, policy)

POMDPSimulators.RolloutSimulator (Type)
A fast simulator that just returns the reward.
The simulation will be terminated when either
- a terminal state is reached (as determined by isterminal()), or
- the accumulated discount (the discount factor raised to the number of steps taken) is as small as eps, or
- max_steps have been executed
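The three stopping conditions above can be sketched as a plain Julia loop. This is a minimal illustration that uses only Base and is not the library's implementation: the `rollout` function, its `transition`, `policy`, and `isterminal_fn` arguments, and the toy chain problem are hypothetical names introduced here, and the strict comparison `disc > eps` is an assumption.

```julia
# Hypothetical sketch of a single rollout; not the POMDPSimulators.jl source.
# transition(s, a) -> (next_state, reward), policy(s) -> action,
# isterminal_fn(s) -> Bool.
function rollout(transition, policy, isterminal_fn, s0;
                 discount=0.95, eps=0.0, max_steps=typemax(Int))
    s = s0
    disc = 1.0    # accumulated discount: discount^t after t steps
    total = 0.0   # discounted reward collected so far
    t = 0         # number of steps executed
    # Stop on a terminal state, a small accumulated discount, or the step cap.
    while !isterminal_fn(s) && disc > eps && t < max_steps
        a = policy(s)
        s, r = transition(s, a)
        total += disc * r
        disc *= discount
        t += 1
    end
    return total, t
end

# Toy deterministic chain: start in state 1, state 5 is terminal,
# and every move pays a reward of 1.
chain_step(s, a) = (s + a, 1.0)
always_right(s) = 1
at_end(s) = s >= 5

total, t = rollout(chain_step, always_right, at_end, 1; discount=0.5)
```

With a discount of 0.5 the chain reaches its terminal state after four steps, so the first condition ends the run; passing `max_steps=2` or `eps=0.3` instead makes one of the other two conditions fire first.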
Keyword Arguments:
- eps
- max_steps
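To get a feel for how eps bounds the length of a rollout, the sketch below counts how many steps fit before the accumulated discount drops to eps or below. `steps_until_cutoff` is a hypothetical helper written for this illustration, and it assumes the cutoff is checked before each step as `discount^t > eps`.

```julia
# Hypothetical helper: number of steps taken before discount^t <= eps.
function steps_until_cutoff(discount, eps)
    # With discount >= 1 or eps <= 0 the cutoff would never trigger.
    0 < discount < 1 || throw(ArgumentError("discount must be in (0, 1)"))
    eps > 0 || throw(ArgumentError("eps must be positive"))
    disc = 1.0
    t = 0
    while disc > eps
        disc *= discount
        t += 1
    end
    return t
end

# Accumulated discounts 1, 0.5, 0.25, 0.125 all exceed 0.1; 0.0625 does not.
n = steps_until_cutoff(0.5, 0.1)
```

A smaller eps or a discount closer to 1 lengthens the rollout, which is where max_steps becomes the practical limit.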
Usage (optional arguments in brackets):
ro = RolloutSimulator()
history = simulate(ro, pomdp, policy, [updater [, initbelief [, initstate]]])