RolloutSimulator is the simplest MDP or POMDP simulator. When
simulate is called, it simulates a single trajectory of the process and returns the total discounted reward.
    rs = RolloutSimulator()
    mdp = GridWorld()
    policy = RandomPolicy(mdp)
    r = simulate(rs, mdp, policy)
More examples can be found in the POMDPExamples package.
    RolloutSimulator(rng, max_steps)
    RolloutSimulator(; <keyword arguments>)
A fast simulator that returns only the discounted reward.
The simulation terminates as soon as any of the following holds:
- a terminal state is reached (as determined by
- the discount factor is as small as eps
- max_steps have been executed
Keyword arguments:
- rng: A random number generator to use.
- eps: A small number; if γᵗ, where γ is the discount factor and t is the time step, becomes smaller than this, the simulation is terminated.
- max_steps: The maximum number of steps to simulate.
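The three stopping conditions and the role of eps can be illustrated with a small, self-contained sketch in plain Julia. It does not use or reproduce the package's implementation: ToyWalk, is_terminal, toy_step, and rollout are hypothetical names invented for this example, and the defaults shown for eps and max_steps are assumptions made for the sketch.

```julia
using Random

# Hypothetical toy process (not part of POMDPs.jl): a walk on the integers
# in which state `n` is terminal and every step yields a reward of -1.
struct ToyWalk
    n::Int              # index of the terminal state
    discount::Float64   # discount factor γ
end

is_terminal(p::ToyWalk, s::Int) = s == p.n

# One transition: move right with probability 0.7, otherwise stay put.
function toy_step(p::ToyWalk, s::Int, rng::AbstractRNG)
    sp = rand(rng) < 0.7 ? s + 1 : s
    return sp, -1.0
end

# Sketch of a rollout: accumulate the discounted reward until a terminal
# state is reached, γᵗ is no longer above `eps`, or `max_steps` steps have run.
function rollout(p::ToyWalk, s::Int;
                 rng::AbstractRNG = Random.default_rng(),
                 eps::Float64 = 0.0,
                 max_steps::Int = typemax(Int))
    disc = 1.0    # running value of γᵗ
    total = 0.0   # discounted reward accumulated so far
    t = 0         # number of steps executed
    while !is_terminal(p, s) && disc > eps && t < max_steps
        s, r = toy_step(p, s, rng)
        total += disc * r
        disc *= p.discount
        t += 1
    end
    return total, t
end

# Example calls: one bounded by max_steps, one bounded by eps.
total_a, steps_a = rollout(ToyWalk(100, 0.9), 1; max_steps = 3)
total_b, steps_b = rollout(ToyWalk(100, 0.5), 1; eps = 0.1)
```

In this sketch the eps cutoff bounds the horizon at the smallest t for which γᵗ is no larger than eps, that is, t = ⌈log(eps) / log(γ)⌉. With γ = 0.95 and eps = 0.01, for instance, that is roughly 90 steps, so eps can serve as a stopping rule even when max_steps is left large.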
Usage (optional arguments in brackets):
    ro = RolloutSimulator()
    r = simulate(ro, pomdp, policy, [updater [, init_belief [, init_state]]])