Rollout
RolloutSimulator
RolloutSimulator is the simplest MDP or POMDP simulator. When simulate is called, it simply simulates a single trajectory of the process and returns the discounted reward.
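using POMDPs, POMDPSimulators      # simulate, RolloutSimulator
using POMDPModels, POMDPPolicies   # GridWorld, RandomPolicy

rs = RolloutSimulator()
mdp = GridWorld()
policy = RandomPolicy(mdp)
r = simulate(rs, mdp, policy)      # discounted reward of one simulated trajectory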
More examples can be found in the POMDPExamples package.
POMDPSimulators.RolloutSimulator (Type)
RolloutSimulator(rng, max_steps)
RolloutSimulator(; <keyword arguments>)
A fast simulator that just returns the reward.
The simulation will be terminated when either
- a terminal state is reached (as determined by isterminal()), or
- the accumulated discount γᵗ drops below eps, or
- max_steps have been executed.
Keyword arguments:
- rng: A random number generator to use.
- eps: A small number. If γᵗ, where γ is the discount factor and t is the time step, becomes smaller than this, the simulation will be terminated.
- max_steps: The maximum number of steps to simulate.
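The three stopping conditions and the role of eps and max_steps can be illustrated with a minimal plain-Julia sketch. It does not use POMDPSimulators and is not the package's implementation: ToyChain, is_terminal, step_chain, and rollout are hypothetical names invented here, and the toy problem (a random walk earning 1.0 per step) is an assumption chosen only to keep the example self-contained.

```julia
using Random

# Hypothetical toy problem (not part of POMDPs.jl): a walk over states
# 1..n_states that earns reward 1.0 per step and ends at the last state.
struct ToyChain
    n_states::Int
    discount::Float64
end

is_terminal(m::ToyChain, s::Int) = s == m.n_states

# Move right with probability 0.5, otherwise stay; every step earns 1.0.
function step_chain(m::ToyChain, s::Int, rng::AbstractRNG)
    sp = rand(rng) < 0.5 ? s + 1 : s
    return sp, 1.0
end

# One rollout: accumulate discounted reward until any stopping condition holds.
function rollout(m::ToyChain, s::Int; rng::AbstractRNG=Random.default_rng(),
                 eps::Float64=0.0, max_steps::Int=typemax(Int))
    disc = 1.0    # accumulated discount γᵗ
    total = 0.0   # discounted reward collected so far
    steps = 0
    while !is_terminal(m, s) && disc > eps && steps < max_steps
        s, r = step_chain(m, s, rng)
        total += disc * r
        disc *= m.discount
        steps += 1
    end
    return total
end

# Stops after 3 steps because of max_steps: 1 + 0.5 + 0.25
println(rollout(ToyChain(100, 0.5), 1; max_steps=3))
# Stops once the accumulated discount (0.125) is no longer above eps
println(rollout(ToyChain(100, 0.5), 1; eps=0.2))
```

In this sketch both calls collect exactly three rewards, so the result does not depend on the random number generator. Passing an explicit rng only matters when the trajectory length depends on the sampled transitions.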
Usage (optional arguments in brackets):
ro = RolloutSimulator()
r = simulate(ro, pomdp, policy, [updater [, init_belief [, init_state]]])
See also: HistoryRecorder, run_parallel