Rollout

RolloutSimulator

RolloutSimulator is the simplest MDP or POMDP simulator. When simulate is called, it simply simulates a single trajectory of the process and returns the discounted reward.

rs = RolloutSimulator()
mdp = GridWorld()
policy = RandomPolicy(mdp)

r = simulate(rs, mdp, policy)

More examples can be found in the POMDPExamples Package

POMDPSimulators.RolloutSimulatorType
RolloutSimulator(rng, max_steps)
RolloutSimulator(; <keyword arguments>)

A fast simulator that just returns the reward

The simulation will be terminated when either

  1. a terminal state is reached (as determined by isterminal() or
  2. the discount factor is as small as eps or
  3. max_steps have been executed

Keyword arguments:

  • rng: A random number generator to use.
  • eps: A small number; if γᵗ where γ is the discount factor and t is the time step becomes smaller than this, the simulation will be terminated.
  • max_steps: The maximum number of steps to simulate.

Usage (optional arguments in brackets):

ro = RolloutSimulator()
history = simulate(ro, pomdp, policy, [updater [, init_belief [, init_state]]])

See also: HistoryRecorder, run_parallel

source