Rollout

RolloutSimulator

RolloutSimulator is the simplest MDP or POMDP simulator. When simulate is called, it simulates a single trajectory of the process and returns the total discounted reward.

rs = RolloutSimulator()
mdp = GridWorld()
policy = RandomPolicy(mdp)

r = simulate(rs, mdp, policy)

More examples can be found in the POMDPExamples package.

POMDPSimulators.RolloutSimulator — Type

RolloutSimulator(rng, max_steps)
RolloutSimulator(; <keyword arguments>)

A fast simulator that returns only the total discounted reward.

The simulation will be terminated when any of the following occurs:

a terminal state is reached (as determined by isterminal()), or
the cumulative discount γᵗ becomes as small as eps, or
max_steps steps have been executed.
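
The three stopping conditions above can be sketched as a plain loop. This is a minimal illustration in base Julia, not the package's implementation: the ChainWalk type, the is_terminal and transition helpers, the rollout function, and the default values for eps and max_steps are all hypothetical names and choices made for this example.

```julia
using Random

# Hypothetical toy problem (not part of POMDPs.jl): a walk on the integers.
# Each step moves right with probability 0.8 and yields reward -1;
# any state at or beyond `goal` is terminal.
struct ChainWalk
    goal::Int
    discount::Float64
end

is_terminal(m::ChainWalk, s::Int) = s >= m.goal

function transition(m::ChainWalk, s::Int, rng::AbstractRNG)
    sp = rand(rng) < 0.8 ? s + 1 : s
    return sp, -1.0
end

# Simulate one trajectory and return its total discounted reward.
# The defaults for `eps` and `max_steps` are assumptions made for this sketch.
function rollout(m::ChainWalk, s::Int;
                 rng::AbstractRNG=Random.default_rng(),
                 eps::Float64=0.0,
                 max_steps::Int=typemax(Int))
    disc = 1.0    # cumulative discount γᵗ
    total = 0.0
    t = 0
    # Stop on a terminal state, a negligible discount, or the step limit.
    while !is_terminal(m, s) && disc > eps && t < max_steps
        s, r = transition(m, s, rng)
        total += disc * r
        disc *= m.discount
        t += 1
    end
    return total
end

r = rollout(ChainWalk(5, 0.95), 1; rng=MersenneTwister(1), max_steps=100)
```

Only a running sum and the current discount are kept, so no history is stored. That is what makes this style of simulator cheap compared with one that records every step.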

Keyword arguments:

rng: A random number generator to use.
eps: A small number; if γᵗ, where γ is the discount factor and t is the time step, becomes smaller than this, the simulation will be terminated.
max_steps: The maximum number of steps to simulate.
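
To get a feel for how eps relates to trajectory length: for 0 < γ < 1, the condition γᵗ < eps holds once t > log(eps) / log(γ). The helper below is a hypothetical convenience for this illustration (eps_horizon is not a package function), and it assumes the strict inequality described above.

```julia
# Smallest step count t with γ^t < eps, assuming 0 < γ < 1 and 0 < eps < 1.
# `eps_horizon` is an illustrative helper, not part of POMDPSimulators.
eps_horizon(γ, eps) = floor(Int, log(eps) / log(γ)) + 1

t = eps_horizon(0.95, 0.01)
```

With γ = 0.95 and eps = 0.01, log(eps) / log(γ) is about 89.8, so the discount first drops below eps at step 90. A rollout under these settings therefore needs a max_steps below 90 for the step limit to be the binding constraint.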

Usage (optional arguments in brackets):

ro = RolloutSimulator()
r = simulate(ro, pomdp, policy, [updater [, init_belief [, init_state]]])
