Rollout

RolloutSimulator

RolloutSimulator is the simplest MDP or POMDP simulator. When simulate is called, it simulates a single trajectory of the process and returns the discounted reward.

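# Assumed package set; adjust to the versions you have installed
using POMDPs, POMDPModels, POMDPPolicies, POMDPSimulators

rs = RolloutSimulator()
mdp = GridWorld()
policy = RandomPolicy(mdp)

# Simulate one trajectory and return its discounted reward
r = simulate(rs, mdp, policy)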

A fast simulator that returns only the discounted reward of a trajectory.

The simulation will be terminated when either

  1. a terminal state is reached (as determined by isterminal()) or
  2. the discount factor is as small as eps or
  3. max_steps have been executed
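
The three stopping conditions above can be sketched in plain Julia without the library. This is a minimal illustration under stated assumptions, not the package's implementation: ChainProcess, is_terminal, advance, and rollout are hypothetical names introduced here, and the toy process simply moves one state to the right per step with a reward of 1.

```julia
# Hypothetical toy process (not part of the library): states 1..n,
# every step moves one state to the right and yields reward 1.
struct ChainProcess
    n::Int             # index of the terminal state
    discount::Float64  # discount factor applied after each step
end

is_terminal(p::ChainProcess, s::Int) = s >= p.n
advance(p::ChainProcess, s::Int) = (s + 1, 1.0)  # (next state, reward)

# Accumulate the discounted reward of one trajectory. The loop stops when
# a terminal state is reached, when the running discount has shrunk to
# eps or below, or when max_steps steps have been executed.
function rollout(p::ChainProcess, s::Int;
                 eps::Float64=0.0, max_steps::Int=typemax(Int))
    disc = 1.0    # running discount, multiplied by p.discount each step
    total = 0.0   # discounted reward accumulated so far
    steps = 0
    while !is_terminal(p, s) && disc > eps && steps < max_steps
        s, r = advance(p, s)
        total += disc * r
        disc *= p.discount
        steps += 1
    end
    return total
end

p = ChainProcess(4, 0.5)
rollout(p, 1)               # → 1.75 (stops at the terminal state)
rollout(p, 1; max_steps=2)  # → 1.5  (stops after two steps)
rollout(p, 1; eps=0.5)      # → 1.0  (stops once the discount reaches eps)
```

Each call ends on a different condition, which shows how eps and max_steps bound the length of a rollout even when no terminal state is reached.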

Keyword Arguments:

  - eps
  - max_steps

Usage (optional arguments in brackets):

ro = RolloutSimulator()
r = simulate(ro, pomdp, policy, [updater [, initbelief [, initstate]]])
