Stochastic Policies

Types for representing randomized policies:

StochasticPolicy samples actions from an arbitrary distribution.
UniformRandomPolicy samples actions uniformly at random (see RandomPolicy for a similar use).
CategoricalTabularPolicy samples actions from a categorical distribution with weights given by a ValuePolicy.
EpsGreedyPolicy uses epsilon-greedy action selection.

StochasticPolicy{D, RNG <: AbstractRNG}

Represents a stochastic policy. Actions are sampled from an arbitrary distribution.

Constructor:

`StochasticPolicy(distribution; rng=Random.GLOBAL_RNG)`
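To make the idea of sampling actions from an arbitrary distribution concrete, here is a minimal, self-contained sketch that uses only Julia's `Random` standard library. The names `SketchStochasticPolicy` and `sample_action`, and the field names, are hypothetical stand-ins chosen for illustration; this is not the package's own definition, and it uses `Random.default_rng()` as the default generator as an assumption.

```julia
using Random

# Hypothetical stand-in for illustration; not the package's own type.
struct SketchStochasticPolicy{D, RNG <: AbstractRNG}
    distribution::D   # anything that `rand(rng, distribution)` can sample from
    rng::RNG          # random number generator used for sampling
end

# Keyword constructor with the same shape as the documented signature.
SketchStochasticPolicy(distribution; rng=Random.default_rng()) =
    SketchStochasticPolicy(distribution, rng)

# The state argument is ignored: the action depends only on the distribution.
sample_action(p::SketchStochasticPolicy, s) = rand(p.rng, p.distribution)

# Usage: a vector of actions acts as a uniform distribution over its elements.
policy = SketchStochasticPolicy([:left, :right, :stay]; rng=MersenneTwister(1))
actions = [sample_action(policy, nothing) for _ in 1:5]
```

Passing a seeded generator such as `MersenneTwister(1)` makes the sampled sequence reproducible, which is the usual reason for exposing `rng` as a keyword argument.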

Fields

CategoricalTabularPolicy

Represents a stochastic policy that samples an action from a categorical distribution with weights given by a ValuePolicy.

Constructor:

`CategoricalTabularPolicy(mdp::Union{POMDP,MDP}; rng=Random.GLOBAL_RNG)`
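Sampling from a categorical distribution given a row of weights can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `sample_categorical` is a hypothetical helper, the weights are assumed to be nonnegative, and the `value_table` layout (one row per state, one column per action) is assumed for the example.

```julia
using Random

# Return an index i with probability weights[i] / sum(weights).
# Assumes nonnegative weights with a positive sum.
function sample_categorical(rng::AbstractRNG, weights::AbstractVector{<:Real})
    total = sum(weights)
    total > 0 || throw(ArgumentError("weights must have a positive sum"))
    u = rand(rng) * total            # uniform draw in [0, total)
    cumulative = zero(total)
    for i in eachindex(weights)
        cumulative += weights[i]
        u < cumulative && return i   # first index whose cumulative weight exceeds u
    end
    return lastindex(weights)        # guard against floating-point round-off
end

# Hypothetical weights: rows are states, columns are actions.
actions = [:left, :right, :stay]
value_table = [0.0 1.0 3.0;
               2.0 2.0 0.0]

rng = MersenneTwister(42)
state_index = 2
a = actions[sample_categorical(rng, value_table[state_index, :])]
```

In state 2 the weight of `:stay` is zero, so this sketch only ever returns `:left` or `:right` for that state, each with probability one half.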

Fields

EpsGreedyPolicy

Represents an epsilon-greedy policy: with probability eps it samples a random action, and otherwise it samples from a given stochastic policy.

Constructor:

`EpsGreedyPolicy(mdp::Union{MDP,POMDP}, eps::Float64; rng=Random.GLOBAL_RNG)`
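The selection rule can be sketched in a few lines of standard-library Julia. The helper `eps_greedy_index` and the value vector `q` are hypothetical, and the non-random branch here is assumed to be the greedy choice (the action with the largest estimated value), which is the usual meaning of epsilon-greedy; the package's own behavior may differ in detail.

```julia
using Random

# With probability eps, explore by picking a uniformly random action index;
# otherwise exploit by picking the index with the largest value.
function eps_greedy_index(rng::AbstractRNG, values::AbstractVector{<:Real}, eps::Float64)
    0.0 <= eps <= 1.0 || throw(ArgumentError("eps must lie in [0, 1]"))
    if rand(rng) < eps
        return rand(rng, eachindex(values))   # explore
    else
        return argmax(values)                 # exploit (assumed greedy branch)
    end
end

# Hypothetical action values for a single state.
rng = MersenneTwister(7)
q = [0.1, 0.9, 0.3]

greedy_choice = eps_greedy_index(rng, q, 0.0)   # eps = 0 never explores
mixed_choices = [eps_greedy_index(rng, q, 0.2) for _ in 1:10]
```

Setting `eps = 0.0` reduces the rule to a purely greedy policy, while `eps = 1.0` reduces it to a uniform random policy; intermediate values trade exploration against exploitation.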