Alpha Vector Policy

Represents a policy with a set of alpha vectors (See AlphaVectorPolicy constructor docstring). In addition to finding the optimal action with action, the alpha vectors can be accessed with alphavectors or alphapairs.

Both the estimated value and the optimal action are computed from dot products between the alpha vectors and a belief vector. POMDPPolicies.beliefvec(pomdp, n_states, b) creates this vector; new belief types can implement their own method for efficiency.
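As a rough illustration of the dot-product computation described above, the following sketch uses only Julia's standard library and does not call POMDPPolicies. The two-state problem, the alpha vector values, and the action symbols are all made up for the example, and the belief is assumed to already be a dense probability vector.

```julia
using LinearAlgebra

# Hypothetical two-state example; all numbers are made up for illustration.
alphas = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]
action_map = [:left, :right, :listen]

# Dense belief vector: probability of each state, in state-index order.
b = [0.5, 0.5]

# Dot product of the belief with each alpha vector.
scores = [dot(alpha, b) for alpha in alphas]

# The estimated value is the largest dot product; the action paired with
# that alpha vector is the one a policy of this kind would select.
best = argmax(scores)
estimated_value = scores[best]
best_action = action_map[best]
```

Here the third alpha vector gives the largest dot product, so the sketch selects the action paired with it.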

POMDPPolicies.AlphaVectorPolicy - Type
AlphaVectorPolicy(pomdp::POMDP, alphas, action_map)

Construct a policy from alpha vectors.

Arguments

  • alphas: an |S| x (number of alpha vecs) matrix or a vector of alpha vectors.

  • action_map: a vector of the actions corresponding to each alpha vector
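The two accepted layouts for alphas carry the same data. The sketch below shows one way to go from the matrix form to a vector of vectors using plain Julia; it is only an illustration and is not necessarily how the constructor converts its input. The values and action symbols are placeholders.

```julia
# Hypothetical 2-state problem with 3 alpha vectors, stored as a
# |S| x (number of alpha vectors) matrix: one column per alpha vector.
alpha_matrix = [1.0 0.0 0.6;
                0.0 1.0 0.6]

# The same data as a vector of alpha vectors (one per column).
alpha_list = [alpha_matrix[:, i] for i in 1:size(alpha_matrix, 2)]

# One action per alpha vector, in the same order (placeholder symbols).
action_map = [:left, :right, :listen]

# Each alpha vector needs exactly one corresponding action.
@assert length(alpha_list) == length(action_map)
```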

AlphaVectorPolicy{P<:POMDP, A}

Represents a policy with a set of alpha vectors.

Use action to get the best action for a belief, and alphavectors and alphapairs to access the alpha vectors.

Fields

  • pomdp::P the POMDP problem
  • n_states::Int the number of states in the POMDP
  • alphas::Vector{Vector{Float64}} the list of alpha vectors
  • action_map::Vector{A} a list of actions corresponding to the alpha vectors
POMDPPolicies.beliefvec - Function
POMDPPolicies.beliefvec(m::POMDP, n_states::Int, b)

Return a vector-like representation of the belief b suitable for calculating the dot product with the alpha vectors.
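To make "vector-like representation" concrete, here is a minimal sketch of the kind of conversion such a function might perform. The helper dense_belief is hypothetical and not part of POMDPPolicies, and the sketch assumes the belief is given as a mapping from state index (1 to n_states) to probability.

```julia
# Hypothetical helper, not part of POMDPPolicies: expand a sparse belief,
# given as state index => probability, into a dense vector of length n_states.
function dense_belief(n_states::Int, probs::Dict{Int,Float64})
    b = zeros(n_states)          # states not listed get probability zero
    for (s, p) in probs
        b[s] = p                 # assumes 1 <= s <= n_states
    end
    return b
end

# Example: 4 states, with all probability mass on states 2 and 4.
b = dense_belief(4, Dict(2 => 0.25, 4 => 0.75))
```

The resulting vector has one entry per state, so it can be used directly in a dot product with an alpha vector of the same length.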
