Alpha Vector Policy

Represents a policy with a set of alpha vectors (see the AlphaVectorPolicy constructor docstring). In addition to finding the optimal action with the action function, the alpha vectors can be accessed with alphavectors or alphapairs.

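Both the estimated value and the optimal action are computed from dot products between the alpha vectors and a belief vector: the value estimate is the largest such dot product, and the optimal action is the one mapped to the alpha vector that attains it. POMDPPolicies.beliefvec(pomdp, n_states, b) is used to create the belief vector and can be overridden for new belief types for efficiency.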
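The selection rule can be sketched in plain Julia: take the dot product of every alpha vector with the belief vector, keep the largest as the value estimate, and look up the action paired with that alpha vector. This is a minimal illustration, not the package's implementation. The helper name best_alpha, the example alpha vectors, and the action symbols are all made up for this example, and the belief is assumed to already be a dense probability vector.

```julia
using LinearAlgebra: dot

# Hypothetical helper (not part of POMDPPolicies): return the index and value
# of the alpha vector with the largest dot product against belief vector b.
function best_alpha(alphas::Vector{Vector{Float64}}, b::Vector{Float64})
    vals = [dot(alpha, b) for alpha in alphas]
    i = argmax(vals)
    return i, vals[i]
end

# Made-up two-state example with one alpha vector per action.
alphas = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]
action_map = [:left, :right, :listen]

b = [0.5, 0.5]                 # uniform belief over the two states
i, v = best_alpha(alphas, b)
println("estimated value: ", v)
println("chosen action: ", action_map[i])
```

With the uniform belief, the third alpha vector has the largest dot product, so the sketch selects the action stored at the same index of action_map.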

POMDPPolicies.AlphaVectorPolicy — Type

AlphaVectorPolicy(pomdp::POMDP, alphas, action_map)

Construct a policy from alpha vectors.

Arguments

alphas: an |S| x (number of alpha vectors) matrix or a vector of alpha vectors.
action_map: a vector of the actions corresponding to each alpha vector.

AlphaVectorPolicy{P<:POMDP, A}

Represents a policy with a set of alpha vectors.

Use action to get the best action for a belief, and alphavectors and alphapairs to access the alpha vectors.

Fields

pomdp::P the POMDP problem
n_states::Int the number of states in the POMDP
alphas::Vector{Vector{Float64}} the list of alpha vectors
action_map::Vector{A} a list of actions corresponding to the alpha vectors

POMDPPolicies.alphavectors — Function

Return the alpha vectors.

POMDPPolicies.alphapairs — Function

Return an iterator of alpha vector-action pairs in the policy.

POMDPPolicies.beliefvec — Function

POMDPPolicies.beliefvec(m::POMDP, n_states::Int, b)

Return a vector-like representation of the belief b suitable for calculating the dot product with the alpha vectors.
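To illustrate what a vector-like representation involves, the sketch below turns a sparse belief into a dense vector of length n_states, indexed by state, so that it can be dotted with an alpha vector. The function dense_belief and the Dict-based belief are hypothetical stand-ins chosen for this example. They are not the package's API, and real belief types may need a different conversion.

```julia
# Hypothetical stand-in for a belief: maps a state index to its probability.
# States missing from the Dict are assumed to have probability zero.
function dense_belief(n_states::Int, b::Dict{Int,Float64})
    v = zeros(n_states)
    for (s, p) in b
        v[s] = p
    end
    return v
end

b = Dict(1 => 0.25, 3 => 0.75)   # belief with support on states 1 and 3
bv = dense_belief(4, b)          # dense vector over all 4 states
println(bv)
```

A belief type that already stores a dense probability vector could presumably return that vector directly, which is the kind of shortcut an overridden method can provide.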
