Alpha Vector Policy
Represents a policy with a set of alpha vectors (see the AlphaVectorPolicy constructor docstring). Besides finding the optimal action with action, you can access the alpha vectors with alphavectors or alphapairs.
Determining the estimated value and optimal action depends on calculating the dot product between alpha vectors and a belief vector. POMDPPolicies.beliefvec(pomdp, b) is used to create this vector and can be overridden for new belief types for efficiency.
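As a rough illustration of the computation described above, the following self-contained Julia sketch evaluates a belief against a set of alpha vectors without using POMDPPolicies itself. The helper best_alpha_index, the variable names, and the numbers are all made up for illustration; the library's own action function may be implemented differently.

```julia
using LinearAlgebra  # provides dot

# Hypothetical data: 2 states, 3 alpha vectors, one action label per alpha vector.
alphas = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]
action_map = [:left, :right, :listen]

# Index of the alpha vector whose dot product with the belief b is largest.
best_alpha_index(alphas, b) = argmax([dot(alpha, b) for alpha in alphas])

b = [0.5, 0.5]                       # belief vector over the 2 states
i = best_alpha_index(alphas, b)
estimated_value = dot(alphas[i], b)  # value estimate at b
best_action = action_map[i]          # action attached to the maximizing alpha vector
```

With this uniform belief, the third alpha vector gives the largest dot product, so the sketch selects the action paired with it.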
POMDPPolicies.AlphaVectorPolicy — Type
AlphaVectorPolicy(pomdp::POMDP, alphas, action_map)
Construct a policy from alpha vectors.
Arguments
alphas: an |S| x (number of alpha vecs) matrix or a vector of alpha vectors.
action_map: a vector of the actions corresponding to each alpha vector.
AlphaVectorPolicy{P<:POMDP, A}
Represents a policy with a set of alpha vectors.
Use action to get the best action for a belief, and alphavectors and alphapairs to access the alpha vectors.
Fields
pomdp::P: the POMDP problem
n_states::Int: the number of states in the POMDP
alphas::Vector{Vector{Float64}}: the list of alpha vectors
action_map::Vector{A}: a list of actions corresponding to the alpha vectors
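The constructor accepts alphas as an |S| x (number of alpha vecs) matrix, while the alphas field holds a Vector{Vector{Float64}}. The plain-Julia sketch below shows one way such a matrix could be split into a vector of column vectors. It is an assumption about the shape of the conversion, not a copy of the library's code, and the variable names and values are hypothetical.

```julia
# Hypothetical |S| x (number of alpha vecs) matrix: 2 states, 3 alpha vectors.
alpha_matrix = [1.0 0.0 0.6;
                0.0 1.0 0.6]

# Each column is one alpha vector; copy the columns into a Vector{Vector{Float64}}.
alpha_list = [alpha_matrix[:, j] for j in 1:size(alpha_matrix, 2)]
```

Each element of alpha_list then has length |S|, matching the layout described for the alphas field.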
POMDPPolicies.alphavectors — Function
Return the alpha vectors.
POMDPPolicies.alphapairs — Function
Return an iterator of alpha vector-action pairs in the policy.
POMDPPolicies.beliefvec — Function
POMDPPolicies.beliefvec(m::POMDP, n_states::Int, b)
Return a vector-like representation of the belief b suitable for calculating the dot product with the alpha vectors.
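To make the idea of a vector-like belief concrete, here is a small sketch in plain Julia that turns a sparse belief into a dense probability vector. The Dict-based belief, the state_order list, and the dense_belief helper are hypothetical stand-ins, not part of POMDPPolicies. A real override of beliefvec would have to follow the library's own belief and state-indexing conventions.

```julia
# Hypothetical belief: a sparse mapping from state labels to probabilities.
belief = Dict(:s2 => 0.75, :s3 => 0.25)

# Hypothetical state ordering; a state's position in this list is its index.
state_order = [:s1, :s2, :s3]

# Build a dense vector with one probability per state,
# using zero for states that have no entry in the belief.
function dense_belief(state_order, belief)
    return [get(belief, s, 0.0) for s in state_order]
end

bvec = dense_belief(state_order, belief)
```

The resulting bvec has one entry per state, in the same order as the entries of each alpha vector, so a dot product between the two is well defined.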