Alpha Vector Policy
Represents a policy with a set of alpha vectors (see the `AlphaVectorPolicy` constructor docstring). In addition to finding the optimal action with `action`, the alpha vectors can be accessed with `alphavectors` or `alphapairs`.
Determining the estimated value and optimal action depends on calculating the dot product between the alpha vectors and a belief vector. `POMDPPolicies.beliefvec(pomdp, n_states, b)` is used to create this vector and can be overridden for new belief types for efficiency.
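The dot-product computation described above can be sketched in plain Julia. This is a minimal illustration under stated assumptions, not the package's implementation: the two-state problem, the alpha vectors, and the action symbols `:left`, `:right`, and `:wait` are all hypothetical.

```julia
using LinearAlgebra  # provides dot

# Hypothetical two-state problem: each alpha vector has one entry per state.
alphas = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]
action_map = [:left, :right, :wait]  # action paired with each alpha vector

# Belief expressed as a probability vector over the two states.
b = [0.5, 0.5]

# The value of each alpha vector at this belief is its dot product with b.
vals = [dot(alpha, b) for alpha in alphas]

# The estimated value is the largest dot product, and the best action is
# the one paired with the maximizing alpha vector.
best_value, best_index = findmax(vals)
best_action = action_map[best_index]
```

With a uniform belief the flat alpha vector wins, so the sketch selects `:wait`; shifting the belief toward either state would favor `:left` or `:right` instead.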
POMDPPolicies.AlphaVectorPolicy - Type

AlphaVectorPolicy(pomdp::POMDP, alphas, action_map)
Construct a policy from alpha vectors.
Arguments
`alphas`: an |S| x (number of alpha vecs) matrix or a vector of alpha vectors.
`action_map`: a vector of the actions corresponding to each alpha vector.

AlphaVectorPolicy{P<:POMDP, A}
Represents a policy with a set of alpha vectors.
Use `action` to get the best action for a belief, and `alphavectors` and `alphapairs` to access the alpha vectors.
Fields
`pomdp::P`: the POMDP problem
`n_states::Int`: the number of states in the POMDP
`alphas::Vector{Vector{Float64}}`: the list of alpha vectors
`action_map::Vector{A}`: a list of actions corresponding to the alpha vectors
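The constructor accepts the alpha vectors either as an |S| x (number of alpha vecs) matrix or as a vector of vectors, while the `alphas` field is a `Vector{Vector{Float64}}`. The sketch below shows one plausible way the matrix form maps onto the stored form, with each column treated as one alpha vector. It is plain Julia with made-up numbers, and it is not taken from the package source.

```julia
# Hypothetical alpha vectors for a two-state problem, one per column
# (rows index states, columns index alpha vectors).
alpha_matrix = [1.0 0.0 0.6;
                0.0 1.0 0.6]
action_map = [:left, :right, :wait]  # hypothetical actions, one per column

# Split the matrix into a vector of column vectors.
alpha_list = [Vector{Float64}(col) for col in eachcol(alpha_matrix)]

# Each alpha vector needs exactly one associated action.
@assert length(alpha_list) == length(action_map)

# With a POMDP object `pomdp` defined elsewhere (assumed, not shown), either
# form would be passed using the documented signature:
# policy = AlphaVectorPolicy(pomdp, alpha_matrix, action_map)
```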
POMDPPolicies.alphavectors - Function
Return the alpha vectors.
POMDPPolicies.alphapairs - Function
Return an iterator of alpha vector-action pairs in the policy.
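An iterator of alpha vector-action pairs is convenient for questions such as "which alpha vectors recommend a given action?". The sketch below uses a plain generator over `zip` as a stand-in for such an iterator. The exact element type returned by `alphapairs` is not stated here, so the `alpha => action` pairs, the vectors, and the action symbols are assumptions for illustration.

```julia
# Hypothetical alpha vectors and their associated actions.
alphas = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6], [0.9, 0.1]]
action_map = [:left, :right, :wait, :left]

# Stand-in for an iterator of alpha vector-action pairs.
pairs_iter = (alpha => a for (alpha, a) in zip(alphas, action_map))

# Collect every alpha vector whose associated action is :left.
left_alphas = [alpha for (alpha, a) in pairs_iter if a == :left]
```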
POMDPPolicies.beliefvec - Function
POMDPPolicies.beliefvec(m::POMDP, n_states::Int, b)
Return a vector-like representation of the belief `b` suitable for calculating the dot product with the alpha vectors.
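As a rough illustration of what such a conversion involves, the sketch below turns a sparse belief, stored as a dictionary from states to probabilities, into a dense vector ordered by state index. The helper `dense_belief`, the state symbols, and the dictionary representation are hypothetical. For a custom belief type in the package, one would presumably add a method to `POMDPPolicies.beliefvec` that does the equivalent work.

```julia
# Hypothetical state ordering; position in this list plays the role of the state index.
state_order = [:s1, :s2, :s3]

# Hypothetical sparse belief: states that are absent have probability zero.
sparse_belief = Dict(:s1 => 0.25, :s3 => 0.75)

# Build a dense probability vector with one entry per state, in index order.
function dense_belief(state_order, belief)
    bvec = zeros(length(state_order))
    for (i, s) in enumerate(state_order)
        bvec[i] = get(belief, s, 0.0)
    end
    return bvec
end

bvec = dense_belief(state_order, sparse_belief)
```

The resulting vector has the same length and ordering as each alpha vector, which is what the dot product requires.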