API

Exported Types

MOMDPs.MOMDP - Type
MOMDP{X, Y, A, O} <: POMDP{Tuple{X,Y},A,O}

Abstract base type for a mixed observable Markov decision process.

X: visible state type
Y: hidden state type
A: action type
O: observation type

Notation follows Ong, Sylvie C. W., et al. "POMDPs for robotic tasks with mixed observability." Robotics: Science and Systems. Vol. 5. No. 4. 2009.
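
As a rough, hypothetical sketch (not part of the package), a problem with an integer visible state, a symbolic hidden state, symbolic actions, and Boolean observations could declare its type like this; MyMOMDP and all of its details in the sketches below are illustrative only:

using POMDPs, POMDPTools, MOMDPs

struct MyMOMDP <: MOMDP{Int, Symbol, Symbol, Bool} end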

source
MOMDPs.POMDP_of_Discrete_MOMDP - Type
POMDP_of_Discrete_MOMDP{X,Y,A,O} <: POMDP{Tuple{X,Y},A,O}

This type is used to represent a discrete MOMDP as a standard POMDP. The POMDP interface functions defined for this type build on the functions defined for the MOMDP type.

The only difference in the spaces is the observation space. If the original observation space has size $|\mathcal{O}|$ and the visible state space has size $|\mathcal{X}|$, then the observation space of the POMDP has size $|\mathcal{X}| \times |\mathcal{O}|$.

source
MOMDPs.MOMDPAlphaVectorPolicy - Type
MOMDPAlphaVectorPolicy(momdp::MOMDP, alpha_vecs, action_map, vis_state_map)

Construct a policy from alpha vectors for a Mixed Observability Markov Decision Process (MOMDP).

Arguments

  • momdp::MOMDP: The MOMDP problem instance for which the policy is constructed.
  • alpha_vecs: An abstract vector of alpha vectors, where each alpha vector is a vector of floats representing the value function for a particular belief state.
  • action_map: A vector of actions corresponding to each alpha vector. Each action is associated with the alpha vector that prescribes it.
  • vis_state_map: A vector mapping visible states to their corresponding alpha vectors, used to determine which alpha vector applies to a given state.

Fields

  • momdp::MOMDP: The MOMDP problem instance, necessary for mapping states to locations in the alpha vectors.
  • n_states_x::Int: The number of visible states in the MOMDP.
  • n_states_y::Int: The number of hidden states in the MOMDP.
  • alphas::Vector{Vector{Vector{Float64}}}: A vector (of size $|\mathcal{X}|$) of vectors (one per alpha vector) of alpha vectors (each of size $|\mathcal{Y}|$). This structure holds the alpha vectors for each visible state.
  • action_map::Vector{Vector{A}}: A vector (of size $|\mathcal{X}|$) of vectors of actions corresponding to the alpha vectors. Each action is associated with a specific alpha vector for a visible state.

This structure represents a policy that uses alpha vectors to determine the best action to take given a belief state in a MOMDP.
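
A toy construction sketch, under the assumption that each alpha vector spans the hidden-state space and that vis_state_map[i] gives the visible-state index that alpha_vecs[i] belongs to (all values here are illustrative):

alpha_vecs = [[0.0, 1.0], [1.0, 0.0]]    # each alpha vector has length |Y|
action_map = [:listen, :open]            # action prescribed by each alpha vector
vis_state_map = [1, 1]                   # both alpha vectors belong to visible state 1
policy = MOMDPAlphaVectorPolicy(momdp, alpha_vecs, action_map, vis_state_map)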

source
MOMDPs.MOMDPDiscreteUpdater - Type
MOMDPDiscreteUpdater

An updater type for maintaining and updating discrete beliefs over hidden states in a Mixed Observability Markov Decision Process (MOMDP).

Constructor

MOMDPDiscreteUpdater(momdp::MOMDP)

Create a discrete belief updater for the given MOMDP.

Fields

  • momdp <: MOMDP: The MOMDP problem instance for which beliefs will be updated

Description

In a MOMDP, the state space is factored into visible states x (fully observable) and hidden states y (partially observable). This updater maintains beliefs only over the hidden states, since the visible states are assumed to be directly observed.

The updater implements the discrete Bayesian filter for belief updates, assuming:

  • Finite, discrete hidden state spaces
  • Known visible state transitions (x → x')
  • Probabilistic hidden state transitions that may depend on visible states
  • Observations that depend on both visible and hidden states

Usage

momdp = YourMOMDPProblem()
updater = MOMDPDiscreteUpdater(momdp)

# Initialize belief over hidden states, given the observed initial visible state x
initial_dist = initialstate_y(momdp, x)
belief = initialize_belief(updater, initial_dist)

# Update belief after taking an action and receiving an observation, with visible transition x -> xp
new_belief = update(updater, belief, action, observation, x, xp)

See Also

  • uniform_belief_y: Create uniform beliefs over hidden states
  • initialize_belief: Initialize beliefs from distributions
  • update: Perform belief updates using the MOMDP filter
source

Exported Functions

MOMDPs.transition_x - Function
transition_x(m::MOMDP{X,Y,A,O}, state::Tuple{X,Y}, action)

Return the transition distribution over the next visible state given the current state and action.

T_x(s, a, x′) = p(x′ | s, a) where s = (x,y)
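
For the hypothetical MyMOMDP above, a sketch in which the visible state advances deterministically and independently of the hidden state (Deterministic comes from POMDPTools):

function MOMDPs.transition_x(m::MyMOMDP, s::Tuple{Int,Symbol}, a)
    x, y = s
    return Deterministic(min(x + 1, 3))  # advance the visible state, capped at 3
end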

source
MOMDPs.transition_y - Function
transition_y(m::MOMDP{X,Y,A,O}, state::Tuple{X,Y}, action, statep_visible)

Return the transition distribution over the next hidden state given the current state, action, and next visible state.

T_y(s, a, x′, y′) = p(y′ | s, a, x′) where s = (x,y)
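
Continuing the hypothetical MyMOMDP, a sketch in which the hidden state persists with probability 0.9 regardless of the next visible state x′ (SparseCat comes from POMDPTools):

function MOMDPs.transition_y(m::MyMOMDP, s::Tuple{Int,Symbol}, a, xp::Int)
    x, y = s
    other = y == :good ? :bad : :good
    return SparseCat([y, other], [0.9, 0.1])  # stay in the current hidden state w.p. 0.9
end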

source
MOMDPs.states_x - Function
states_x(problem::MOMDP)

Returns the complete visible state space of a MOMDP.

source
MOMDPs.states_y - Function
states_y(problem::MOMDP)

Returns the complete hidden state space of a MOMDP.
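
For the hypothetical MyMOMDP, the two factored spaces could be sketched as:

MOMDPs.states_x(m::MyMOMDP) = 1:3
MOMDPs.states_y(m::MyMOMDP) = [:good, :bad]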

source
MOMDPs.stateindex_x - Function
stateindex_x(problem::MOMDP, s)

Return the integer index of the visible state x where s is a tuple of the form (x,y). Used for discrete models only.

source
MOMDPs.stateindex_y - Function
stateindex_y(problem::MOMDP, s)

Return the integer index of the hidden state y where s is a tuple of the form (x,y). Used for discrete models only.
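
A sketch for the hypothetical MyMOMDP, with indices consistent with the orderings of states_x and states_y above:

MOMDPs.stateindex_x(m::MyMOMDP, s::Tuple{Int,Symbol}) = s[1]
MOMDPs.stateindex_y(m::MyMOMDP, s::Tuple{Int,Symbol}) = s[2] == :good ? 1 : 2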

source
MOMDPs.initialstate_y - Function
initialstate_y(problem::MOMDP, x)

Return the initial hidden state distribution conditioned on the visible state x.
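
For the hypothetical MyMOMDP, a sketch in which the initial hidden-state distribution is the same for every visible state:

MOMDPs.initialstate_y(m::MyMOMDP, x::Int) = SparseCat([:good, :bad], [0.9, 0.1])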

source
MOMDPs.statetype_x - Function
statetype_x(t::Type)
statetype_x(p::MOMDP)

Return the visible state type for a MOMDP (the X in MOMDP{X,Y,A,O}).

source
MOMDPs.statetype_y - Function
statetype_y(t::Type)
statetype_y(p::MOMDP)

Return the hidden state type for a MOMDP (the Y in MOMDP{X,Y,A,O}).
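
For the hypothetical MyMOMDP declared earlier, these would be expected to return:

statetype_x(MyMOMDP)   # Int
statetype_y(MyMOMDP)   # Symbol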

source
MOMDPs.ordered_states_x - Function
ordered_states_x(momdp)

Return an AbstractVector of the visible states in a MOMDP ordered according to stateindex_x(momdp, s).

ordered_states_x(momdp) will always return an AbstractVector{X} v containing all of the visible states in states_x(momdp) in the order such that stateindex_x(momdp, v[i]) == i. You may wish to override this for your problem for efficiency.

source
MOMDPs.ordered_states_y - Function
ordered_states_y(momdp)

Return an AbstractVector of the hidden states in a MOMDP ordered according to stateindex_y(momdp, s).

ordered_states_y(momdp) will always return an AbstractVector{Y} v containing all of the hidden states in states_y(momdp) in the order such that stateindex_y(momdp, v[i]) == i. You may wish to override this for your problem for efficiency.

source
MOMDPs.is_y_prime_dependent_on_x_prime - Function
is_y_prime_dependent_on_x_prime(m::MOMDP)

Defines whether the next hidden state y′ depends on the next visible state x′, given the current visible state x, hidden state y, and action a.

Returns false if the conditional probability distribution satisfies: p(y′ | x, y, a, x′) = p(y′ | x, y, a).

source
MOMDPs.is_x_prime_dependent_on_y - Function
is_x_prime_dependent_on_y(m::MOMDP)

Defines whether the next visible state x′ depends on the current hidden state y, given the current visible state x and action a.

Returns false if the conditional probability distribution satisfies: p(x′ | x, y, a) = p(x′ | x, a).

source
MOMDPs.is_initial_distribution_independent - Function
is_initial_distribution_independent(m::MOMDP)

Defines whether the initial distributions of the visible state x and hidden state y are independent.

Returns true if the joint probability distribution satisfies: p(x, y) = p(x)p(y), meaning x and y are independent in the initial distribution.
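
For the hypothetical MyMOMDP, whose hidden-state transition ignores x′ and whose visible-state transition ignores y, the three independence flags could be declared as:

MOMDPs.is_y_prime_dependent_on_x_prime(::MyMOMDP) = false
MOMDPs.is_x_prime_dependent_on_y(::MyMOMDP) = false
MOMDPs.is_initial_distribution_independent(::MyMOMDP) = true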

source
MOMDPs.beliefvec_y - Function
beliefvec_y(m::MOMDP, n_states_y::Int, b)

Convert a belief distribution b over hidden states to a vector representation suitable for dot product operations with alpha vectors in a MOMDP.

Arguments

  • m::MOMDP: The MOMDP problem instance
  • n_states_y::Int: The number of hidden states in the MOMDP
  • b: The belief distribution over hidden states (supports various belief types)

Returns

  • A vector of length n_states_y where element i represents the probability of hidden state i

Supported Belief Types

  • SparseCat: Converts sparse categorical distribution to dense vector
  • AbstractVector: Returns the vector directly (with length assertion)
  • DiscreteBelief: Extracts the underlying probability vector
  • Deterministic: Creates a one-hot vector for the deterministic state

This function is used internally by alpha vector policies to convert belief distributions into the vector format required for computing dot products with alpha vectors.
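
A small usage sketch, assuming a problem with two hidden states where :good has hidden-state index 1:

b = SparseCat([:good, :bad], [0.7, 0.3])
bv = beliefvec_y(momdp, 2, b)   # [0.7, 0.3]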

source
MOMDPs.uniform_belief_y - Function
 uniform_belief_y(momdp)
 uniform_belief_y(up::MOMDPDiscreteUpdater)

Return a uniform DiscreteBelief over all hidden states in the MOMDP.

Arguments

  • momdp: A MOMDP problem instance, or
  • up::MOMDPDiscreteUpdater: A MOMDP discrete belief updater

Returns

  • A DiscreteBelief with equal probability 1/|Y| for each hidden state y ∈ Y

Description

This function creates a uniform prior belief over the hidden state space, which is often used as an initial belief when the true hidden state is unknown. In MOMDP problems, this represents maximum uncertainty about the hidden state while the visible state is assumed to be known.

The uniform belief is particularly useful for:

  • Initializing belief when no prior information is available
  • Baseline comparisons in experiments
  • Worst-case analysis of belief-dependent policies

Example

momdp = YourMOMDPProblem()
initial_belief = uniform_belief_y(momdp)
# or
updater = MOMDPDiscreteUpdater(momdp)
initial_belief = uniform_belief_y(updater)
source

Extended Functions

POMDPs.jl and POMDPTools.jl

POMDPs.states - Method
POMDPs.states(p::MOMDP)

Helper function to return the full state space for discrete MOMDPs. The states are Tuple{X,Y} where X is the visible state and Y is the hidden state.

source
POMDPs.stateindex - Method
POMDPs.stateindex(p::MOMDP{X,Y,A,O}, s::Tuple{X,Y}) where {X,Y,A,O}

Helper function to return the index of the Tuple{X,Y} state for discrete MOMDPs.

source
POMDPs.transition - Method
POMDPs.transition(p::MOMDP{X,Y,A,O}, s::Tuple{X,Y}, a::A) where {X,Y,A,O}

Helper function to return the full transition distribution for discrete MOMDPs. The states are Tuple{X,Y}. It uses transition_x and transition_y to construct the distribution.
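
A minimal sketch of the factorization p((x′, y′) | s, a) = p(x′ | s, a) · p(y′ | s, a, x′) that such a helper relies on; joint_transition is a hypothetical name and this is not necessarily the package's exact implementation:

function joint_transition(p::MOMDP, s, a)
    dist_x = transition_x(p, s, a)
    sps = Tuple{statetype_x(p), statetype_y(p)}[]
    probs = Float64[]
    for xp in states_x(p)
        px = pdf(dist_x, xp)
        px > 0 || continue
        dist_y = transition_y(p, s, a, xp)
        for yp in states_y(p)
            py = pdf(dist_y, yp)
            py > 0 || continue
            push!(sps, (xp, yp))
            push!(probs, px * py)
        end
    end
    return SparseCat(sps, probs)   # SparseCat from POMDPTools
end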

source
POMDPs.initialstate - Method
initialstate(p::MOMDP{X,Y,A,O}) where {X,Y,A,O}

Helper function to return the initial state distribution for discrete MOMDPs. The states are Tuple{X,Y}. It uses initialstate_x and initialstate_y to construct the distribution.

source
POMDPs.observations - Method
POMDPs.observations(p::POMDP_of_Discrete_MOMDP)

Returns the full observation space of a POMDP_of_Discrete_MOMDP. The observations are Tuple{X,O} where X is the visible state and O is the observation.

source
POMDPs.observation - Method
POMDPs.observation(p::POMDP_of_Discrete_MOMDP{X, Y, A, O}, a::A, s::Tuple{X, Y}) where {X, Y, A, O}

Returns the full observation distribution for a POMDP_of_Discrete_MOMDP. The observations are Tuple{X,O} where X is the visible state and O is the observation.

source
POMDPs.obsindex - Method
POMDPs.obsindex(p::POMDP_of_Discrete_MOMDP, o)

Returns the index of the Tuple{X,O} observation for a POMDP_of_Discrete_MOMDP.

source
POMDPs.value - Method
value(p::MOMDPAlphaVectorPolicy, b)

Calculate the value of belief state b for a MOMDP using alpha vectors.

This function computes the value by:

  1. Marginalizing the belief over the partially observable state to get $b(x)$ for each fully observable state $x$
  2. Computing the conditional belief $b(y \mid x) = b(x,y)/b(x)$ for each $x$
  3. Finding the maximum dot product between $b(y \mid x)$ and the alpha vectors for each $x$
  4. Summing $b(x) \cdot V(x, b(y \mid x))$ over all $x$ to get the total value

Arguments

  • p::MOMDPAlphaVectorPolicy: The alpha vector policy
  • b: The belief state over the joint state space (x,y)

Returns

  • The value of the belief state

Notes

This is not the most efficient way to get a value if we are operating in a true MOMDP framework. However, this keeps the structure of the code similar to the POMDPs.jl framework.
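
A minimal sketch of the marginalize/condition/maximize steps above, assuming a discrete belief b that supports pdf(b, (x, y)); value_sketch is a hypothetical name, not the package's implementation:

using LinearAlgebra: dot

function value_sketch(p::MOMDPAlphaVectorPolicy, b)
    total = 0.0
    for (xi, x) in enumerate(ordered_states_x(p.momdp))
        by = [pdf(b, (x, y)) for y in ordered_states_y(p.momdp)]
        bx = sum(by)                 # marginal b(x)
        bx > 0 || continue
        by ./= bx                    # conditional b(y | x)
        total += bx * maximum(dot(alpha, by) for alpha in p.alphas[xi])
    end
    return total
end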

source
POMDPs.value - Method
value(p::MOMDPAlphaVectorPolicy, b, x)

Calculate the value for a specific visible state x given a belief b over the joint state space.

Arguments

  • p::MOMDPAlphaVectorPolicy: The alpha vector policy
  • b: The belief distribution over the joint state space (x,y)
  • x: The specific visible state to evaluate

Returns

  • The value for the given visible state and belief

Description

This function computes the value by:

  1. Extracting the marginal belief over hidden states y for the given visible state x by evaluating pdf(b, (x,y)) for all y
  2. Finding the maximum dot product between this marginal belief and all alpha vectors associated with visible state x

This is more efficient than the general value(p, b) function when the visible state is known, as it only needs to consider alpha vectors for the specific x rather than marginalizing over all visible states.

source
POMDPs.action - Method
action(p::MOMDPAlphaVectorPolicy, b)

Return the action prescribed by the MOMDP alpha-vector policy p for the belief b over the joint state space (x, y).

Heuristic

  1. Find the visible state x with the largest probability mass in b.
  2. Form the conditional distribution over y given that x.
  3. Among the alpha-vectors in p.alphas[x], pick the one with the largest dot product with that conditional distribution.
  4. Return the action associated with that alpha-vector.

Notes

  • When $b$ is not a pure distribution over a single $x$: Typically in a MOMDP, we assume $x$ (the "visible" state) is fully observed at runtime, so the belief over $(x, y)$ will place essentially all its probability mass on a single $x$. In that case, the above steps are effectively picking the single $x$ that we actually observed.

    If you are operating in a true MOMDP framework, you can implement a custom action function that takes in the visible state $x$ and the conditional distribution over $y$ given $x$. While this would result in the same action as the heuristic above, it will be more efficient. E.g.:

    # If ``x`` is known exactly at runtime and we have a distribution only over ``y``:
    function action(p::MOMDPAlphaVectorPolicy, x, by)
        x_idx = stateindex_x(p.momdp, x)
        # pick the alpha-vector among p.alphas[x_idx] that maximizes dot(alpha, by)
        best_idx = argmax([dot(alpha, by) for alpha in p.alphas[x_idx]])
        return p.action_map[x_idx][best_idx]
    end
  • In case your solver or simulator still gives a multi-modal distribution over different $x$ (which can happen in generic POMDP frameworks), the code here picks the $x$ that has the largest total probability mass in the belief. While this heuristic might be sufficient for some problems, we recommend implementing a custom action function that performs a one-step lookahead using the value function.

source
POMDPs.action - Method
action(p::MOMDPAlphaVectorPolicy, b, x)

Return the action prescribed by the MOMDP alpha-vector policy p for the belief b over the hidden states, given the known visible state x.

Arguments

  • p::MOMDPAlphaVectorPolicy: The alpha vector policy
  • b: The belief distribution over hidden states y
  • x: The known visible state

Returns

  • The action for the given visible state and belief over hidden states

Description

This function assumes the visible state x is fully observed and known. It:

  1. Converts the belief b to a vector representation over hidden states
  2. Among all alpha vectors associated with visible state x, finds the one with the maximum dot product with the belief vector
  3. Returns the action associated with that optimal alpha vector

This is more efficient than the version without explicit x when the visible state is known at runtime, as it avoids the need to infer x from the belief distribution.

source
POMDPTools.Policies.actionvalues - Method
actionvalues(p::MOMDPAlphaVectorPolicy, b, x)

Compute the action values (Q-values) for all actions given a belief b over hidden states and a known visible state x.

Arguments

  • p::MOMDPAlphaVectorPolicy: The alpha vector policy
  • b: The belief distribution over hidden states y
  • x: The known visible state

Returns

  • A vector of action values where the ith element is the Q-value for action i

Description

This function performs a one-step lookahead to compute action values by:

  1. For each action a and each hidden state y in the belief support:
    • Computing the immediate reward R(x,y,a)
    • For each possible next visible state x' and hidden state y':
      • For each possible observation o:
        • Updating the belief to get b'
        • Computing the value using the alpha vectors for x'
        • Accumulating the discounted expected future value
  2. Summing over all transitions weighted by their probabilities

The resulting Q-values can be used for action selection or policy evaluation when the visible state is known.
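
A small usage sketch, assuming the returned vector is indexed by action index so that ordered_actions (from POMDPTools) recovers the corresponding action:

qvals = actionvalues(policy, by, x)              # by: belief over hidden states
best_action = ordered_actions(momdp)[argmax(qvals)]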

source
POMDPs.initialize_belief - Method
initialize_belief(bu::MOMDPDiscreteUpdater, dist::Any)

Initialize a discrete belief over hidden states from a given distribution.

Arguments

  • bu::MOMDPDiscreteUpdater: The MOMDP discrete belief updater
  • dist: A distribution over hidden states to initialize from (supports various distribution types)

Returns

  • A DiscreteBelief over the hidden state space with probabilities initialized from dist

Description

This function creates a discrete belief representation over the hidden states y suitable for use with MOMDP belief update operations. The conversion process:

  1. Creates a zero-initialized probability vector over all hidden states in the MOMDP
  2. For each state y in the support of the input distribution, extracts the probability pdf(dist, y) and assigns it to the corresponding index in the belief vector
  3. Returns a DiscreteBelief object that can be used with the MOMDP updater

Supported Distribution Types

The function can handle various distribution types through the generic pdf interface:

  • Discrete distributions (e.g., Categorical, DiscreteUniform)
  • Custom distributions that implement pdf and support
  • Sparse distributions with limited support

Usage Examples

updater = MOMDPDiscreteUpdater(momdp)

# From a uniform distribution
uniform_dist = DiscreteUniform(1, length(states_y(momdp)))
belief = initialize_belief(updater, uniform_dist)

# From a sparse categorical distribution  
sparse_dist = SparseCat([state1, state3], [0.7, 0.3])
belief = initialize_belief(updater, sparse_dist)

Implementation Notes

  • Uses stateindex_y to map hidden states to belief vector indices
  • Assumes the distribution is over individual hidden states, not joint (x,y) states
  • The resulting belief is properly normalized if the input distribution is normalized
source
POMDPs.update - Method
update(bu::MOMDPDiscreteUpdater, b::DiscreteBelief, a, o, x, xp)

Update a discrete belief over hidden states using the MOMDP belief update equation.

Arguments

  • bu::MOMDPDiscreteUpdater: The MOMDP discrete belief updater
  • b::DiscreteBelief: The current belief over hidden states y
  • a: The action taken
  • o: The observation received after taking action a
  • x: The previous visible state
  • xp: The current visible state

Returns

  • A new DiscreteBelief representing the updated belief over hidden states

Description

This function implements the discrete Bayesian filter for MOMDPs, which updates beliefs over hidden states given knowledge of visible state transitions.

The function iterates through all current hidden states with non-zero probability, computes transition probabilities to next hidden states, weights by observation probabilities, and normalizes the result.

Errors

Throws an error if the updated belief probabilities sum to zero, which indicates an impossible observation given the current belief and action.
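
A minimal sketch of the filter described above, assuming discrete hidden states and the standard POMDPs.jl observation(m, a, sp) interface with sp = (xp, yp); update_sketch is a hypothetical name, and it returns a plain probability vector rather than a DiscreteBelief:

function update_sketch(bu::MOMDPDiscreteUpdater, b, a, o, x, xp)
    m = bu.momdp
    ys = ordered_states_y(m)
    bp = zeros(length(ys))
    for (yi, y) in enumerate(ys)
        pdf(b, y) > 0 || continue
        tx = pdf(transition_x(m, (x, y), a), xp)          # p(x′ | x, y, a)
        tx > 0 || continue
        ty_dist = transition_y(m, (x, y), a, xp)          # p(y′ | x, y, a, x′)
        for (ypi, yp) in enumerate(ys)
            po = pdf(observation(m, a, (xp, yp)), o)      # p(o | a, x′, y′)
            bp[ypi] += po * pdf(ty_dist, yp) * tx * pdf(b, y)
        end
    end
    sum(bp) > 0 || error("Failed to update belief: observation is impossible under b")
    return bp ./ sum(bp)
end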

source
POMDPs.update - Method
update(bu::MOMDPDiscreteUpdater, b::Any, a, o, x, xp)

This is a convenience method that handles arbitrary belief types by first calling initialize_belief to convert them to a DiscreteBelief, then performing the standard discrete belief update.

source

Internal Functions

MOMDPs.alphapairs - Function

Return an iterator of alpha vector-action pairs in the policy, given a visible state.

source