API
Exported Types
MOMDPs.MOMDP — Type
MOMDP{X, Y, A, O} <: POMDP{Tuple{X,Y},A,O}
Abstract base type for a mixed observable Markov decision process.
- X: visible state type
- Y: hidden state type
- A: action type
- O: observation type
Notation follows Ong, Sylvie C. W., et al. "POMDPs for robotic tasks with mixed observability." Robotics: Science and Systems, Vol. 5, No. 4, 2009.
MOMDPs.POMDP_of_Discrete_MOMDP — Type
POMDP_of_Discrete_MOMDP{X,Y,A,O} <: POMDP{Tuple{X,Y},A,O}
This type is used to represent a discrete MOMDP as a standard POMDP. The POMDPs.jl interface functions for this type are implemented in terms of the functions defined for the MOMDP type.
The only difference in the spaces is the observation space. If the original observation space is $\mathcal{O}$ and the visible state space is $\mathcal{X}$, then the observation space of the POMDP is $\mathcal{X} \times \mathcal{O}$.
MOMDPs.MOMDPAlphaVectorPolicy — Type
MOMDPAlphaVectorPolicy(momdp::MOMDP, alpha_vecs, action_map, vis_state_map)
Construct a policy from alpha vectors for a Mixed Observability Markov Decision Process (MOMDP).
Arguments
- momdp::MOMDP: The MOMDP problem instance for which the policy is constructed.
- alpha_vecs: An abstract vector of alpha vectors, where each alpha vector is a vector of floats representing the value function for a particular belief state.
- action_map: A vector of actions corresponding to each alpha vector. Each action is associated with the alpha vector that prescribes it.
- vis_state_map: A vector mapping visible states to their corresponding alpha vectors, used to determine which alpha vector applies to a given state.
Fields
- momdp::MOMDP: The MOMDP problem instance, necessary for mapping states to locations in the alpha vectors.
- n_states_x::Int: The number of visible states in the MOMDP.
- n_states_y::Int: The number of hidden states in the MOMDP.
- alphas::Vector{Vector{Vector{Float64}}}: A vector (of size $|\mathcal{X}|$) of vectors (one per alpha vector for that visible state) of alpha vectors (each of length $|\mathcal{Y}|$). This structure holds the alpha vectors for each visible state.
- action_map::Vector{Vector{A}}: A vector (of size $|\mathcal{X}|$) of vectors of actions corresponding to the alpha vectors. Each action is associated with a specific alpha vector for a visible state.
This structure represents a policy that uses alpha vectors to determine the best action to take given a belief state in a MOMDP.
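For illustration, a minimal sketch of how this nested layout is accessed (policy, x, and by are hypothetical: a solved policy, a visible state, and a length-$|\mathcal{Y}|$ belief vector over the hidden states):
using LinearAlgebra: dot
x_idx = stateindex_x(policy.momdp, x)                 # index of the observed visible state
alphas_x = policy.alphas[x_idx]                       # alpha vectors stored for this visible state
k = argmax([dot(alpha, by) for alpha in alphas_x])    # alpha vector with the largest value for by
best_action = policy.action_map[x_idx][k]             # action prescribed by that alpha vector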
MOMDPs.MOMDPDiscreteUpdater — Type
MOMDPDiscreteUpdater
An updater type for maintaining and updating discrete beliefs over hidden states in a Mixed Observability Markov Decision Process (MOMDP).
Constructor
MOMDPDiscreteUpdater(momdp::MOMDP)
Create a discrete belief updater for the given MOMDP.
Fields
- momdp <: MOMDP: The MOMDP problem instance for which beliefs will be updated
Description
In a MOMDP, the state space is factored into visible states x (fully observable) and hidden states y (partially observable). This updater maintains beliefs only over the hidden states, since the visible states are assumed to be directly observed.
The updater implements the discrete Bayesian filter for belief updates, assuming:
- Finite, discrete hidden state spaces
- Known visible state transitions (x → x')
- Probabilistic hidden state transitions that may depend on visible states
- Observations that depend on both visible and hidden states
Usage
momdp = YourMOMDPProblem()
updater = MOMDPDiscreteUpdater(momdp)
# Initialize belief over hidden states (x is the known initial visible state)
initial_dist = initialstate_y(momdp, x)
# Update belief after taking action and receiving observation
new_belief = update(updater, current_belief, action, observation, x, xp)
See Also
- uniform_belief_y: Create uniform beliefs over hidden states
- initialize_belief: Initialize beliefs from distributions
- update: Perform belief updates using the MOMDP filter
Exported Functions
MOMDPs.transition_x — Function
transition_x(m::MOMDP{X,Y,A,O}, state::Tuple{X,Y}, action)
Return the transition distribution over the next visible state given the current state and action.
T_x(s, a, x′) = p(x′ | s, a) where s = (x,y)
MOMDPs.transition_y — Function
transition_y(m::MOMDP{X,Y,A,O}, state::Tuple{X,Y}, action, statep_visible)
Return the transition distribution over the next hidden state given the current state, action, and next visible state.
T_y(s, a, x′, y′) = p(y′ | s, a, x′) where s = (x,y)
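As a hedged example of supplying these dynamics, a minimal sketch for a hypothetical discrete problem (MyMOMDP, its state spaces, and the specific dynamics below are assumptions for illustration, not part of the package):
using MOMDPs, POMDPs
using POMDPTools: Deterministic, SparseCat
# Hypothetical problem: Int visible states, Bool hidden states, Symbol actions, Bool observations
struct MyMOMDP <: MOMDP{Int, Bool, Symbol, Bool} end
# Visible state moves deterministically with the action (assumed dynamics)
MOMDPs.transition_x(m::MyMOMDP, s::Tuple{Int,Bool}, a) =
    Deterministic(s[1] + (a == :right ? 1 : -1))
# Hidden state flips with small probability, independent of x′ here (assumed dynamics)
MOMDPs.transition_y(m::MyMOMDP, s::Tuple{Int,Bool}, a, xp) =
    SparseCat([s[2], !s[2]], [0.9, 0.1])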
MOMDPs.states_x — Function
states_x(problem::MOMDP)
Returns the complete visible state space of a MOMDP.
MOMDPs.states_y — Function
states_y(problem::MOMDP)
Returns the complete hidden state space of a MOMDP.
MOMDPs.stateindex_x — Function
stateindex_x(problem::MOMDP, s)
Return the integer index of the visible state x, where s is a tuple of the form (x, y). Used for discrete models only.
MOMDPs.stateindex_y — Function
stateindex_y(problem::MOMDP, s)
Return the integer index of the hidden state y, where s is a tuple of the form (x, y). Used for discrete models only.
MOMDPs.initialstate_x — Function
initialstate_x(problem::MOMDP)
Return the initial visible state distribution.
MOMDPs.initialstate_y — Function
initialstate_y(problem::MOMDP, x)
Return the initial hidden state distribution conditioned on the visible state x.
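Continuing the hypothetical MyMOMDP from the transition example above, a minimal sketch (the specific distributions are assumptions):
MOMDPs.initialstate_x(m::MyMOMDP) = Deterministic(0)                        # visible state starts at 0
MOMDPs.initialstate_y(m::MyMOMDP, x) = SparseCat([true, false], [0.5, 0.5]) # hidden state uniform given x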
MOMDPs.statetype_x — Function
statetype_x(t::Type)
statetype_x(p::MOMDP)
Return the visible state type for a MOMDP (the X in MOMDP{X,Y,A,O}).
MOMDPs.statetype_y — Function
statetype_y(t::Type)
statetype_y(p::MOMDP)
Return the hidden state type for a MOMDP (the Y in MOMDP{X,Y,A,O}).
MOMDPs.ordered_states_x — Function
ordered_states_x(momdp)
Return an AbstractVector of the visible states in a MOMDP ordered according to stateindex_x(momdp, s).
ordered_states_x(momdp) will always return an AbstractVector{X} v containing all of the visible states in states_x(momdp) in the order such that stateindex_x(momdp, v[i]) == i. You may wish to override this for your problem for efficiency.
MOMDPs.ordered_states_y — Function
ordered_states_y(momdp)
Return an AbstractVector of the hidden states in a MOMDP ordered according to stateindex_y(momdp, s).
ordered_states_y(momdp) will always return an AbstractVector{Y} v containing all of the hidden states in states_y(momdp) in the order such that stateindex_y(momdp, v[i]) == i. You may wish to override this for your problem for efficiency.
MOMDPs.is_y_prime_dependent_on_x_prime — Function
is_y_prime_dependent_on_x_prime(m::MOMDP)
Defines whether the next hidden state y′ depends on the next visible state x′, given the current visible state x, hidden state y, and action a.
Returns false if the conditional probability distribution satisfies p(y′ | x, y, a, x′) = p(y′ | x, y, a).
MOMDPs.is_x_prime_dependent_on_y — Function
is_x_prime_dependent_on_y(m::MOMDP)
Defines whether the next visible state x′ depends on the current hidden state y, given the current visible state x and action a.
Returns false if the conditional probability distribution satisfies p(x′ | x, y, a) = p(x′ | x, a).
MOMDPs.is_initial_distribution_independent — Function
is_initial_distribution_independent(m::MOMDP)
Defines whether the initial distributions of the visible state x and hidden state y are independent.
Returns true if the joint probability distribution satisfies p(x, y) = p(x)p(y), meaning x and y are independent in the initial distribution.
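For the hypothetical MyMOMDP above, these properties might be declared as follows (the chosen values are assumptions about its dynamics):
MOMDPs.is_y_prime_dependent_on_x_prime(m::MyMOMDP) = false   # p(y′ | x, y, a, x′) = p(y′ | x, y, a)
MOMDPs.is_x_prime_dependent_on_y(m::MyMOMDP) = false         # p(x′ | x, y, a) = p(x′ | x, a)
MOMDPs.is_initial_distribution_independent(m::MyMOMDP) = true # p(x, y) = p(x)p(y)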
MOMDPs.beliefvec_y — Function
beliefvec_y(m::MOMDP, n_states::Int, b)
Convert a belief distribution b over hidden states to a vector representation suitable for dot product operations with alpha vectors in a MOMDP.
Arguments
- m::MOMDP: The MOMDP problem instance
- n_states_y::Int: The number of hidden states in the MOMDP
- b: The belief distribution over hidden states (supports various belief types)
Returns
- A vector of length n_states_y where element i represents the probability of hidden state i
Supported Belief Types
- SparseCat: Converts sparse categorical distribution to dense vector
- AbstractVector: Returns the vector directly (with length assertion)
- DiscreteBelief: Extracts the underlying probability vector
- Deterministic: Creates a one-hot vector for the deterministic state
This function is used internally by alpha vector policies to convert belief distributions into the vector format required for computing dot products with alpha vectors.
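A minimal usage sketch (momdp is hypothetical and assumed to have three hidden states y1, y2, y3 with stateindex_y mapping them to indices 1, 2, 3):
using POMDPTools: SparseCat
b_sparse = SparseCat([y1, y3], [0.7, 0.3])
bvec = beliefvec_y(momdp, 3, b_sparse)   # dense vector [0.7, 0.0, 0.3] under the assumed indexing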
MOMDPs.uniform_belief_y — Function
uniform_belief_y(momdp)
uniform_belief_y(up::MOMDPDiscreteUpdater)
Return a uniform DiscreteBelief over all hidden states in the MOMDP.
Arguments
- momdp: A MOMDP problem instance, or
- up::MOMDPDiscreteUpdater: A MOMDP discrete belief updater
Returns
- A DiscreteBelief with equal probability 1/|Y| for each hidden state y ∈ Y
Description
This function creates a uniform prior belief over the hidden state space, which is often used as an initial belief when the true hidden state is unknown. In MOMDP problems, this represents maximum uncertainty about the hidden state while the visible state is assumed to be known.
The uniform belief is particularly useful for:
- Initializing belief when no prior information is available
- Baseline comparisons in experiments
- Worst-case analysis of belief-dependent policies
Example
momdp = YourMOMDPProblem()
initial_belief = uniform_belief_y(momdp)
# or
updater = MOMDPDiscreteUpdater(momdp)
initial_belief = uniform_belief_y(updater)
Extended Functions
POMDPs.jl and POMDPTools.jl
POMDPs.states — Method
POMDPs.states(p::MOMDP)
Helper function to return the full state space for discrete MOMDPs. The states are Tuple{X,Y} where X is the visible state and Y is the hidden state.
POMDPs.stateindex — Method
POMDPs.stateindex(p::MOMDP{X,Y,A,O}, s::Tuple{X,Y}) where {X,Y,A,O}
Helper function to return the index of the Tuple{X,Y} state for discrete MOMDPs.
POMDPs.transition — Method
POMDPs.transition(p::MOMDP{X,Y,A,O}, s::Tuple{X,Y}, a::A) where {X,Y,A,O}
Helper function to return the full transition distribution for discrete MOMDPs. The states are Tuple{X,Y}. It uses transition_x and transition_y to construct the distribution.
POMDPs.initialstate — Method
initialstate(p::MOMDP{X,Y,A,O}) where {X,Y,A,O}
Helper function to return the initial state distribution for discrete MOMDPs. The states are Tuple{X,Y}. It uses initialstate_x and initialstate_y to construct the distribution.
POMDPs.observations — Method
POMDPs.observations(p::POMDP_of_Discrete_MOMDP)
Returns the full observation space of a POMDP_of_Discrete_MOMDP. The observations are Tuple{X,O} where X is the visible state and O is the observation.
POMDPs.observation — Method
POMDPs.observation(p::POMDP_of_Discrete_MOMDP{X, Y, A, O}, a::A, s::Tuple{X, Y}) where {X, Y, A, O}
Returns the full observation distribution for a POMDP_of_Discrete_MOMDP. The observations are Tuple{X,O} where X is the visible state and O is the observation.
POMDPs.obsindex — Method
POMDPs.obsindex(p::POMDP_of_Discrete_MOMDP, o)
Returns the index of the Tuple{X,O} observation for a POMDP_of_Discrete_MOMDP.
POMDPs.value — Method
value(p::MOMDPAlphaVectorPolicy, b)
Calculate the value of belief state b for a MOMDP using alpha vectors.
This function computes the value by:
- Marginalizing the belief over the partially observable state to get $b(x)$ for each fully observable state $x$
- Computing the conditional belief $b(y \mid x) = b(x,y)/b(x)$ for each $x$
- Finding the maximum dot product between $b(y \mid x)$ and the alpha vectors for each $x$
- Summing $b(x) \cdot V(x, b(y \mid x))$ over all $x$ to get the total value
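In symbols, writing $\Gamma_x$ for the set of alpha vectors associated with visible state $x$, these steps compute $V(b) = \sum_{x} b(x) \max_{\alpha \in \Gamma_x} \sum_{y} \alpha(y)\, b(y \mid x)$.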
Arguments
- p::MOMDPAlphaVectorPolicy: The alpha vector policy
- b: The belief state over the joint state space (x,y)
Returns
- The value of the belief state
Notes
This is not the most efficient way to get a value if we are operating in a true MOMDP framework. However, this keeps the structure of the code similar to the POMDPs.jl framework.
POMDPs.value — Method
value(p::MOMDPAlphaVectorPolicy, b, x)
Calculate the value for a specific visible state x given a belief b over the joint state space.
Arguments
- p::MOMDPAlphaVectorPolicy: The alpha vector policy
- b: The belief distribution over the joint state space (x,y)
- x: The specific visible state to evaluate
Returns
- The value for the given visible state and belief
Description
This function computes the value by:
- Extracting the marginal belief over hidden states y for the given visible state x by evaluating pdf(b, (x,y)) for all y
- Finding the maximum dot product between this marginal belief and all alpha vectors associated with visible state x
This is more efficient than the general value(p, b) function when the visible state is known, as it only needs to consider alpha vectors for the specific x rather than marginalizing over all visible states.
POMDPs.action — Method
action(p::MOMDPAlphaVectorPolicy, b)
Return the action prescribed by the MOMDP alpha-vector policy p for the belief b over the joint state space (x, y).
Heuristic
- Find the visible state x with the largest probability mass in b.
- Form the conditional distribution over y given that x.
- Among the alpha-vectors in p.alphas[x], pick the one with the largest dot product with that conditional distribution.
- Return the action associated with that alpha-vector.
Notes
When $b$ is not a pure distribution over a single $x$: Typically in a MOMDP, we assume $x$ (the "visible" state) is fully observed at runtime, so the belief over $(x, y)$ will place essentially all its probability mass on a single $x$. In that case, the above steps are effectively picking the single $x$ that we actually observed.
If you are operating in a true MOMDP framework, you can implement a custom action function that takes in the visible state $x$ and the conditional distribution over $y$ given $x$. While this would result in the same action as the heuristic above, it will be more efficient. E.g.:
using LinearAlgebra: dot
# If x is known exactly at runtime and we have a distribution only over y:
function action(p::MOMDPAlphaVectorPolicy, x, by)
    x_idx = stateindex_x(p.momdp, x)
    # pick the alpha-vector among p.alphas[x_idx] that maximizes dot(alpha, by)
    k = argmax([dot(alpha, by) for alpha in p.alphas[x_idx]])
    return p.action_map[x_idx][k]
end
In case your solver or simulator still gives a multi-modal distribution over different $x$ (which can happen in generic POMDP frameworks), the code here picks the $x$ that has the largest total probability mass in the belief. While this heuristic might be sufficient for some problems, we recommend implementing a custom action function that performs a one-step lookahead using the value function.
POMDPs.action — Method
action(p::MOMDPAlphaVectorPolicy, b, x)
Return the action prescribed by the MOMDP alpha-vector policy p for the belief b over the hidden states, given the known visible state x.
Arguments
- p::MOMDPAlphaVectorPolicy: The alpha vector policy
- b: The belief distribution over hidden states y
- x: The known visible state
Returns
- The action for the given visible state and belief over hidden states
Description
This function assumes the visible state x is fully observed and known. It:
- Converts the belief b to a vector representation over hidden states
- Among all alpha vectors associated with visible state x, finds the one with the maximum dot product with the belief vector
- Returns the action associated with that optimal alpha vector
This is more efficient than the version without explicit x when the visible state is known at runtime, as it avoids the need to infer x from the belief distribution.
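A minimal usage sketch (policy and momdp are hypothetical, and the visible state x is assumed to be observed directly):
by = uniform_belief_y(momdp)   # belief over hidden states only
a = action(policy, by, x)      # action for the known visible state x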
POMDPTools.Policies.actionvalues — Method
actionvalues(p::MOMDPAlphaVectorPolicy, b, x)
Compute the action values (Q-values) for all actions given a belief b over hidden states and a known visible state x.
Arguments
- p::MOMDPAlphaVectorPolicy: The alpha vector policy
- b: The belief distribution over hidden states y
- x: The known visible state
Returns
- A vector of action values where the ith element is the Q-value for action i
Description
This function performs a one-step lookahead to compute action values by:
- For each action a and each hidden state y in the belief support:
  - Computing the immediate reward R(x,y,a)
  - For each possible next visible state x' and hidden state y':
    - For each possible observation o:
      - Updating the belief to get b'
      - Computing the value using the alpha vectors for x'
      - Accumulating the discounted expected future value
- Summing over all transitions weighted by their probabilities
The resulting Q-values can be used for action selection or policy evaluation when the visible state is known.
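A minimal sketch of using these Q-values for greedy action selection (policy, by, and x are hypothetical; ordered_actions from POMDPTools is assumed to match the Q-value indexing):
using POMDPTools: ordered_actions
qvals = actionvalues(policy, by, x)                     # Q-value for each action index
best_a = ordered_actions(policy.momdp)[argmax(qvals)]   # greedy action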
POMDPs.initialize_belief — Method
initialize_belief(bu::MOMDPDiscreteUpdater, dist::Any)
Initialize a discrete belief over hidden states from a given distribution.
Arguments
- bu::MOMDPDiscreteUpdater: The MOMDP discrete belief updater
- dist: A distribution over hidden states to initialize from (supports various distribution types)
Returns
- A DiscreteBelief over the hidden state space with probabilities initialized from dist
Description
This function creates a discrete belief representation over the hidden states y suitable for use with MOMDP belief update operations. The conversion process:
- Creates a zero-initialized probability vector over all hidden states in the MOMDP
- For each state y in the support of the input distribution, extracts the probability pdf(dist, y) and assigns it to the corresponding index in the belief vector
- Returns a DiscreteBelief object that can be used with the MOMDP updater
Supported Distribution Types
The function can handle various distribution types through the generic pdf interface:
- Discrete distributions (e.g., Categorical, DiscreteUniform)
- Custom distributions that implement pdf and support
- Sparse distributions with limited support
Usage Examples
updater = MOMDPDiscreteUpdater(momdp)
# From a uniform distribution
uniform_dist = DiscreteUniform(1, length(states_y(momdp)))
belief = initialize_belief(updater, uniform_dist)
# From a sparse categorical distribution
sparse_dist = SparseCat([state1, state3], [0.7, 0.3])
belief = initialize_belief(updater, sparse_dist)
Implementation Notes
- Uses stateindex_y to map hidden states to belief vector indices
- Assumes the distribution is over individual hidden states, not joint (x,y) states
- The resulting belief is properly normalized if the input distribution is normalized
POMDPs.update — Method
update(bu::MOMDPDiscreteUpdater, b::DiscreteBelief, a, o, x, xp)
Update a discrete belief over hidden states using the MOMDP belief update equation.
Arguments
- bu::MOMDPDiscreteUpdater: The MOMDP discrete belief updater
- b::DiscreteBelief: The current belief over hidden states y
- a: The action taken
- o: The observation received after taking action a
- x: The previous visible state
- xp: The current visible state
Returns
- A new DiscreteBelief representing the updated belief over hidden states
Description
This function implements the discrete Bayesian filter for MOMDPs, which updates beliefs over hidden states given knowledge of visible state transitions.
The function iterates through all current hidden states with non-zero probability, computes transition probabilities to next hidden states, weights by observation probabilities, and normalizes the result.
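For reference, the standard MOMDP belief update (in the notation of Ong et al., with $\eta$ a normalizing constant and $O$ the observation model) has the form $b'(y') = \eta \, O(o \mid x', y', a) \sum_{y \in \mathcal{Y}} T_y(y' \mid x, y, a, x') \, T_x(x' \mid x, y, a) \, b(y)$.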
Errors
Throws an error if the updated belief probabilities sum to zero, which indicates an impossible observation given the current belief and action.
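Putting the updater together with a policy, a hedged sketch of a rollout loop (momdp, policy, and the step_environment helper are hypothetical; the helper stands in for whatever simulator provides the next visible state and observation):
function rollout(momdp, policy; steps = 10)
    up = MOMDPDiscreteUpdater(momdp)
    x = rand(initialstate_x(momdp))                      # visible state is observed directly
    b = initialize_belief(up, initialstate_y(momdp, x))  # belief over hidden states given x
    for t in 1:steps
        a = action(policy, b, x)
        xp, o = step_environment(momdp, x, a)            # hypothetical environment step
        b = update(up, b, a, o, x, xp)
        x = xp
    end
    return b
end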
POMDPs.update — Method
update(bu::MOMDPDiscreteUpdater, b::Any, a, o, x, xp)
This is a convenience method that handles arbitrary belief types by first calling initialize_belief to convert them to a DiscreteBelief, then performing the standard discrete belief update.
Internal Functions
MOMDPs.alphapairs — Function
Return an iterator of alpha vector-action pairs in the policy, given a visible state.
MOMDPs.alphavectors — Function
Return the alpha vectors, given a visible state.