POMDPs.jl
A Julia interface for defining, solving and simulating partially observable Markov decision processes and their fully observable counterparts.
Package and Ecosystem Features
- General interface that can handle problems with discrete and continuous state/action/observation spaces
- A number of popular state-of-the-art solvers implemented for use out-of-the-box
- Tools that make it easy to define problems and simulate solutions
- Simple integration of custom solvers into the existing interface
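To illustrate the last point, a custom solver only needs to implement a small part of the interface. The sketch below is a minimal, hypothetical example (the names MyGreedySolver and MyGreedyPolicy are invented, and it assumes the problem implements reward(m, s, a) and actions(m, s)):

```julia
using POMDPs

# Hypothetical solver that plugs into the POMDPs.jl interface by
# subtyping Solver/Policy and implementing solve and action.
struct MyGreedySolver <: Solver end

struct MyGreedyPolicy{M} <: Policy
    m::M
end

# solve just wraps the problem; the "work" happens when actions are requested
POMDPs.solve(::MyGreedySolver, m::Union{MDP,POMDP}) = MyGreedyPolicy(m)

# greedily pick the action with the highest immediate reward from state s
# (assumes reward(m, s, a) and actions(m, s) are defined for the problem)
POMDPs.action(p::MyGreedyPolicy, s) = argmax(a -> reward(p.m, s, a), actions(p.m, s))
```

A policy produced this way can be used with the simulators and other tools described throughout the documentation.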
Available Packages
The POMDPs.jl package contains only the interface used for expressing and solving Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). The POMDPTools package acts as a "standard library" for the POMDPs.jl interface, providing implementations of commonly used components such as policies, belief updaters, distributions, and simulators. The list of solver and support packages maintained by the JuliaPOMDP community is available in the POMDPs.jl README.
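For orientation, the snippet below sketches how these pieces typically fit together. It assumes two additional JuliaPOMDP packages are installed: POMDPModels (which provides the classic TigerPOMDP example problem) and QMDP (which provides QMDPSolver); any other problem and solver packages could be substituted.

```julia
using POMDPs, POMDPTools, POMDPModels, QMDP

pomdp = TigerPOMDP()                   # example problem from POMDPModels
policy = solve(QMDPSolver(), pomdp)    # a solver package plugging into the POMDPs.jl interface
up = DiscreteUpdater(pomdp)            # belief updater from POMDPTools
sim = RolloutSimulator(max_steps=20)   # simulator from POMDPTools
r = simulate(sim, pomdp, policy, up)   # discounted reward of one simulated rollout
```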
Documentation Outline
The documentation is organized into three sections:
1. Examples and Tutorials: Practical, step-by-step demonstrations are provided in the Examples and the Gallery of POMDPs.jl Problems, illustrating common workflows and use cases.
2. API Reference: Complete technical documentation of all interface functions, types, and methods in POMDPs.jl can be found in the API Documentation.
3. Explanatory Guide: Conceptual explanations of POMDPs.jl, including how to define problems, solve them using provided solvers, and simulate results. See the detailed sections listed below.
- POMDPs.jl
- Installation
- Getting Started
- Concepts and Architecture
- Defining POMDPs and MDPs
- Spaces and Distributions
- Solvers
- Example: Defining an offline solver
- Example: Defining an online solver
- Defining a Belief Updater
- Simulation Standard
- Running Simulations
- Interacting with Policies
- Examples
- Defining a POMDP
- Using Different Solvers
- Simulations Examples
- GridWorld MDP Tutorial
- Dependencies
- Problem Overview
- Defining the Grid World MDP Type
- Grid World State Space
- Grid World Action Space
- Grid World Transition Function
- Grid World Reward Function
- Grid World Remaining Functions
- Solving the Grid World MDP (Value Iteration)
- Solving the Grid World MDP (MCTS)
- Visualizing the Value Iteration Policy
- Seeing a Policy In Action
- Gallery of POMDPs.jl Problems
- POMDPTools: the standard library for POMDPs.jl
- Implemented Distributions
- Model Tools
- Visualization
- Implemented Belief Updaters
- Implemented Policies
- Policy Evaluation
- Implemented Simulators
- Which Simulator Should I Use?
- I want to run fast rollout simulations and get the discounted reward.
- I want to evaluate performance with many parallel Monte Carlo simulations.
- I want to closely examine the histories of states, actions, etc. produced by simulations.
- I want to step through each individual step of a simulation.
- I want to visualize a simulation.
- I want to interact with an MDP or POMDP environment from the policy's perspective.
- Stepping through
- Rollouts
- History Recorder
- sim()
- Histories
- Parallel
- Display
- CommonRLInterface Integration
- Testing
- Frequently Asked Questions (FAQ)
- What is the difference between transition, gen, and @gen?
- How do I save my policies?
- Why is my solver producing a suboptimal policy?
- What if I don't use the rng argument?
- Why are all the solvers in separate modules?
- How can I implement terminal actions?
- Why are there two versions of reward?
- How do I implement reward(m, s, a) if the reward depends on the next state?
- Why do I need to put type assertions pomdp::POMDP into the function signature?
- API Documentation