CommonRLInterface Integration

The POMDPModelTools provides two-way integration with the CommonRLInterface.jl package. Using the convert function, one can convert an MDP or POMDP object to a CommonRLInterface environment, or vice-versa.

For example,

using POMDPs
using POMDPModelTools
using POMDPModels
using CommonRLInterface

env = convert(AbstractEnv, BabyPOMDP())

r = act!(env, true)
observe(env)

converts a Crying Baby POMDP to an RL environment and acts in and observes the environment. This environment (or any other CommonRLInterface environment), can be converted to an MDP or POMDP:

using BasicPOMCP

m = convert(POMDP, env)
planner = solve(POMCPSolver(), m)
a = action(planner, initialstate(m))

You can also use the constructors listed below to manually convert between the interfaces.

Environment Wrapper Types

Since the standard reinforcement learning environment interface offers less information about the internal workings of the environment than the POMDPs.jl interface, MDPs and POMDPs created from these environments will have limited functionality. There are two types of (PO)MDP types that can wrap an environment:

Generative model wrappers

If the state and setstate! CommonRLInterface functions are provided, then the environment can be wrapped in a RLEnvMDP or RLEnvPOMDP and the POMDPs.jl generative model interface will be available.

Opaque wrappers

If the state and setstate! are not provided, then the resulting POMDP or MDP can only be simulated. This case is represented using the OpaqueRLEnvPOMDP and OpaqueRLEnvMDP wrappers. From the POMDPs.jl perspective, the state of the opaque (PO)MDP is just an integer wrapped in an OpaqueRLEnvState. This keeps track of the "age" of the environment so that POMDPs.jl actions that attempt to interact with the environment at a different age are invalid.

Constructors

Creating RL environments from MDPs and POMDPs

POMDPModelTools.MDPCommonRLEnv — Type

MDPCommonRLEnv(m, [s])
MDPCommonRLEnv{RLO}(m, [s])

Create a CommonRLInterface environment from MDP m; optionally specify the state 's'.

The RLO parameter can be used to specify a type to convert the observation to. By default, this is AbstractArray. Use Any to disable conversion.

source

POMDPModelTools.POMDPCommonRLEnv — Type

POMDPCommonRLEnv(m, [s], [o])
POMDPCommonRLEnv{RLO}(m, [s], [o])

Create a CommonRLInterface environment from POMDP m; optionally specify the state 's' and observation 'o'.

The RLO parameter can be used to specify a type to convert the observation to. By default, this is AbstractArray. Use Any to disable conversion.

source

Creating MDPs and POMDPs from RL environments

POMDPModelTools.RLEnvMDP — Type

RLEnvMDP(env; discount=1.0)

Create an MDP by wrapping a CommonRLInterface.AbstractEnv. state and setstate! from CommonRLInterface must be provided, and the POMDPs generative model functionality will be provided.

source

POMDPModelTools.RLEnvPOMDP — Type

RLEnvPOMDP(env; discount=1.0)

Create an POMDP by wrapping a CommonRLInterface.AbstractEnv. state and setstate! from CommonRLInterface must be provided, and the POMDPs generative model functionality will be provided.

source

POMDPModelTools.OpaqueRLEnvMDP — Type

OpaqueRLEnvMDP(env; discount=1.0)

Wrap a CommonRLInterface.AbstractEnv in an MDP object. The state will be an OpaqueRLEnvState and only simulation will be supported.

source

POMDPModelTools.OpaqueRLEnvPOMDP — Type

OpaqueRLEnvPOMDP(env; discount=1.0)

Wrap a CommonRLInterface.AbstractEnv in an POMDP object. The state will be an OpaqueRLEnvState and only simulation will be supported.

source