CommonRLInterface Integration
POMDPModelTools provides two-way integration with the CommonRLInterface.jl package. Using the convert function, one can convert an MDP or POMDP object to a CommonRLInterface environment, or vice versa.
For example,
using POMDPs
using POMDPModelTools
using POMDPModels
using CommonRLInterface
env = convert(AbstractEnv, BabyPOMDP())
r = act!(env, true)
observe(env)
converts a Crying Baby POMDP to an RL environment, then acts in and observes that environment. This environment (or any other CommonRLInterface environment) can be converted to an MDP or POMDP:
using BasicPOMCP
m = convert(POMDP, env)
planner = solve(POMCPSolver(), m)
a = action(planner, initialstate(m))
You can also use the constructors listed below to manually convert between the interfaces.
Environment Wrapper Types
Since the standard reinforcement learning environment interface offers less information about the internal workings of the environment than the POMDPs.jl interface, MDPs and POMDPs created from these environments will have limited functionality. Two kinds of (PO)MDP types can wrap an environment:
Generative model wrappers
If the state and setstate! CommonRLInterface functions are provided, then the environment can be wrapped in an RLEnvMDP or RLEnvPOMDP, and the POMDPs.jl generative model interface will be available.
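As a minimal sketch of the generative case, the block below defines a toy environment and wraps it. CounterEnv, its counting dynamics, and the constant reward are hypothetical names invented for illustration; the sketch also assumes that @provide is the CommonRLInterface mechanism for declaring the optional state and setstate! functions and that the wrapper forwards @gen calls to them.

```julia
using POMDPs
using POMDPModelTools
using Random
import CommonRLInterface
const RL = CommonRLInterface

# Hypothetical toy environment (not part of any package): an integer counter.
mutable struct CounterEnv <: RL.AbstractEnv
    s::Int
end

# Required CommonRLInterface functions.
RL.reset!(env::CounterEnv) = (env.s = 0; nothing)
RL.actions(env::CounterEnv) = [-1, 1]
RL.observe(env::CounterEnv) = [env.s]
RL.terminated(env::CounterEnv) = env.s >= 5
function RL.act!(env::CounterEnv, a)
    env.s += a
    return -1.0  # constant step cost, chosen only for illustration
end

# Optional functions; providing both makes the generative wrapper applicable.
RL.@provide RL.state(env::CounterEnv) = env.s
RL.@provide RL.setstate!(env::CounterEnv, s) = (env.s = s)

m = POMDPModelTools.RLEnvMDP(CounterEnv(0); discount=0.95)

# Sample a transition from an arbitrary state (2) with action +1,
# without first stepping the environment there from a reset.
sp, r = @gen(:sp, :r)(m, 2, 1, MersenneTwister(1))
```

Because the wrapper can set the environment to any state before acting, planners that need to sample transitions from arbitrary states can use a model built this way.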
Opaque wrappers
If state and setstate! are not provided, then the resulting POMDP or MDP can only be simulated. This case is represented using the OpaqueRLEnvPOMDP and OpaqueRLEnvMDP wrappers. From the POMDPs.jl perspective, the state of the opaque (PO)MDP is just an integer wrapped in an OpaqueRLEnvState. This integer tracks the "age" of the environment, so that POMDPs.jl calls that attempt to interact with the environment using a state of a different age are invalid.
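The age check can be illustrated with a hedged sketch. CoinEnv and its reward rule are hypothetical, and the sketch assumes that reusing a stale state raises an error; the exact error type is not specified here, so the code only records whether a failure occurred.

```julia
using POMDPs
using POMDPModelTools
using Random
import CommonRLInterface
const RL = CommonRLInterface

# Hypothetical toy environment that does NOT provide state or setstate!.
mutable struct CoinEnv <: RL.AbstractEnv
    steps::Int
end

RL.reset!(env::CoinEnv) = (env.steps = 0; nothing)
RL.actions(env::CoinEnv) = [:heads, :tails]
RL.observe(env::CoinEnv) = [env.steps]
RL.terminated(env::CoinEnv) = env.steps >= 3
function RL.act!(env::CoinEnv, a)
    env.steps += 1
    return a == :heads ? 1.0 : 0.0
end

m = POMDPModelTools.OpaqueRLEnvMDP(CoinEnv(0))
rng = MersenneTwister(1)

# Sampling the initial state resets the environment and returns an opaque state.
s = rand(rng, initialstate(m))

# Stepping from the current state advances the environment by one step.
sp, r = @gen(:sp, :r)(m, s, :heads, rng)

# Stepping again from the old state `s` refers to an outdated age.
# Assumption: the wrapper rejects this by throwing.
stale_rejected = try
    @gen(:sp, :r)(m, s, :tails, rng)
    false
catch
    true
end
```

In practice this means an opaque model suits sequential simulation, where each step continues from the state returned by the previous one, but not planners that revisit earlier states.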
Constructors
Creating RL environments from MDPs and POMDPs
POMDPModelTools.MDPCommonRLEnv — Type
MDPCommonRLEnv(m, [s])
MDPCommonRLEnv{RLO}(m, [s])
Create a CommonRLInterface environment from MDP m; optionally specify the state s.
The RLO parameter can be used to specify a type to convert the observation to. By default, this is AbstractArray. Use Any to disable conversion.
POMDPModelTools.POMDPCommonRLEnv — Type
POMDPCommonRLEnv(m, [s], [o])
POMDPCommonRLEnv{RLO}(m, [s], [o])
Create a CommonRLInterface environment from POMDP m; optionally specify the state s and observation o.
The RLO parameter can be used to specify a type to convert the observation to. By default, this is AbstractArray. Use Any to disable conversion.
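The effect of the RLO parameter can be sketched with the Crying Baby problem used earlier, whose native observation is a Bool. This is an illustration under the assumption that the default conversion for a numeric observation produces a one-element array; the variable names are arbitrary.

```julia
using POMDPs
using POMDPModelTools
using POMDPModels
import CommonRLInterface
const RL = CommonRLInterface

# Default: observations are converted to an AbstractArray.
env_arr = POMDPModelTools.POMDPCommonRLEnv(BabyPOMDP())

# RLO = Any: observations are passed through in their native type.
env_raw = POMDPModelTools.POMDPCommonRLEnv{Any}(BabyPOMDP())

# Take one step (feed the baby) in each environment, then observe.
RL.act!(env_arr, true)
RL.act!(env_raw, true)

o_arr = RL.observe(env_arr)  # array-valued observation
o_raw = RL.observe(env_raw)  # native observation type of BabyPOMDP
```

Array observations are convenient for learning code that feeds observations to function approximators, while Any keeps the problem's own observation type for code that inspects it directly.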
Creating MDPs and POMDPs from RL environments
POMDPModelTools.RLEnvMDP — Type
RLEnvMDP(env; discount=1.0)
Create an MDP by wrapping a CommonRLInterface.AbstractEnv. The environment must provide state and setstate! from CommonRLInterface; in return, the POMDPs generative model functionality is available.
POMDPModelTools.RLEnvPOMDP — Type
RLEnvPOMDP(env; discount=1.0)
Create a POMDP by wrapping a CommonRLInterface.AbstractEnv. The environment must provide state and setstate! from CommonRLInterface; in return, the POMDPs generative model functionality is available.
POMDPModelTools.OpaqueRLEnvMDP — Type
OpaqueRLEnvMDP(env; discount=1.0)
Wrap a CommonRLInterface.AbstractEnv in an MDP object. The state will be an OpaqueRLEnvState and only simulation will be supported.
POMDPModelTools.OpaqueRLEnvPOMDP — Type
OpaqueRLEnvPOMDP(env; discount=1.0)
Wrap a CommonRLInterface.AbstractEnv in a POMDP object. The state will be an OpaqueRLEnvState and only simulation will be supported.