Commit 46982ec

finish updating isd->b0, link to CommonRLInterface
1 parent 7d7f92f commit 46982ec

1 file changed (+2 -2 lines changed)

docs/src/simulation.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # Simulation Standard

-Important note: In most cases, **users need not implement their own simulators**. Several simulators that are compatible with the standard in this document are implemented in the [POMDPSimulators package](https://github.com/JuliaPOMDP/POMDPSimulators.jl) and allow [interaction from a variety of perspectives](https://juliapomdp.github.io/POMDPSimulators.jl/latest/which/). Moreover [RLInterface.jl](https://github.com/JuliaPOMDP/RLInterface.jl) provides an OpenAI Gym style environment interface to interact with environments that is more flexible in some cases.
+Important note: In most cases, **users need not implement their own simulators**. Several simulators that are compatible with the standard in this document are implemented in the [POMDPSimulators package](https://github.com/JuliaPOMDP/POMDPSimulators.jl) and allow [interaction from a variety of perspectives](https://juliapomdp.github.io/POMDPSimulators.jl/latest/which/). Moreover [CommonRLInterface.jl](https://github.com/JuliaReinforcementLearning/CommonRLInterface.jl) provides an OpenAI Gym style environment interface to interact with environments that is more flexible in some cases.

 In order to maintain consistency across the POMDPs.jl ecosystem, this page defines a standard for how simulations should be conducted. All simulators should be consistent with this page, and, if solvers are attempting to find an optimal POMDP policy, they should optimize the expected value of `r_total` below. In particular, this page should be consulted when questions about how less-obvious concepts like terminal states are handled.

@@ -13,7 +13,7 @@ In general, POMDP simulations take up to 5 inputs (see also the [`simulate`](@re
 - `pomdp::POMDP`: pomdp model object (see [POMDPs and MDPs](@ref))
 - `policy::Policy`: policy (see [Solvers and Policies](@ref))
 - `up::Updater`: belief updater (see [Beliefs and Updaters](@ref))
-- `b0`: initial belief (this may be )
+- `b0`: initial belief (this may be updater-specific, such as an observation if the updater just returns the previous observation)
 - `s`: initial state

 The last three of these inputs are optional. If they are not explicitly provided, they should be inferred using the following POMDPs.jl functions:
