CRAAM
2.0.0
Robust and Approximate Markov Decision Processes
ModelSimulator Class Reference

A simulator that behaves like the provided MDP.
#include <Simulation.hpp>
Public Types

typedef long State
    Type of states.

typedef long Action
    Type of actions.
Public Member Functions

ModelSimulator (const shared_ptr< const MDP > &mdp, const Transition &initial, random_device::result_type seed=random_device{}())
    Build a model simulator and share an MDP.

ModelSimulator (const shared_ptr< MDP > &mdp, const Transition &initial, random_device::result_type seed=random_device{}())
    Build a model simulator and share an MDP.

State init_state ()
    Returns a sample from the initial states.

pair< double, State > transition (State state, Action action)
    Returns a sample of the reward and a decision state following a state.

bool end_condition (State s) const
    Checks whether the decision state is terminal.

size_t action_count (State state) const
    State-dependent action count.

Action action (State, long index) const
    Returns the action with the given index.
Protected Attributes

default_random_engine gen
    Random number engine.

shared_ptr< const MDP > mdp
    MDP used for the simulation.

Transition initial
    Initial distribution.
Detailed Description

A simulator that behaves like the provided MDP.

The state with index MDP.size() is considered to be the terminal state. If the probabilities of all transitions from a state-action pair sum to less than 1, the remainder is assumed to be the probability of transitioning to the terminal state. More generally, any state with an index greater than or equal to the number of states is considered terminal.
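For illustration, a minimal sketch of this convention follows. The add_transition model-building helper and the namespace names are assumptions about the surrounding library, not something documented on this page:

#include <Simulation.hpp>
#include <memory>

using namespace craam;        // MDP, Transition (assumed namespace)
using namespace craam::msen;  // ModelSimulator (assumed namespace)

int main() {
    auto mdp = std::make_shared<MDP>();
    // From state 0 under action 0: move to state 1 with probability 0.6
    // and reward 1.0. The missing 0.4 of probability mass implicitly
    // leads to the terminal state with index mdp->size().
    add_transition(*mdp, 0, 0, 1, 0.6, 1.0);   // assumed helper
    // From state 1 under action 0: return to state 0 with certainty.
    add_transition(*mdp, 1, 0, 0, 1.0, 0.0);

    // Initial distribution: start in state 0 with probability 1.
    Transition initial({0}, {1.0}, {0.0});

    ModelSimulator sim(mdp, initial);
    return 0;
}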
ModelSimulator (const shared_ptr< const MDP > &mdp, const Transition &initial, random_device::result_type seed=random_device{}())  [inline]

Build a model simulator and share an MDP.

The initial transition is copied into the simulator, while the MDP itself is not copied; the simulator only shares ownership of it through the pointer.
ModelSimulator (const shared_ptr< MDP > &mdp, const Transition &initial, random_device::result_type seed=random_device{}())  [inline]

Build a model simulator and share an MDP.

The initial transition is copied into the simulator, while the MDP itself is not copied; the simulator only shares ownership of it through the pointer.
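A short usage sketch of the two overloads; the explicit seed value is only there to illustrate reproducibility, and mdp and initial are as in the earlier fragment:

std::shared_ptr<MDP> mdp = std::make_shared<MDP>();
Transition initial({0}, {1.0}, {0.0});

// Non-const overload: the simulator shares ownership of the MDP, so later
// changes made through mdp are visible to the simulator.
ModelSimulator sim_default(mdp, initial);       // seeded from random_device

// Const overload with a fixed seed, so repeated runs draw the same samples.
std::shared_ptr<const MDP> cmdp = mdp;
ModelSimulator sim_seeded(cmdp, initial, 42u);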
bool end_condition (State s) const  [inline]

Checks whether the decision state is terminal.

A state is assumed to be terminal when its index is greater than or equal to the number of states in the MDP (see the class description above).
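Concretely, continuing the earlier sketch:

ModelSimulator sim(mdp, initial);
bool regular  = sim.end_condition(0);            // false: a modeled state
bool terminal = sim.end_condition(mdp->size());  // true: index equals the number of states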
pair< double, State > transition (State state, Action action)

Returns a sample of the reward and a decision state following a state.

If the transition probabilities do not sum to 1, the remainder is treated as the probability of transitioning to a terminal state (one with an index that is too large; see ModelSimulator::end_condition).

Parameters
    state    Current state
    action   Current action
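Put together, one simulated episode might look like the following sketch; the uniform-random action choice stands in for a real policy:

#include <random>
#include <utility>

double run_episode(ModelSimulator& sim, std::default_random_engine& rng) {
    double total_reward = 0.0;
    ModelSimulator::State state = sim.init_state();
    while (!sim.end_condition(state)) {
        // Pick uniformly among the actions available in this state.
        std::uniform_int_distribution<long> pick(0, long(sim.action_count(state)) - 1);
        ModelSimulator::Action act = sim.action(state, pick(rng));
        // Sample a reward and a successor state; the successor may be terminal.
        std::pair<double, ModelSimulator::State> rs = sim.transition(state, act);
        total_reward += rs.first;
        state = rs.second;
    }
    return total_reward;
}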