CRAAM
2.0.0
Robust and Approximate Markov Decision Processes
A namespace for handling sampling and simulation. More...
Classes

class DeterministicPolicy
    A deterministic policy that chooses actions according to the provided action index.
class ModelSimulator
    A simulator that behaves as the provided MDP.
class RandomizedPolicy
    A randomized policy that chooses actions according to the provided vector of probabilities.
class RandomPolicy
    A random policy with state-dependent discrete action sets.
class Sample
    Represents a single transition \[ (s, a, s', r, w) \] between two states after taking an action.
class SampleDiscretizerSD
    Turns arbitrary samples into discrete ones (with consecutive numbers assigned to states), assuming that actions are state-dependent.
class SampleDiscretizerSI
    Turns arbitrary samples into discrete ones, assuming that actions are state-independent.
class SampledMDP
    Constructs an MDP from integer samples.
class Samples
    General representation of samples: \[ \Sigma = (s_i, a_i, s_i', r_i, w_i)_{i=0}^{m-1} \]
Typedefs

using DiscreteSamples = Samples<long, long>
    Samples in which the states and actions are identified by integers.
using DiscreteSample = Sample<long, long>
    Integral expectation sample.
using ModelRandomPolicy = RandomPolicy<ModelSimulator>
    Uniformly random policy to be used with the model simulator.
using ModelRandomizedPolicy = RandomizedPolicy<ModelSimulator>
    Randomized policy to be used with the MDP model simulator.
using ModelDeterministicPolicy = DeterministicPolicy<ModelSimulator>
    Deterministic policy to be used with the MDP model simulator.
Functions

template<class Sim, class... U>
Samples<typename Sim::State, typename Sim::Action> make_samples(U&&... u)
    A helper function that constructs a samples object based on the simulator provided to it.

template<class Sim, class SampleType = Samples<typename Sim::State, typename Sim::Action>>
void simulate(Sim& sim, SampleType& samples, const function<typename Sim::Action(typename Sim::State&)>& policy, long horizon, long runs, long tran_limit = -1, prec_t prob_term = 0.0, random_device::result_type seed = random_device{}())
    Runs the simulator and adds the generated samples to the provided samples object.

template<class Sim, class SampleType = Samples<typename Sim::State, typename Sim::Action>>
SampleType simulate(Sim& sim, const function<typename Sim::Action(typename Sim::State&)>& policy, long horizon, long runs, long tran_limit = -1, prec_t prob_term = 0.0, random_device::result_type seed = random_device{}())
    Runs the simulator and generates samples.

template<class Sim>
pair<vector<typename Sim::State>, numvec> simulate_return(Sim& sim, prec_t discount, const function<typename Sim::Action(typename Sim::State &)>& policy, long horizon, long runs, prec_t prob_term = 0.0, random_device::result_type seed = random_device{}())
    Runs the simulator and computes the returns from the simulation.
Detailed Description

A namespace for handling sampling and simulation.
using craam::msen::DiscreteSamples = Samples<long, long>

Samples in which the states and actions are identified by integers.
using craam::msen::ModelRandomizedPolicy = RandomizedPolicy<ModelSimulator>

Randomized policy to be used with the MDP model simulator.

To obtain a deterministic simulation outcome, one must set the seed of both simulate and ModelSimulator.
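The need for two seeds can be illustrated with a self-contained toy (the class and method names below are illustrative, not CRAAM API): the simulator and the randomized policy each own a random number generator, so a reproducible trajectory requires fixing both seeds.

```cpp
#include <random>
#include <vector>

// Toy simulator with its own generator (analogous to ModelSimulator's seed).
struct ToySimulator {
    std::mt19937 gen;
    explicit ToySimulator(unsigned seed) : gen(seed) {}
    // Transition to a random successor state in {0,...,9}, shifted by the action.
    long transition(long /*state*/, long action) {
        return (std::uniform_int_distribution<long>(0, 9)(gen) + action) % 10;
    }
};

// Toy randomized policy with a separate generator (analogous to the seed
// passed to simulate).
struct ToyRandomizedPolicy {
    std::mt19937 gen;
    explicit ToyRandomizedPolicy(unsigned seed) : gen(seed) {}
    long operator()(long /*state*/) {
        return std::uniform_int_distribution<long>(0, 1)(gen);
    }
};

// Runs a short trajectory and records the visited states.
std::vector<long> run(unsigned sim_seed, unsigned pol_seed) {
    ToySimulator sim(sim_seed);
    ToyRandomizedPolicy pol(pol_seed);
    std::vector<long> states;
    long s = 0;
    for (int t = 0; t < 5; ++t) {
        s = sim.transition(s, pol(s));
        states.push_back(s);
    }
    return states;
}
```

Fixing both seeds reproduces the trajectory exactly; changing either one generally changes it.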
void craam::msen::simulate(Sim& sim,
                           SampleType& samples,
                           const function<typename Sim::Action(typename Sim::State&)>& policy,
                           long horizon,
                           long runs,
                           long tran_limit = -1,
                           prec_t prob_term = 0.0,
                           random_device::result_type seed = random_device{}())
Runs the simulator and generates samples.
This method assumes that the simulator can start simulation in any state. There may be an internal state, however, which is independent of the transitions; for example this may be the internal state of the random number generator.
States and actions are passed by value everywhere (moved when appropriate) and therefore it is important that they are lightweight objects.
A simulator should have the following methods:
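A minimal sketch of such a simulator is given below. The method names init_state, transition, and end_condition are assumptions inferred from the signatures documented on this page, not taken verbatim from the library; a real simulator would follow whatever interface the library's main description specifies.

```cpp
#include <random>
#include <utility>

// Hypothetical minimal simulator over two states (method names are
// assumptions, not the library's documented interface).
class TwoStateSim {
public:
    using State = long;
    using Action = long;

    explicit TwoStateSim(unsigned seed = 0) : gen(seed) {}

    // Returns the initial state of a new run.
    State init_state() { return 0; }

    // Takes an action in a state; returns the reward and the next state.
    std::pair<double, State> transition(State s, Action a) {
        // Action 1 flips the state with probability 0.5; reward 1.0 in state 1.
        std::bernoulli_distribution flip(0.5);
        State next = (a == 1 && flip(gen)) ? 1 - s : s;
        return {next == 1 ? 1.0 : 0.0, next};
    }

    // Signals whether the run should terminate in this state.
    bool end_condition(State) const { return false; }

private:
    std::mt19937 gen;
};
```

Note that states and actions are lightweight value types (here plain long), in line with the pass-by-value requirement above.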
Template Parameters
    Sim         Simulator class used in the simulation. See the main description for the methods that the simulator must provide.
    SampleType  Class used to hold the samples.

Parameters
    sim        Simulator that holds the properties needed by the simulation
    samples    Object to which the results of the simulation are added
    policy     Policy function
    horizon    Number of steps
    prob_term  The probability of termination in each step
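The control flow that the parameters above describe can be sketched in a self-contained toy (this is an illustrative loop with hypothetical toy dynamics, not the library's implementation): the simulator is run for the given number of runs, each run lasts at most horizon steps, sampling stops once tran_limit total transitions have been collected, and each step terminates the run early with probability prob_term.

```cpp
#include <functional>
#include <random>
#include <vector>

// Toy sample record mirroring (s, a, s', r).
struct ToySample { long from, action, to; double reward; };

// Illustrative simulation loop over toy 5-state deterministic dynamics.
std::vector<ToySample> simulate_sketch(
        const std::function<long(long)>& policy,   // maps state -> action
        long horizon, long runs,
        long tran_limit = -1, double prob_term = 0.0,
        unsigned seed = 0) {
    std::mt19937 gen(seed);
    std::bernoulli_distribution terminate(prob_term);
    std::vector<ToySample> samples;
    long transitions = 0;
    for (long run = 0; run < runs; ++run) {
        long state = 0;                            // toy: every run starts in state 0
        for (long step = 0; step < horizon; ++step) {
            // Stop sampling entirely once the transition budget is exhausted.
            if (tran_limit >= 0 && transitions >= tran_limit) return samples;
            long action = policy(state);
            long next = (state + action) % 5;      // toy deterministic dynamics
            samples.push_back({state, action, next, 1.0});
            ++transitions;
            state = next;
            // End this run early with probability prob_term.
            if (prob_term > 0.0 && terminate(gen)) break;
        }
    }
    return samples;
}
```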
SampleType craam::msen::simulate(Sim& sim,
                                 const function<typename Sim::Action(typename Sim::State&)>& policy,
                                 long horizon,
                                 long runs,
                                 long tran_limit = -1,
                                 prec_t prob_term = 0.0,
                                 random_device::result_type seed = random_device{}())
Runs the simulator and generates samples.
See the overloaded version of the method for more details. This variant constructs and returns the samples object.
pair<vector<typename Sim::State>, numvec> craam::msen::simulate_return(Sim& sim,
                                 prec_t discount,
                                 const function<typename Sim::Action(typename Sim::State&)>& policy,
                                 long horizon,
                                 long runs,
                                 prec_t prob_term = 0.0,
                                 random_device::result_type seed = random_device{}())
Runs the simulator and computes the returns from the simulation.
This method assumes that the simulator can start simulation in any state. There may be an internal state, however, which is independent of the transitions; for example this may be the internal state of the random number generator.
States and actions are passed by value everywhere (moved when appropriate) and therefore it is important that they are lightweight objects.
Template Parameters
    Sim  Simulator class used in the simulation. See the main description for the methods that the simulator must provide.

Parameters
    sim        Simulator that holds the properties needed by the simulation
    discount   Discount to use in the computation
    policy     Policy function
    horizon    Number of steps
    prob_term  The probability of termination in each step