CRAAM
2.0.0
Robust and Approximate Markov Decision Processes
A namespace for handling sampling and simulation. More...
Classes

class DeterministicPolicy
    A deterministic policy that chooses actions according to the provided action index.
class ModelSimulator
    A simulator that behaves as the provided MDP.
class RandomizedPolicy
    A randomized policy that chooses actions according to the provided vector of probabilities.
class RandomPolicy
    A random policy with state-dependent discrete action sets.
class Sample
    Represents a single transition \[ (s, a, s', r, w) \] between two states after taking an action.
class SampleDiscretizerSD
    Turns arbitrary samples into discrete ones (with consecutive numbers assigned to states), assuming that actions are state-dependent.
class SampleDiscretizerSI
    Turns arbitrary samples into discrete ones, assuming that actions are state-independent.
class SampledMDP
    Constructs an MDP from integer samples.
class Samples
    General representation of samples: \[ \Sigma = (s_i, a_i, s_i', r_i, w_i)_{i=0}^{m-1} \]
Typedefs

using DiscreteSamples = Samples<long, long>
    Samples in which the states and actions are identified by integers.
using DiscreteSample = Sample<long, long>
    Integral expectation sample.
using ModelRandomPolicy = RandomPolicy<ModelSimulator>
    Uniformly random policy to be used with the model simulator.
using ModelRandomizedPolicy = RandomizedPolicy<ModelSimulator>
    Randomized policy to be used with the MDP model simulator.
using ModelDeterministicPolicy = DeterministicPolicy<ModelSimulator>
    Deterministic policy to be used with the MDP model simulator.
Functions

template<class Sim, class... U>
Samples<typename Sim::State, typename Sim::Action> make_samples(U&&... u)
    A helper function that constructs a samples object based on the simulator provided to it.

template<class Sim, class SampleType = Samples<typename Sim::State, typename Sim::Action>>
void simulate(Sim& sim, SampleType& samples, const function<typename Sim::Action(typename Sim::State&)>& policy, long horizon, long runs, long tran_limit = -1, prec_t prob_term = 0.0, random_device::result_type seed = random_device{}())
    Runs the simulator and adds the generated samples to the provided samples object.

template<class Sim, class SampleType = Samples<typename Sim::State, typename Sim::Action>>
SampleType simulate(Sim& sim, const function<typename Sim::Action(typename Sim::State&)>& policy, long horizon, long runs, long tran_limit = -1, prec_t prob_term = 0.0, random_device::result_type seed = random_device{}())
    Runs the simulator and generates samples.

template<class Sim>
pair<vector<typename Sim::State>, numvec> simulate_return(Sim& sim, prec_t discount, const function<typename Sim::Action(typename Sim::State &)>& policy, long horizon, long runs, prec_t prob_term = 0.0, random_device::result_type seed = random_device{}())
    Runs the simulator and computes the returns from the simulation.
Detailed Description

A namespace for handling sampling and simulation.
using craam::msen::DiscreteSamples = Samples<long, long>

Samples in which the states and actions are identified by integers.
using craam::msen::ModelRandomizedPolicy = RandomizedPolicy<ModelSimulator>

Randomized policy to be used with the MDP model simulator.

To obtain a deterministic simulation outcome, one must set the seed of both simulate and ModelSimulator.
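The need for two seeds can be illustrated with a self-contained toy (the class and method names below are illustrative, not CRAAM API): the simulator and the randomized policy each own a random number generator, so a reproducible trajectory requires fixing both seeds.

```cpp
#include <random>
#include <vector>

// Toy simulator with its own generator (analogous to ModelSimulator's seed).
struct ToySimulator {
    std::mt19937 gen;
    explicit ToySimulator(unsigned seed) : gen(seed) {}
    // Transition to a random successor state in {0,...,9}, shifted by the action.
    long transition(long /*state*/, long action) {
        return (std::uniform_int_distribution<long>(0, 9)(gen) + action) % 10;
    }
};

// Toy randomized policy with a separate generator (analogous to the seed
// passed to simulate).
struct ToyRandomizedPolicy {
    std::mt19937 gen;
    explicit ToyRandomizedPolicy(unsigned seed) : gen(seed) {}
    long operator()(long /*state*/) {
        return std::uniform_int_distribution<long>(0, 1)(gen);
    }
};

// Runs a short trajectory and records the visited states.
std::vector<long> run(unsigned sim_seed, unsigned pol_seed) {
    ToySimulator sim(sim_seed);
    ToyRandomizedPolicy pol(pol_seed);
    std::vector<long> states;
    long s = 0;
    for (int t = 0; t < 5; ++t) {
        s = sim.transition(s, pol(s));
        states.push_back(s);
    }
    return states;
}
```

Fixing both seeds reproduces the trajectory exactly; changing either one generally changes it.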
void craam::msen::simulate(Sim& sim,
                           SampleType& samples,
                           const function<typename Sim::Action(typename Sim::State&)>& policy,
                           long horizon,
                           long runs,
                           long tran_limit = -1,
                           prec_t prob_term = 0.0,
                           random_device::result_type seed = random_device{}())
Runs the simulator and generates samples.
This method assumes that the simulator can start simulation in any state. There may be an internal state, however, which is independent of the transitions; for example this may be the internal state of the random number generator.
States and actions are passed by value everywhere (moved when appropriate) and therefore it is important that they are lightweight objects.
A simulator should have the following methods:
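A minimal sketch of such a simulator is given below. The method names init_state, transition, and end_condition are assumptions inferred from the signatures documented on this page, not taken verbatim from the library; a real simulator would follow whatever interface the library's main description specifies.

```cpp
#include <random>
#include <utility>

// Hypothetical minimal simulator over two states (method names are
// assumptions, not the library's documented interface).
class TwoStateSim {
public:
    using State = long;
    using Action = long;

    explicit TwoStateSim(unsigned seed = 0) : gen(seed) {}

    // Returns the initial state of a new run.
    State init_state() { return 0; }

    // Takes an action in a state; returns the reward and the next state.
    std::pair<double, State> transition(State s, Action a) {
        // Action 1 flips the state with probability 0.5; reward 1.0 in state 1.
        std::bernoulli_distribution flip(0.5);
        State next = (a == 1 && flip(gen)) ? 1 - s : s;
        return {next == 1 ? 1.0 : 0.0, next};
    }

    // Signals whether the run should terminate in this state.
    bool end_condition(State) const { return false; }

private:
    std::mt19937 gen;
};
```

Note that states and actions are lightweight value types (here plain long), in line with the pass-by-value requirement above.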
Template Parameters
    Sim         Simulator class used in the simulation. See the main description for the methods that the simulator must provide.
    SampleType  Class used to hold the samples.

Parameters
    sim        Simulator that holds the properties needed by the simulation
    samples    Object to which the results of the simulation are added
    policy     Policy function
    horizon    Number of steps
    prob_term  The probability of termination in each step
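The control flow that the parameters above describe can be sketched in a self-contained toy (this is an illustrative loop with hypothetical toy dynamics, not the library's implementation): the simulator is run for the given number of runs, each run lasts at most horizon steps, sampling stops once tran_limit total transitions have been collected, and each step terminates the run early with probability prob_term.

```cpp
#include <functional>
#include <random>
#include <vector>

// Toy sample record mirroring (s, a, s', r).
struct ToySample { long from, action, to; double reward; };

// Illustrative simulation loop over toy 5-state deterministic dynamics.
std::vector<ToySample> simulate_sketch(
        const std::function<long(long)>& policy,   // maps state -> action
        long horizon, long runs,
        long tran_limit = -1, double prob_term = 0.0,
        unsigned seed = 0) {
    std::mt19937 gen(seed);
    std::bernoulli_distribution terminate(prob_term);
    std::vector<ToySample> samples;
    long transitions = 0;
    for (long run = 0; run < runs; ++run) {
        long state = 0;                            // toy: every run starts in state 0
        for (long step = 0; step < horizon; ++step) {
            // Stop sampling entirely once the transition budget is exhausted.
            if (tran_limit >= 0 && transitions >= tran_limit) return samples;
            long action = policy(state);
            long next = (state + action) % 5;      // toy deterministic dynamics
            samples.push_back({state, action, next, 1.0});
            ++transitions;
            state = next;
            // End this run early with probability prob_term.
            if (prob_term > 0.0 && terminate(gen)) break;
        }
    }
    return samples;
}
```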
SampleType craam::msen::simulate(Sim& sim,
                                 const function<typename Sim::Action(typename Sim::State&)>& policy,
                                 long horizon,
                                 long runs,
                                 long tran_limit = -1,
                                 prec_t prob_term = 0.0,
                                 random_device::result_type seed = random_device{}())
Runs the simulator and generates samples.
See the overloaded version of the method for more details. This variant constructs and returns the samples object.
pair<vector<typename Sim::State>, numvec> craam::msen::simulate_return(Sim& sim,
                                 prec_t discount,
                                 const function<typename Sim::Action(typename Sim::State&)>& policy,
                                 long horizon,
                                 long runs,
                                 prec_t prob_term = 0.0,
                                 random_device::result_type seed = random_device{}())
Runs the simulator and computes the returns from the simulation.
This method assumes that the simulator can start simulation in any state. There may be an internal state, however, which is independent of the transitions; for example this may be the internal state of the random number generator.
States and actions are passed by value everywhere (moved when appropriate) and therefore it is important that they are lightweight objects.
Template Parameters
    Sim  Simulator class used in the simulation. See the main description for the methods that the simulator must provide.

Parameters
    sim        Simulator that holds the properties needed by the simulation
    discount   Discount to use in the computation
    policy     Policy function
    horizon    Number of steps
    prob_term  The probability of termination in each step