CRAAM
2.0.0
Robust and Approximate Markov Decision Processes
Represents a single transition between two states after taking an action:
\[ (s, a, s', r, w) \]
where \( s, s' \) are MDP states, \( a \) is an action, \( r \) is the reward, and \( w \) is the sample weight.
#include <Samples.hpp>
Public Member Functions

Sample (State state_from, Action action, State state_to, prec_t reward, prec_t weight, long step, long run)
State state_from () const
    Original state.
Action action () const
    Action taken.
State state_to () const
    Destination state.
prec_t reward () const
    Reward associated with the sample.
prec_t weight () const
    Sample weight.
long step () const
    Number of the step within one execution of the simulation.
long run () const
    Index of the simulation run the sample belongs to.
Protected Attributes

State _state_from
    Original state.
Action _action
    Action taken.
State _state_to
    Destination state.
prec_t _reward
    Reward associated with the sample.
prec_t _weight
    Sample weight.
long _step
    Number of the step within one execution of the simulation.
long _run
    Index of the simulation run the sample belongs to.
Represents a single transition between two states after taking an action:
\[ (s, a, s', r, w) \]
where:

State: MDP state \( s, s' \)
Action: MDP action \( a \)

In addition, the sample also records the step and the run numbers. These are used, for example, to compute the return from samples.