Represents sparse transition probabilities and rewards from a single state.

#include <Transition.hpp>

Public Member Functions
	Transition (const indvec &indices, const numvec &probabilities, const numvec &rewards)
	Creates a single transition from raw data.

	Transition (const indvec &indices, const numvec &probabilities)
	Creates a single transition from raw data with uniformly zero rewards.

	Transition (const numvec &probabilities)
	Creates a single transition from raw data with uniformly zero rewards, where destination states are indexed automatically starting with 0.

void	add_sample (long stateid, prec_t probability, prec_t reward)
	Adds a single transition probability to the existing probabilities.

prec_t	sum_probabilities () const

void	normalize ()
	Normalizes the transition probabilities to sum to 1.

bool	is_normalized () const

prec_t	value (numvec const &valuefunction, prec_t discount, numvec probabilities) const
	Computes value for the transition and a value function.

prec_t	value (numvec const &valuefunction, prec_t discount=1.0) const
	Computes value for the transition and a value function.

prec_t	mean_reward (const numvec &probabilities) const
	Computes the mean reward of this transition under custom transition probabilities.

prec_t	mean_reward () const
	Computes the mean reward of this transition.

size_t	size () const
	Returns the number of target states with non-zero transition probabilities.

bool	empty () const
	Checks if the transition is empty.

long	max_index () const
	Returns the maximal index involved in the transition.

void	probabilities_addto (prec_t scale, numvec &transition) const
	Scales transition probabilities according to the provided parameter and adds them to the provided vector.

void	probabilities_addto (prec_t scale, Transition &transition) const
	Scales transition probabilities and rewards according to the provided parameter and adds them to the provided transition.

numvec	probabilities_vector (size_t size) const
	Constructs and returns a dense vector of probabilities, which includes 0 transition probabilities.

numvec	rewards_vector (size_t size) const
	Constructs and returns a dense vector of rewards, which includes zero entries for states with 0 transition probability.

const indvec &	get_indices () const
	Indices with positive probabilities.

long	get_index (long k)
	Index of the k-th state with non-zero probability.

const numvec &	get_probabilities () const
	Returns list of positive probabilities for indexes returned by get_indices.

const numvec &	get_rewards () const
	Rewards for indices with positive probabilities returned by get_indices.

void	set_reward (long sampleid, prec_t reward)
	Sets the reward for a transition to a particular state.

prec_t	get_reward (long sampleid) const
	Gets the reward for a transition to a particular state.

string	to_json (long outcomeid=-1) const
	Returns a JSON representation of transition probabilities.

Protected Attributes
indvec	indices
	List of state indices.

numvec	probabilities
	List of transition probabilities to the states in indices.

numvec	rewards
	List of rewards associated with transitions.

Detailed Description

Represents sparse transition probabilities and rewards from a single state.

The class can be also used to represent a generic sparse distribution.

The destination indices are kept sorted in increasing order as they are added. This makes it simpler to aggregate multiple transition probabilities and should also make value iteration more cache friendly. However, transitions should be added in order of increasing state ID; adding them out of order can degrade performance significantly.

Constructor & Destructor Documentation

◆ Transition() [1/3]

craam::Transition::Transition	(	const indvec &	indices,
		const numvec &	probabilities,
		const numvec &	rewards
	)

inline

Creates a single transition from raw data.

Because the transition indices are stored sorted in increasing order, this constructor must sort the indices and aggregate duplicates.

Parameters

indices	The indexes of states to transition to
probabilities	The probabilities of transitions
rewards	The associated rewards with each transition
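The sort-and-aggregate step described above can be sketched as follows. This is a minimal standalone illustration, not the library's implementation: the `SparseTriple` struct and the `sort_and_aggregate` function are hypothetical names, the type aliases are assumed to be `double` and `std::vector` based, and duplicates are assumed to be merged with the same probability-weighted rule that add_sample documents.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Assumed aliases; the real definitions live in the library headers.
using prec_t = double;
using indvec = std::vector<long>;
using numvec = std::vector<prec_t>;

// Hypothetical stand-in for the three parallel lists held by a transition.
struct SparseTriple {
    indvec indices;
    numvec probabilities;
    numvec rewards;
};

// Sorts raw (index, probability, reward) entries by index and merges
// entries that share an index (assumption: probability-weighted rewards).
SparseTriple sort_and_aggregate(const indvec& indices,
                                const numvec& probabilities,
                                const numvec& rewards) {
    assert(indices.size() == probabilities.size());
    assert(indices.size() == rewards.size());

    // order[k] is the input position of the k-th smallest index.
    std::vector<std::size_t> order(indices.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) {
                         return indices[a] < indices[b];
                     });

    SparseTriple out;
    for (std::size_t k : order) {
        if (!out.indices.empty() && out.indices.back() == indices[k]) {
            // Duplicate index: add probabilities, average rewards by weight.
            prec_t& p = out.probabilities.back();
            prec_t& r = out.rewards.back();
            const prec_t p_new = p + probabilities[k];
            if (p_new != 0.0) {
                r = (p * r + probabilities[k] * rewards[k]) / p_new;
            }
            p = p_new;
        } else {
            out.indices.push_back(indices[k]);
            out.probabilities.push_back(probabilities[k]);
            out.rewards.push_back(rewards[k]);
        }
    }
    return out;
}
```

Sorting a permutation rather than the three vectors directly keeps each probability and reward attached to its index.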

◆ Transition() [2/3]

craam::Transition::Transition	(	const indvec &	indices,
		const numvec &	probabilities
	)

inline

Creates a single transition from raw data with uniformly zero rewards.

Because the transition indices are stored sorted in increasing order, this constructor must sort the indices and aggregate duplicates.

Parameters

indices	The indexes of states to transition to
probabilities	The probabilities of transitions

◆ Transition() [3/3]

craam::Transition::Transition ( const numvec & probabilities )

inline

Creates a single transition from raw data with uniformly zero rewards, where destination states are indexed automatically starting with 0.

Parameters

probabilities The probabilities of transitions; indexes are implicit.

Member Function Documentation

◆ add_sample()

void craam::Transition::add_sample	(	long	stateid,
		prec_t	probability,
		prec_t	reward
	)

inline

Adds a single transition probability to the existing probabilities.

If the transition to a state does not exist, then it is simply added to the list. If the transition to the desired state already exists, then the transition probability is added and the reward is updated as a weighted combination. Let \( p(s) \) and \( r(s) \) be the current transition probability and reward respectively. The updated transition probability and reward are:

Probability:
\[ p'(s) = p(s) + p \]
Reward:
\[ r'(s) = \frac{p(s) \, r(s) + p \, r}{p'(s)} \]
Here, \( p \) is the argument probability and \( r \) is the argument reward.

When the function is called multiple times with \( p_1 \ldots p_n \) and \( r_1 \ldots r_n \) for a single \( s \), and no transition to \( s \) existed beforehand, the result is:

Probability:
\[ p'(s) = \sum_{i=1}^{n} p_i \]
Reward:
\[ r'(s) = \frac{ \sum_{i=1}^{n} p_i \, r_i}{p'(s)} \]

Transition probabilities are not checked to sum to one.

Parameters

stateid	ID of the target state
probability	Probability of transitioning to this state
reward	The reward associated with the transition
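The update rule above can be illustrated with a small standalone sketch. `SparseTransitionSketch` is a hypothetical struct, not the library class; it assumes the sorted parallel-vector layout described in the class overview and uses a binary search to find the insertion point.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using prec_t = double;  // assumed alias

// Hypothetical miniature of a sparse transition: three parallel vectors,
// with indices kept sorted in increasing order.
struct SparseTransitionSketch {
    std::vector<long> indices;
    std::vector<prec_t> probabilities;
    std::vector<prec_t> rewards;

    void add_sample(long stateid, prec_t probability, prec_t reward) {
        assert(stateid >= 0);
        // First position whose index is not less than stateid.
        auto it = std::lower_bound(indices.begin(), indices.end(), stateid);
        const auto pos = it - indices.begin();
        auto pit = probabilities.begin() + pos;
        auto rit = rewards.begin() + pos;

        if (it != indices.end() && *it == stateid) {
            // Existing target state: p' = p + p_arg, r' = (p r + p_arg r_arg) / p'.
            const prec_t p_new = *pit + probability;
            if (p_new != 0.0) {
                *rit = (*pit * *rit + probability * reward) / p_new;
            }
            *pit = p_new;
        } else {
            // New target state: insert at the sorted position.
            indices.insert(it, stateid);
            probabilities.insert(pit, probability);
            rewards.insert(rit, reward);
        }
    }
};
```

Adding samples in increasing order of `stateid` makes every insertion land at the end of the vectors, which is the cheap case; out-of-order samples force elements to be shifted.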

◆ empty()

bool craam::Transition::empty ( ) const

inline

Checks if the transition is empty.

◆ get_indices()

const indvec& craam::Transition::get_indices ( ) const

inline

Indices with positive probabilities.

◆ get_probabilities()

const numvec& craam::Transition::get_probabilities ( ) const

inline

Returns list of positive probabilities for indexes returned by get_indices.

◆ get_rewards()

const numvec& craam::Transition::get_rewards ( ) const

inline

Rewards for indices with positive probabilities returned by get_indices.

◆ is_normalized()

bool craam::Transition::is_normalized ( ) const

inline

Returns: Whether the transition probabilities sum to 1.

◆ max_index()

long craam::Transition::max_index ( ) const

inline

Returns the maximal index involved in the transition.

Returns -1 for an empty transition.

◆ normalize()

void craam::Transition::normalize ( )

inline

Normalizes the transition probabilities to sum to 1.

An exception is thrown if the distribution sums to 0.
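A minimal sketch of this behavior on a plain vector of probabilities is given below. The free functions, the exception type (`std::invalid_argument`), the exact comparison with 0, and the tolerance `kTolerance` are all assumptions made for illustration; the documentation only states that some exception is thrown.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <stdexcept>
#include <vector>

using prec_t = double;               // assumed alias
using numvec = std::vector<prec_t>;  // assumed alias

constexpr prec_t kTolerance = 1e-6;  // hypothetical tolerance

prec_t sum_probabilities(const numvec& p) {
    return std::accumulate(p.begin(), p.end(), prec_t{0});
}

bool is_normalized(const numvec& p) {
    return std::abs(1.0 - sum_probabilities(p)) < kTolerance;
}

// Rescales p in place so that its entries sum to 1.
void normalize(numvec& p) {
    const prec_t total = sum_probabilities(p);
    if (total == 0.0) {
        throw std::invalid_argument("probabilities sum to 0; cannot normalize");
    }
    for (prec_t& x : p) {
        x /= total;
    }
    assert(is_normalized(p));  // postcondition, checked in debug builds only
}
```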

◆ probabilities_addto() [1/2]

void craam::Transition::probabilities_addto	(	prec_t	scale,
		numvec &	transition
	)		const

inline

Scales transition probabilities according to the provided parameter and adds them to the provided vector.

This method ignores rewards.

Parameters

scale	Multiplicative modification of transition probabilities
transition	Transition probabilities being added to. This value is modified within the function.
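As a rough illustration of the scale-and-add operation, the following standalone function works on plain vectors. The free-function form and its extra `indices` and `probabilities` parameters are assumptions for the sketch; in the class these come from the object itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using prec_t = double;  // assumed alias

// Adds scale * probabilities[k] to target[indices[k]] for every sparse entry.
// The dense target vector must be long enough to hold the largest index.
void probabilities_addto(const std::vector<long>& indices,
                         const std::vector<prec_t>& probabilities,
                         prec_t scale,
                         std::vector<prec_t>& target) {
    assert(indices.size() == probabilities.size());
    for (std::size_t k = 0; k < indices.size(); ++k) {
        assert(indices[k] >= 0);
        const auto i = static_cast<std::size_t>(indices[k]);
        assert(i < target.size());
        target[i] += scale * probabilities[k];
    }
}
```

Entries of the target that do not appear in `indices` are left unchanged.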

◆ probabilities_addto() [2/2]

void craam::Transition::probabilities_addto	(	prec_t	scale,
		Transition &	transition
	)		const

inline

Scales transition probabilities and rewards according to the provided parameter and adds them to the provided transition.

Parameters

scale	Multiplicative modification of transition probabilities
transition	Transition probabilities being added to. This value is modified within the function.

◆ probabilities_vector()

numvec craam::Transition::probabilities_vector ( size_t size ) const

inline

Constructs and returns a dense vector of probabilities, which includes 0 transition probabilities.

Parameters

size	Size of the constructed vector
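The sparse-to-dense expansion can be sketched as below. `dense_vector` is a hypothetical helper operating on plain vectors; the same expansion would apply to rewards, with positions that have no sparse entry left at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using prec_t = double;  // assumed alias

// Expands sparse (index, value) pairs into a dense vector of the given size.
// Positions without a sparse entry stay at 0.
std::vector<prec_t> dense_vector(const std::vector<long>& indices,
                                 const std::vector<prec_t>& values,
                                 std::size_t size) {
    assert(indices.size() == values.size());
    std::vector<prec_t> dense(size, 0.0);
    for (std::size_t k = 0; k < indices.size(); ++k) {
        assert(indices[k] >= 0);
        const auto i = static_cast<std::size_t>(indices[k]);
        assert(i < size);  // size must exceed the largest index
        dense[i] = values[k];
    }
    return dense;
}
```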

◆ rewards_vector()

numvec craam::Transition::rewards_vector ( size_t size ) const

inline

Constructs and returns a dense vector of rewards, which includes entries for states with 0 transition probability.

Rewards for indices with zero transition probability are zero.

Parameters

size	Size of the constructed vector

◆ size()

size_t craam::Transition::size ( ) const

inline

Returns the number of target states with non-zero transition probabilities.

◆ to_json()

string craam::Transition::to_json ( long outcomeid = -1 ) const

inline

Returns a json representation of transition probabilities.

Parameters

outcomeid Outcome ID to also include in the output

◆ value() [1/2]

prec_t craam::Transition::value	(	numvec const &	valuefunction,
		prec_t	discount,
		numvec	probabilities
	)		const

inline

Computes value for the transition and a value function.

When there are no target states, the function terminates with an error.

Parameters

valuefunction	Value function, or an arbitrary vector of values
discount	Discount factor
probabilities	Custom probability distribution. It must be of the same length as the number of nonzero transition probabilities. The length is NOT checked in a release build.

◆ value() [2/2]

prec_t craam::Transition::value	(	numvec const &	valuefunction,
		prec_t	discount = 1.0
	)		const

inline

Computes value for the transition and a value function.

When there are no target states, the function terminates with an error.

Parameters

valuefunction	Value function, or an arbitrary vector of values
discount	Discount factor, optional (default value 1)
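The page does not spell out the formula that value() evaluates. The sketch below assumes the usual one-step backup, \( \sum_k p_k \, (r_k + \gamma \, v(s_k)) \), where \( s_k \) are the target states and \( \gamma \) is the discount. The free function `transition_value`, its parameter list, and the exception type are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using prec_t = double;  // assumed alias

// Assumed formula: sum over target states of p_k * (r_k + discount * v[s_k]).
prec_t transition_value(const std::vector<long>& indices,
                        const std::vector<prec_t>& probabilities,
                        const std::vector<prec_t>& rewards,
                        const std::vector<prec_t>& valuefunction,
                        prec_t discount = 1.0) {
    assert(indices.size() == probabilities.size());
    assert(indices.size() == rewards.size());
    if (indices.empty()) {
        throw std::range_error("no target states; cannot compute value");
    }
    prec_t value = 0.0;
    for (std::size_t k = 0; k < indices.size(); ++k) {
        assert(indices[k] >= 0);
        const auto s = static_cast<std::size_t>(indices[k]);
        assert(s < valuefunction.size());
        value += probabilities[k] * (rewards[k] + discount * valuefunction[s]);
    }
    return value;
}
```

Passing a different `probabilities` vector of the same length corresponds to the overload that takes a custom distribution. As the parameter notes say for that overload, a length mismatch is caught only by debug-build assertions.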

The documentation for this class was generated from the following file:

craam/Transition.hpp
