CRAAM  2.0.0
Robust and Approximate Markov Decision Processes
craam::Transition Class Reference

Represents sparse transition probabilities and rewards from a single state.

#include <Transition.hpp>

Public Member Functions

 Transition (const indvec &indices, const numvec &probabilities, const numvec &rewards)
 Creates a single transition from raw data.
 
 Transition (const indvec &indices, const numvec &probabilities)
 Creates a single transition from raw data with uniformly zero rewards.
 
 Transition (const numvec &probabilities)
 Creates a single transition from raw data with uniformly zero rewards, where destination states are indexed automatically starting with 0.
 
void add_sample (long stateid, prec_t probability, prec_t reward)
 Adds a single transition probability to the existing probabilities.
 
prec_t sum_probabilities () const
 
void normalize ()
 Normalizes the transition probabilities to sum to 1.
 
bool is_normalized () const
 
prec_t value (numvec const &valuefunction, prec_t discount, numvec probabilities) const
 Computes value for the transition and a value function.
 
prec_t value (numvec const &valuefunction, prec_t discount=1.0) const
 Computes value for the transition and a value function.
 
prec_t mean_reward (const numvec &probabilities) const
 Computes the mean reward of this transition with custom transition probabilities.
 
prec_t mean_reward () const
 Computes the mean reward of this transition.
 
size_t size () const
 Returns the number of target states with non-zero transition probabilities.
 
bool empty () const
 Checks if the transition is empty.
 
long max_index () const
 Returns the maximal index involved in the transition.
 
void probabilities_addto (prec_t scale, numvec &transition) const
 Scales transition probabilities according to the provided parameter and adds them to the provided vector.
 
void probabilities_addto (prec_t scale, Transition &transition) const
 Scales transition probabilities and rewards according to the provided parameter and adds them to the provided transition.
 
numvec probabilities_vector (size_t size) const
 Constructs and returns a dense vector of probabilities, including entries with zero transition probability.
 
numvec rewards_vector (size_t size) const
 Constructs and returns a dense vector of rewards, including entries with zero transition probability.
 
const indvec & get_indices () const
 Indices with positive probabilities.
 
long get_index (long k)
 Index of the k-th state with non-zero probability.
 
const numvec & get_probabilities () const
 Returns list of positive probabilities for indexes returned by get_indices.
 
const numvec & get_rewards () const
 Rewards for indices with positive probabilities returned by get_indices.
 
void set_reward (long sampleid, prec_t reward)
 Sets the reward for a transition to a particular state.
 
prec_t get_reward (long sampleid) const
 Gets the reward for a transition to a particular state.
 
string to_json (long outcomeid=-1) const
 Returns a JSON representation of transition probabilities.
 

Protected Attributes

indvec indices
 List of state indices.
 
numvec probabilities
 List of transition probabilities to the states in indices.
 
numvec rewards
 List of rewards associated with transitions.
 

Detailed Description

Represents sparse transition probabilities and rewards from a single state.

The class can also be used to represent a generic sparse distribution.

The destination indexes are sorted increasingly (as added). This makes it simpler to aggregate multiple transition probabilities and should also make value iteration more cache friendly. However, transitions need to be added with increasing IDs to prevent excessive performance degradation.

Constructor & Destructor Documentation

◆ Transition() [1/3]

craam::Transition::Transition ( const indvec & indices,
const numvec & probabilities,
const numvec & rewards 
)
inline

Creates a single transition from raw data.

Because the transition indexes are stored in increasing order, this method must sort the indices and aggregate duplicates.

Parameters
indices          The indexes of states to transition to
probabilities    The probabilities of transitions
rewards          The rewards associated with each transition
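
The sort-and-aggregate step can be illustrated with a small standalone sketch. This is not CRAAM's implementation: `SparseRow` and `build_sorted` are hypothetical names, `double` and `long` stand in for `prec_t` and the index type, and it is an assumption here that duplicates are merged the same way add_sample describes (probabilities summed, rewards weighted by probability).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the three parallel vectors held by a transition.
struct SparseRow {
    std::vector<long> indices;
    std::vector<double> probabilities;
    std::vector<double> rewards;
};

// Builds a row sorted by state index and merges duplicate indices.
inline SparseRow build_sorted(const std::vector<long>& indices,
                              const std::vector<double>& probabilities,
                              const std::vector<double>& rewards) {
    assert(indices.size() == probabilities.size());
    assert(indices.size() == rewards.size());

    // std::map iterates keys in increasing order.
    // Value: (sum of p, sum of p * r) for that index.
    std::map<long, std::pair<double, double>> acc;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        auto& entry = acc[indices[i]];
        entry.first += probabilities[i];
        entry.second += probabilities[i] * rewards[i];
    }

    SparseRow row;
    for (const auto& kv : acc) {
        const double p = kv.second.first;
        row.indices.push_back(kv.first);
        row.probabilities.push_back(p);
        // Sketch choice: report reward 0 when the merged probability is 0.
        row.rewards.push_back(p != 0.0 ? kv.second.second / p : 0.0);
    }
    return row;
}
```

For example, calling `build_sorted` with indices `{3, 1, 3}` yields a row with the two indices 1 and 3, where the two entries for state 3 have been merged.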

◆ Transition() [2/3]

craam::Transition::Transition ( const indvec & indices,
const numvec & probabilities 
)
inline

Creates a single transition from raw data with uniformly zero rewards.

Because the transition indexes are stored in increasing order, this method must sort the indices and aggregate duplicates.

Parameters
indices          The indexes of states to transition to
probabilities    The probabilities of transitions

◆ Transition() [3/3]

craam::Transition::Transition ( const numvec & probabilities)
inline

Creates a single transition from raw data with uniformly zero rewards, where destination states are indexed automatically starting with 0.

Parameters
probabilities    The probabilities of transitions; indexes are implicit.

Member Function Documentation

◆ add_sample()

void craam::Transition::add_sample ( long  stateid,
prec_t  probability,
prec_t  reward 
)
inline

Adds a single transition probability to the existing probabilities.

If the transition to a state does not exist, then it is simply added to the list. If the transition to the desired state already exists, then the transition probability is added and the reward is updated as a weighted combination. Let \( p(s) \) and \( r(s) \) be the current transition probability and reward respectively. The updated transition probability and reward are:

  • Probability:

    \[ p'(s) = p(s) + p \]

  • Reward:

    \[ r'(s) = \frac{p(s) \, r(s) + p \, r}{p'(s)} \]

    Here, \( p \) is the argument probability and \( r \) is the argument reward.

When the function is called multiple times with \( p_1 \ldots p_n \) and \( r_1 \ldots r_n \) for a single \( s \) then:

  • Probability:

    \[ p'(s) = \sum_{i=1}^{n} p_i \]

  • Reward:

    \[ r'(s) = \frac{ \sum_{i=1}^{n} p_i \, r_i}{p'(s)} \]

Transition probabilities are not checked to sum to one.

Parameters
stateid        ID of the target state
probability    Probability of transitioning to this state
reward         The reward associated with the transition
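
The update rule above can be sketched as standalone code. This is only an illustration under stated assumptions, not the library's source: `SparseRow` and `add_sample_sketch` are hypothetical names, `double` stands in for `prec_t`, and leaving the reward unchanged when the new total probability is zero is a choice made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the three parallel vectors held by a transition.
struct SparseRow {
    std::vector<long> indices;          // kept sorted in increasing order
    std::vector<double> probabilities;
    std::vector<double> rewards;
};

// Adds probability p and reward r for state `stateid`, keeping indices sorted.
inline void add_sample_sketch(SparseRow& row, long stateid, double p, double r) {
    assert(stateid >= 0);

    // Binary search for the state; valid because indices are sorted.
    auto it = std::lower_bound(row.indices.begin(), row.indices.end(), stateid);
    const auto pos = it - row.indices.begin();

    if (it != row.indices.end() && *it == stateid) {
        // Existing transition: p'(s) = p(s) + p,
        // r'(s) = (p(s) r(s) + p r) / p'(s).
        const double p_old = row.probabilities[pos];
        const double p_new = p_old + p;
        if (p_new != 0.0) {
            row.rewards[pos] = (p_old * row.rewards[pos] + p * r) / p_new;
        }
        row.probabilities[pos] = p_new;
    } else {
        // New transition: insert at the sorted position in all three vectors.
        row.indices.insert(it, stateid);
        row.probabilities.insert(row.probabilities.begin() + pos, p);
        row.rewards.insert(row.rewards.begin() + pos, r);
    }
}
```

Inserting in the middle of a vector shifts every later element, which is why adding states in increasing ID order (always appending at the end) is the cheap case.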

◆ empty()

bool craam::Transition::empty ( ) const
inline

Checks if the transition is empty.

◆ get_indices()

const indvec& craam::Transition::get_indices ( ) const
inline

Indices with positive probabilities.

◆ get_probabilities()

const numvec& craam::Transition::get_probabilities ( ) const
inline

Returns list of positive probabilities for indexes returned by get_indices.

See also probabilities_vector.

◆ get_rewards()

const numvec& craam::Transition::get_rewards ( ) const
inline

Rewards for indices with positive probabilities returned by get_indices.

See also rewards_vector.

◆ is_normalized()

bool craam::Transition::is_normalized ( ) const
inline
Returns
Whether the transition probabilities sum to 1.

◆ max_index()

long craam::Transition::max_index ( ) const
inline

Returns the maximal index involved in the transition.

Returns -1 for an empty transition.

◆ normalize()

void craam::Transition::normalize ( )
inline

Normalizes the transition probabilities to sum to 1.

An exception is thrown if the probabilities sum to 0.
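
A minimal sketch of this behaviour on a plain vector follows. The names `normalize_sketch` and `is_normalized_sketch`, the exception type `std::invalid_argument`, and the tolerance value are all assumptions made for illustration; the page does not say which exception type or tolerance the library uses.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <stdexcept>
#include <vector>

// Divides every probability by the total so that the result sums to 1.
inline void normalize_sketch(std::vector<double>& probabilities) {
    const double sum =
        std::accumulate(probabilities.begin(), probabilities.end(), 0.0);
    if (sum == 0.0) {
        // Assumed exception type; a zero total cannot be rescaled.
        throw std::invalid_argument("probabilities sum to 0");
    }
    for (double& p : probabilities) {
        p /= sum;
    }
}

// Checks that the probabilities sum to 1 within a tolerance (assumed value).
inline bool is_normalized_sketch(const std::vector<double>& probabilities,
                                 double tolerance = 1e-5) {
    assert(tolerance >= 0.0);
    const double sum =
        std::accumulate(probabilities.begin(), probabilities.end(), 0.0);
    return std::abs(sum - 1.0) <= tolerance;
}
```

A tolerance is used in the check because floating-point sums rarely equal 1 exactly after division.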

◆ probabilities_addto() [1/2]

void craam::Transition::probabilities_addto ( prec_t  scale,
numvec & transition 
) const
inline

Scales transition probabilities according to the provided parameter and adds them to the provided vector.

This method ignores rewards.

Parameters
scale         Multiplicative modification of transition probabilities
transition    Transition probabilities being added to. This value is modified within the function.
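
The scale-and-add operation on a dense vector can be sketched as follows. `addto_dense_sketch` is a hypothetical free function written for illustration; it takes the sparse indices and probabilities as explicit arguments instead of reading them from a Transition, and it assumes the target vector is already long enough to hold every index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds scale * probabilities[i] to target[indices[i]] for every sparse entry.
inline void addto_dense_sketch(const std::vector<long>& indices,
                               const std::vector<double>& probabilities,
                               double scale,
                               std::vector<double>& target) {
    assert(indices.size() == probabilities.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        // Assumption of this sketch: the target covers every referenced state.
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < target.size());
        target[static_cast<std::size_t>(indices[i])] += scale * probabilities[i];
    }
}
```

Calling this repeatedly with weights that sum to 1 accumulates a convex combination of several sparse distributions into one dense vector.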

◆ probabilities_addto() [2/2]

void craam::Transition::probabilities_addto ( prec_t  scale,
Transition & transition 
) const
inline

Scales transition probabilities and rewards according to the provided parameter and adds them to the provided transition.

Parameters
scale         Multiplicative modification of transition probabilities
transition    Transition being added to. This value is modified within the function.

◆ probabilities_vector()

numvec craam::Transition::probabilities_vector ( size_t  size) const
inline

Constructs and returns a dense vector of probabilities, which includes 0 transition probabilities.

Parameters
size    Size of the constructed vector

◆ rewards_vector()

numvec craam::Transition::rewards_vector ( size_t  size) const
inline

Constructs and returns a dense vector of rewards, including entries with zero transition probability.

Rewards for indices with zero transition probability are zero.

Parameters
size    Size of the constructed vector
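
Both dense accessors follow the same sparse-to-dense expansion, which can be sketched as below. `to_dense_sketch` is a hypothetical helper, not part of CRAAM; it assumes that `size` exceeds every stored index and that each index appears at most once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expands sparse (index, value) pairs into a dense vector of length `size`.
// Positions that have no sparse entry stay at 0.
inline std::vector<double> to_dense_sketch(const std::vector<long>& indices,
                                           const std::vector<double>& values,
                                           std::size_t size) {
    assert(indices.size() == values.size());
    std::vector<double> dense(size, 0.0);
    for (std::size_t i = 0; i < indices.size(); ++i) {
        // Assumption of this sketch: size covers every stored index.
        assert(indices[i] >= 0 && static_cast<std::size_t>(indices[i]) < size);
        dense[static_cast<std::size_t>(indices[i])] = values[i];
    }
    return dense;
}
```

Passing the probabilities as `values` mirrors probabilities_vector, and passing the rewards mirrors rewards_vector.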

◆ size()

size_t craam::Transition::size ( ) const
inline

Returns the number of target states with non-zero transition probabilities.

◆ to_json()

string craam::Transition::to_json ( long  outcomeid = -1) const
inline

Returns a json representation of transition probabilities.

Parameters
outcomeid    Outcome ID to include in the output

◆ value() [1/2]

prec_t craam::Transition::value ( numvec const &  valuefunction,
prec_t  discount,
numvec  probabilities 
) const
inline

Computes value for the transition and a value function.

When there are no target states, the function terminates with an error.

Parameters
valuefunction    Value function, or an arbitrary vector of values
discount         Discount factor (this overload has no default value)
probabilities    Custom probability distribution. It must be of the same length as the number of nonzero transition probabilities. The length is NOT checked in a release build.

◆ value() [2/2]

prec_t craam::Transition::value ( numvec const &  valuefunction,
prec_t  discount = 1.0 
) const
inline

Computes value for the transition and a value function.

When there are no target states, the function terminates with an error.

Parameters
valuefunction    Value function, or an arbitrary vector of values
discount         Discount factor, optional (default value 1)
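
The page does not spell out the formula behind value(). A common definition, assumed here, is the expected one-step backup \( \sum_i p_i \, (r_i + \gamma \, v(s_i)) \), where \( \gamma \) is the discount and \( s_i \) the i-th target state. The sketch below implements that assumed formula as a hypothetical free function `value_sketch`; the exception type is also an assumption. Passing the stored probabilities corresponds to the two-argument overload, and passing a custom vector corresponds to the three-argument one.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Expected value of a sparse transition under the assumed formula
//   sum_i p_i * (r_i + discount * valuefunction[s_i]).
inline double value_sketch(const std::vector<long>& indices,
                           const std::vector<double>& probabilities,
                           const std::vector<double>& rewards,
                           const std::vector<double>& valuefunction,
                           double discount = 1.0) {
    if (indices.empty()) {
        // Assumed error type for a transition with no target states.
        throw std::invalid_argument("transition has no target states");
    }
    // Like the documented behaviour, lengths are only checked in debug builds.
    assert(probabilities.size() == indices.size());
    assert(rewards.size() == indices.size());

    double value = 0.0;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const std::size_t s = static_cast<std::size_t>(indices[i]);
        assert(s < valuefunction.size());
        value += probabilities[i] * (rewards[i] + discount * valuefunction[s]);
    }
    return value;
}
```

Only the states listed in `indices` are read from the value function, so the cost is proportional to the number of sparse entries and not to the size of the state space.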

The documentation for this class was generated from the following file: