|
| Transition (const indvec &indices, const numvec &probabilities, const numvec &rewards) |
| Creates a single transition from raw data.
|
|
| Transition (const indvec &indices, const numvec &probabilities) |
| Creates a single transition from raw data with uniformly zero rewards.
|
|
| Transition (const numvec &probabilities) |
| Creates a single transition from raw data with uniformly zero rewards, where destination states are indexed automatically starting with 0.
|
|
void | add_sample (long stateid, prec_t probability, prec_t reward) |
| Adds a single transition probability to the existing probabilities.
|
|
prec_t | sum_probabilities () const |
|
void | normalize () |
| Normalizes the transition probabilities to sum to 1.
|
|
bool | is_normalized () const |
|
prec_t | value (numvec const &valuefunction, prec_t discount, numvec probabilities) const |
| Computes the value of the transition for a given value function, using custom transition probabilities.
|
|
prec_t | value (numvec const &valuefunction, prec_t discount=1.0) const |
| Computes the value of the transition for a given value function.
|
|
prec_t | mean_reward (const numvec &probabilities) const |
| Computes the mean return from this transition with custom transition probabilities.
|
|
prec_t | mean_reward () const |
| Computes the mean return from this transition.
|
|
size_t | size () const |
| Returns the number of target states with non-zero transition probabilities.
|
|
bool | empty () const |
| Checks if the transition is empty.
|
|
long | max_index () const |
| Returns the maximum index involved in the transition.
|
|
void | probabilities_addto (prec_t scale, numvec &transition) const |
| Scales transition probabilities according to the provided parameter and adds them to the provided vector.
|
|
void | probabilities_addto (prec_t scale, Transition &transition) const |
| Scales transition probabilities and rewards according to the provided parameter and adds them to the provided transition.
|
|
numvec | probabilities_vector (size_t size) const |
| Constructs and returns a dense vector of probabilities, including entries for zero transition probabilities.
|
|
numvec | rewards_vector (size_t size) const |
| Constructs and returns a dense vector of rewards, including entries for states with zero transition probability.
|
|
const indvec & | get_indices () const |
| Indices with positive probabilities.
|
|
long | get_index (long k) |
| Index of the k-th state with non-zero probability.
|
|
const numvec & | get_probabilities () const |
| Returns the list of positive probabilities for the indices returned by get_indices.
|
|
const numvec & | get_rewards () const |
| Rewards for indices with positive probabilities returned by get_indices.
|
|
void | set_reward (long sampleid, prec_t reward) |
| Sets the reward for a transition to a particular state.
|
|
prec_t | get_reward (long sampleid) const |
| Gets the reward for a transition to a particular state.
|
|
string | to_json (long outcomeid=-1) const |
| Returns a JSON representation of transition probabilities.
|
|
Represents sparse transition probabilities and rewards from a single state.
The class can also be used to represent a generic sparse distribution.
The destination indices are kept sorted in increasing order as they are added. This makes it simpler to aggregate multiple transition probabilities and should also make value iteration more cache friendly. However, transitions should be added in increasing order of state ID to avoid excessive performance degradation.
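The cost of out-of-order insertion can be illustrated with a minimal sketch. The `SparseDist` struct below, together with its `insert` and `dense` members, is a hypothetical stand-in written for this illustration and is not the craam implementation. It assumes the sparse data is stored as parallel vectors sorted by state ID, so appending a new maximum ID shifts nothing, while inserting a smaller ID shifts every later entry.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a sparse distribution; NOT the craam implementation.
struct SparseDist {
    std::vector<long> indices;          // sorted, strictly increasing state IDs
    std::vector<double> probabilities;  // probabilities[k] belongs to indices[k]

    // Inserts (id, p) while keeping indices sorted. Returns how many existing
    // entries had to be shifted to make room (0 when id is the new maximum).
    std::size_t insert(long id, double p) {
        auto it = std::lower_bound(indices.begin(), indices.end(), id);
        const std::size_t pos = static_cast<std::size_t>(it - indices.begin());
        if (it != indices.end() && *it == id) {
            probabilities[pos] += p;  // existing destination: accumulate in place
            return 0;
        }
        const std::size_t shifted = indices.size() - pos;
        indices.insert(it, id);
        probabilities.insert(
            probabilities.begin() + static_cast<std::ptrdiff_t>(pos), p);
        return shifted;
    }

    // Expands to a dense vector of the given length; absent states get 0.
    // at() throws std::out_of_range if a stored ID does not fit in `size`.
    std::vector<double> dense(std::size_t size) const {
        std::vector<double> out(size, 0.0);
        for (std::size_t k = 0; k < indices.size(); ++k) {
            out.at(static_cast<std::size_t>(indices[k])) = probabilities[k];
        }
        return out;
    }
};
```

In this sketch, inserting IDs 0 and then 3 shifts nothing, whereas a subsequent insert of ID 1 has to move the entry for ID 3. Repeating that pattern over many samples is the kind of degradation the class description warns about.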
void craam::Transition::add_sample (long stateid, prec_t probability, prec_t reward)
inline
Adds a single transition probability to the existing probabilities.
If the transition to a state does not exist, it is simply added to the list. If the transition to the desired state already exists, the transition probability is added and the reward is updated as a weighted combination. Let \( p(s) \) and \( r(s) \) be the current transition probability and reward, respectively, and let \( p \) and \( r \) be the probability and reward passed to the function. The updated transition probability and reward are:
\[ p'(s) = p(s) + p \]
\[ r'(s) = \frac{r(s) \, p(s) + r \, p}{p'(s)} \]
When the function is called multiple times with \( p_1 \ldots p_n \) and \( r_1 \ldots r_n \) for a single \( s \) then:
\[ p'(s) = \sum_{i=1}^{n} p_i \]
\[ r'(s) = \frac{\sum_{i=1}^{n} p_i \, r_i}{p'(s)} \]
Transition probabilities are not checked to sum to one.
Parameters
stateid | ID of the target state |
probability | Probability of transitioning to this state |
reward | The reward associated with the transition |
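The merge behaviour described for add_sample can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: `SparseTransition` is a hypothetical struct, `double` stands in for `prec_t`, and the reward update assumes that the "weighted combination" weights each reward by its probability.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the documented add_sample semantics; NOT craam code.
struct SparseTransition {
    std::vector<long> indices;          // sorted destination state IDs
    std::vector<double> probabilities;  // probability per destination
    std::vector<double> rewards;        // reward per destination

    void add_sample(long stateid, double probability, double reward) {
        auto it = std::lower_bound(indices.begin(), indices.end(), stateid);
        const std::size_t pos = static_cast<std::size_t>(it - indices.begin());
        if (it != indices.end() && *it == stateid) {
            // Existing destination: add the probability and blend the reward,
            // weighting each reward by its probability (assumed weighting).
            const double p_old = probabilities[pos];
            const double p_new = p_old + probability;
            if (p_new != 0.0) {
                rewards[pos] =
                    (p_old * rewards[pos] + probability * reward) / p_new;
            }
            probabilities[pos] = p_new;
        } else {
            // New destination: insert at the sorted position.
            const auto off = static_cast<std::ptrdiff_t>(pos);
            indices.insert(it, stateid);
            probabilities.insert(probabilities.begin() + off, probability);
            rewards.insert(rewards.begin() + off, reward);
        }
        // As in the documented behaviour, no check that probabilities sum to 1.
    }
};
```

For example, adding state 2 with probability 0.25 and reward 4, then again with probability 0.75 and reward 8, leaves a single entry for state 2 with probability 1.0 and reward (0.25 * 4 + 0.75 * 8) / 1.0 = 7.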