A general robust Markov decision process.

#include <RMDP.hpp>

Public Types
typedef indvec	policy_det
	Decision-maker's policy: Which action to take in which state.

typedef vector< numvec >	policy_rand
	Nature's policy: Which outcome to take in which state.

Public Member Functions
	GRMDP (long state_count)
	Constructs the RMDP with a pre-allocated number of states.

	GRMDP ()
	Constructs an empty RMDP.

SType &	create_state (long stateid)
	Ensures that the MDP state exists, creating it if it does not.

SType &	create_state ()
	Creates a new state at the end of the states.

size_t	state_count () const
	Number of states.

size_t	size () const
	Number of states.

const SType &	get_state (long stateid) const
	Retrieves an existing state.

const SType &	operator[] (long stateid) const
	Retrieves an existing state.

SType &	get_state (long stateid)
	Retrieves an existing state.

SType &	operator[] (long stateid)
	Retrieves an existing state.

const vector< SType > &	get_states () const

bool	is_normalized () const
	Check if all transitions in the process sum to one.

void	normalize ()
	Normalize all transitions to sum to one for all states, actions, outcomes.

template<typename Policy >
long	is_policy_correct (const Policy &policies) const
	Checks if the policy and nature's policy are both correct.

void	to_csv (ostream &output, bool header=true) const
	Saves the model to a stream as a simple csv file.

void	to_csv_file (const string &filename, bool header=true) const
	Saves the transition probabilities and rewards to a CSV file.

string	to_string () const
	Returns a brief string representation of the RMDP.

string	to_json () const
	Returns a json representation of the RMDP.

Protected Attributes
vector< SType >	states
	Internal list of states.

Detailed Description

template<class SType>
class craam::GRMDP< SType >

A general robust Markov decision process.

Contains methods for constructing and solving RMDPs.

Some general assumptions (may depend on the state and action classes):

Transition probabilities must be non-negative but do not need to add up to a specific value
Transitions with 0 probabilities may be omitted, except there must be at least one target state in each transition
State with no actions: A terminal state with value 0
Action with no outcomes: Terminates with an error for uncertain models, but assumes 0 return for regular models.
Outcome with no target states: Terminates with an error
Invalid actions are ignored
Behavior for a state with all invalid actions is not defined

Template Parameters

SType Type of state. It determines s-rectangularity or s,a-rectangularity, as well as the type of the outcome and action constraints.

Member Typedef Documentation

◆ policy_det

template<class SType>

typedef indvec craam::GRMDP< SType >::policy_det

Decision-maker's policy: Which action to take in which state.

◆ policy_rand

template<class SType>

typedef vector<numvec> craam::GRMDP< SType >::policy_rand

Nature's policy: Which outcome to take in which state.

Constructor & Destructor Documentation

◆ GRMDP() [1/2]

template<class SType>

craam::GRMDP< SType >::GRMDP ( long state_count )

inline

Constructs the RMDP with a pre-allocated number of states.

All states are initially terminal.

Parameters

state_count The initial number of states, which dynamically increases as more transitions are added. All initial states are terminal.

◆ GRMDP() [2/2]

template<class SType>

craam::GRMDP< SType >::GRMDP ( )

inline

Constructs an empty RMDP.

Member Function Documentation

◆ create_state() [1/2]

template<class SType>

SType& craam::GRMDP< SType >::create_state ( long stateid )

inline

Ensures that the MDP state exists, creating it if it does not.

States with intermediate ids are also created.

Returns: The state with the given id, newly created if it did not already exist
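
The id-based behavior described here can be illustrated with a standalone sketch. `ToyState` and `ToyModel` are hypothetical stand-ins, not the craam classes; they only mimic the documented contract that requesting a state id also creates every missing state with a smaller id.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a state type; not part of craam.
struct ToyState {
    std::vector<int> actions;  // empty means the state is terminal
};

// Hypothetical container that mimics the documented create_state contract.
class ToyModel {
public:
    // Returns the state with the given id, creating it (and every missing
    // state with a smaller id) if it does not exist yet.
    ToyState& create_state(long stateid) {
        assert(stateid >= 0);
        const auto index = static_cast<std::size_t>(stateid);
        if (index >= states.size()) {
            states.resize(index + 1);
        }
        return states[index];
    }

    // Appends a new state after the last existing one.
    ToyState& create_state() {
        return create_state(static_cast<long>(states.size()));
    }

    std::size_t state_count() const { return states.size(); }

private:
    std::vector<ToyState> states;
};
```

In this sketch, calling `create_state(3)` on an empty model yields four terminal states with ids 0 through 3, and a later call with an existing id returns that state without changing the count.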

◆ create_state() [2/2]

template<class SType>

SType& craam::GRMDP< SType >::create_state ( )

inline

Creates a new state at the end of the states.

Returns: The new state

◆ get_states()

template<class SType>

const vector<SType>& craam::GRMDP< SType >::get_states ( ) const

inline

Returns: list of all states

◆ is_normalized()

template<class SType>

bool craam::GRMDP< SType >::is_normalized ( ) const

inline

Check if all transitions in the process sum to one.

Note that if there are no actions, or no outcomes for a state, the RMDP still may be normalized.

Returns: True if and only if all transitions are normalized.
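
A minimal sketch of such a check follows. The nested-vector layout `ToyProbs` and the tolerance value are assumptions made for illustration, not the craam data structures or constants. A state with no actions, or an action with no outcomes, contributes nothing to the loop, so the result stays true in those cases.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical layout: probs[state][action][outcome] holds the transition
// probabilities of one outcome. This is not the craam data structure.
using Transition = std::vector<double>;
using ToyProbs = std::vector<std::vector<std::vector<Transition>>>;

// True if and only if every stored transition sums to one within a tolerance.
// The tolerance value is an assumption made for this sketch.
bool is_normalized(const ToyProbs& probs, double tolerance = 1e-5) {
    for (const auto& state : probs) {
        for (const auto& action : state) {
            for (const auto& transition : action) {
                const double sum =
                    std::accumulate(transition.begin(), transition.end(), 0.0);
                if (std::abs(sum - 1.0) > tolerance) return false;
            }
        }
    }
    return true;  // also reached when there is nothing to check
}
```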

◆ is_policy_correct()

template<class SType>

template<typename Policy >

long craam::GRMDP< SType >::is_policy_correct ( const Policy & policies ) const

inline

Checks if the policy and nature's policy are both correct.

Action and outcome can be arbitrary for terminal states.

Template Parameters

Policy Type of the policy. Either a single policy for the standard MDP evaluation, or a pair of a deterministic policy and a randomized policy of nature.

Parameters

policies The policy (indvec) or the pair of the policy and the policy of nature (pair<indvec,vector<numvec> >). Nature's policy is typically randomized.

Returns: If incorrect, the function returns the first state with an incorrect action and outcome. Otherwise the function returns -1.
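
The return convention can be sketched for the deterministic-policy case alone. The function `first_incorrect_state` and its `action_counts` argument are hypothetical and only illustrate the rule stated here: terminal states accept any action, and the result is the first offending state or -1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check for a deterministic policy. action_counts[s] is the
// number of actions available in state s; policy[s] is the chosen action.
// Returns the first state whose action is out of range, or -1 if none is.
long first_incorrect_state(const std::vector<long>& action_counts,
                           const std::vector<long>& policy) {
    assert(action_counts.size() == policy.size());
    for (std::size_t s = 0; s < policy.size(); ++s) {
        if (action_counts[s] == 0) continue;  // terminal: any action accepted
        if (policy[s] < 0 || policy[s] >= action_counts[s]) {
            return static_cast<long>(s);
        }
    }
    return -1;
}
```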

◆ normalize()

template<class SType>

void craam::GRMDP< SType >::normalize ( )

inline

Normalize all transitions to sum to one for all states, actions, outcomes.
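
For a single transition, normalization amounts to dividing each probability by the total. The helper below is a hypothetical sketch, not the craam implementation; leaving an all-zero transition unchanged is an assumption, since this page does not specify that case.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Rescales one transition's probabilities in place so that they sum to one.
// Leaving an all-zero transition unchanged is an assumption of this sketch.
void normalize_transition(std::vector<double>& probabilities) {
    const double sum =
        std::accumulate(probabilities.begin(), probabilities.end(), 0.0);
    if (sum <= 0.0) return;
    for (double& p : probabilities) {
        p /= sum;
    }
}
```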

◆ to_csv()

template<class SType>

void craam::GRMDP< SType >::to_csv ( ostream & output, bool header = true ) const

inline

Saves the model to a stream as a simple csv file.

States, actions, and outcomes are identified by 0-based ids. Columns are separated by commas, and rows by new lines.

The file is formatted with the following columns: idstatefrom, idaction, idoutcome, idstateto, probability, reward

The exported and re-imported MDP may differ slightly, because no actions or transitions are exported for a state that has no actions. However, when there is data for action 1 and action 3, action 2 will be created with no outcomes.

Note that outcome distributions are not saved.

Parameters

output	Output stream
header	Whether the header should be written as the first line of the file
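
The column layout can be illustrated with a standalone writer. `ToyRow`, `rows_to_csv`, and `rows_to_csv_string` are hypothetical names used only in this sketch, and the exact header text emitted by craam is assumed to match the column names listed above.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical flat record for one transition; not a craam type.
struct ToyRow {
    long idstatefrom, idaction, idoutcome, idstateto;
    double probability, reward;
};

// Writes the rows in the documented column order, optionally with a header.
void rows_to_csv(std::ostream& output, const std::vector<ToyRow>& rows,
                 bool header = true) {
    if (header) {
        output << "idstatefrom,idaction,idoutcome,idstateto,probability,reward\n";
    }
    for (const ToyRow& r : rows) {
        output << r.idstatefrom << ',' << r.idaction << ',' << r.idoutcome << ','
               << r.idstateto << ',' << r.probability << ',' << r.reward << '\n';
    }
}

// Convenience wrapper that returns the CSV text as a string.
std::string rows_to_csv_string(const std::vector<ToyRow>& rows,
                               bool header = true) {
    std::ostringstream out;
    rows_to_csv(out, rows, header);
    return out.str();
}
```

Writing to a generic `std::ostream` keeps the same routine usable for files, strings, and standard output.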

◆ to_csv_file()

template<class SType>

void craam::GRMDP< SType >::to_csv_file ( const string & filename, bool header = true ) const

inline

Saves the transition probabilities and rewards to a CSV file.

Parameters

filename	Name of the file
header	Whether to create a header of the file too

◆ to_json()

template<class SType>

string craam::GRMDP< SType >::to_json ( ) const

inline

Returns a json representation of the RMDP.

This method is mostly suitable for analyzing small RMDPs.

◆ to_string()

template<class SType>

string craam::GRMDP< SType >::to_string ( ) const

inline

Returns a brief string representation of the RMDP.

This method is mostly suitable for analyzing small RMDPs.

The documentation for this class was generated from the following file:

craam/RMDP.hpp
