In distributed algorithms, communication is generally described in terms of message passing, either point-to-point, or using more elaborate schemes like broadcast and multicast. Processes "send" and "receive" messages which are transmitted reliably (or not), in order (or not) with bounded delay (or not). This description of communication in terms of messages is perfectly sound with respect to the reality of distributed computing: sending (or broadcasting) and receiving messages are the actual primitives used when distributed applications are programmed.
However, this corresponds to a very operational view of communication: what is described is what processes actually do to communicate. Operational descriptions are never good when it comes to program verification, where a lot of logical reasoning is involved. We prefer to rely on more abstract and more logical descriptions. Ideally, we don't want to describe what processes do to communicate, but rather the communication itself, i.e., what result from processes communication actions.
Moreover, we would like an abstract description of communication to be easy to use in proofs and logical reasoning. This requires the abstraction to be logically expressed and well integrated with the other parts of the logic used in correctness proofs (for us, UNITY logic).
Finally, it should be noticed than sending and receiving messages not only involves communication (transfer of data amongst processes) but also synchronization. For instance, sending and receiving a nonvalued token is a way for processes to synchronize their actions. No actual data from the state of one process is transfered to another process.
More or less, this is what our observation relation aims to be: an abstraction of communication, of communication only (not of synchronization), consistent with distributed architectures and well integrated with UNITY language, logic and proof system.
We introduce the observation relation in two different ways. Firstly, observation can be a logical specification of communication. It can then be considered as a temporal operator that extends UNITY temporal logic. It can be mixed with other logical operators and be part of proofs. Secondly, observation can be a communication mechanism. It is then part of the operational description of an algorithm and can be considered as a new statement in the UNITY language. In the first case, we speak of observation relation; in the second case, we speak of observation mechanism.
Having observation both in the language and in the logic is supposed to help derive observation relations (logical facts about the communication involved in an algorithm) from the operational description of an algorithm in which communication is described in terms of the observation mechanism. Then, the resulting observation relations can be mixed with other UNITY properties (other logical facts about the algorithm) to deduce the desired correctness properties.
The observation relation was defined as a reflexive, transitive and antisymmetric relation between state expressions. For expressions (of any type) x and y, it is denoted by xy. Intuitively, it means that values of x are past values of y, those values being taken in the same order by both expressions. Also, some values of y may never be taken by x, but x eventually takes more and more recent values of y. Finally, the delay between a change in y value and the corresponding (if any) change in x value in unbounded. Expression x is called the image of the relation; expression y is called the source.
Formally... Well, I won't write the formal definition of observation, because:
Let's just say that, if x is located on one process and y is located on another, then the first process gets information from the second. This information may be outdated, but it doesn't go back in time (w.r.t. information previously acquired) and it is guaranteed to eventually be updated to more recent information, even if there is no bound on how old the information may be. In other words, the two processes communicate, but the way this communication is actually achieved is not described.
This form of communication is very easy to implement with message passing, which we interpret as observation being compatible with distributed architectures. Therefore, one can always (well, under some conditions) assume that a given observation relation is true in a program. The relation then becomes part of the program description. This what we call observation mechanism. Observation, in this case, is a kind of new statement in UNITY language, which requires UNITY operational semantics to be modified. Once this is done, processes can use observation mechanism to communicate, instead of sending and receiving messages.
The nice thing about observation, is that it is a temporal operator, which can be mixed with other operators in specifications and proofs. For instance, as an example of a relationship between UNITY operators and observation:
In that example, a liveness UNITY property is deduced from the conjunction of a safety UNITY property and an observation. This is made possible by the liveness part ("more and more recent values") that is included in observation definition.
p unless q /\ q p => p leads-to q
Being able to use a unique and logical description of communication in algorithms, specifications and proofs actually helps in simplifying proofs themselves. Can we then conclude that observation is a complete success?
Well, not exactly. Things will work fine in proofs only if algorithms use the observation mechanism to communicate. Is it always possible to translate message passing communication into observation based communication? Technically, yes. However, the resulting description sometimes is not simpler than the original. In some cases, you cannot get rid of the FIFO sequences that are responsible for so much proof complexity when message based descriptions are used. When this is true, most of the benefit of using observation is lost (but not all, see below).
From our experience, observation seems to work well with monotonic variables. Actually, we have very few examples of observation based algorithms using nonmonotonic variables. (One of this few examples is a handshake based on mutual observations of two alternating single bits. This example has been extensively studied, extended and developed by Mamoun Filali. Surely, he can still provide you with 50+ page technical reports entirely dedicated to it...)
If you are looking for a philosophic reason why observation works better with monotonic variables, I can suggest one (but the rest of the group did not really follow me on that one): monotonic expressions implicitly contain their own causality. Observation, like message passing, does no transfer causality (precedence relationships between events located on different processes). If needed, causality can be explicitly formulated (using, for instance, logical clocks). But these formulations are quite operational and tend to kill the abstraction of observation. Unfortunately, we couldn't find a suitable abstraction for causal relationships. With monotonic variables, we don't need one, and things work fine.
Fortunately, distributed computing involves a lot of monotonic variables, such as clocks and event counters. Also, you can always replace a nonmonotonic expression by a difference between two monotonic expressions. For instance, a channel between two processes (nonmonotonic) can be replaced by the difference between the two histories (one on each process) of sent and received messages (monotonic).
This idea has been used by Paolo Sivilotti in his PhD. With his advisor Mani Chandy, they used a representation of communication based on a relationship between (monotonic) histories of message sendings and receivings. The relationship they defined, called follows, looks very much like a kind of observation, except that it is defined only for monotonic variables (which allows them to get a much nicer definition). In this case, communication is still described in terms of message passing, but in a more abstract and logical way. When I was in Caltech, working on composition, I also used (a modified version of) follows to describe communication between components.
For the record, I defended my PhD in November 1997, Paolo submitted his in December 1997, and we have never met. However, we both spent time in Caltech, working with Mani Chandy. We even had the same office (Jorg. 267)!
For more information on our observation relation, you van have a look at some papers.