Files for "A Modular Framework for Verifying Versatile Distributed Systems"

Original files for 4PAD 2018, expanded for JLAMP submission.

The example of the paper

See below for its full description.

Physical micro-models

Other micro-models

Other examples


A TLA+ installation (the TLA tools or the TLA toolbox).

Description of the example

Let us consider a conference reviewing system. The peers are the authors, the chairs of the program committee and the reviewers. Authors send their papers to all the PC chairs (multicast to the chairs). An author can submit only one paper (voting micromodel with bound 1 on the submission channel). Each chairperson attributes a paper number and takes responsibility for a part of the papers, based on this number. In order that all chairs attribute the same number to a given paper, and without internal coordination between the PC chairs, the authors use a totally ordered multicast so that the papers are delivered in the same order to all the chairs (fifo n-1 ordering model). After the deadline has passed, the chairs reject new submissions (point-to-point communication to the author). After the deadline, each chair independently sends its papers to some of the reviewers (bounded multicast to the reviewers), waits for the reviews (convergecast from the reviewers to the chair in charge), and sends the acceptance result to the author (point-to-point). The system must ensure that it does not deadlock and that every author eventually receives an answer (either rejection for a late paper, or acceptance result if reviewed). This system exposes both strict ordering constraints (submissions sent to the chairs), and high interleaving (each reviewer is independently handling the papers it has received). The system has been described in the PlusCal Algorithm Language, which is translated by the TLA+ tools to a TLA+ specification. It can then be checked with the TLC model checker. Results have shown that the message cap on the number of messages in transit is instrumental to avoid state explosion as it ensures that messages are not delayed for too long. During the development of the system, several bugs were found. For instance, the logic to split the papers among the chairs was faulty with an odd number of chairs and some authors were never receiving their acceptance result; in some cases, the same paper was sent twice to the reviewers, and an unfortunate (but legal) interleaving in the reception of the reviews led to two acceptance messages to the same author. This system, albeit simple, already experiences enough communication interactions to warrant formal verification.