Optimistic Recovery in Distributed Systems

Middleware and Datastores Accomplishment | 1985

IBM researchers: Robert E. Strom, Shaula Yemini

Where the work was done: IBM T.J. Watson Research Center

What we accomplished: From the paper abstract: In optimistic recovery communication, computation and checkpointing proceed asynchronously. Synchronization is replaced by causal dependency tracking, which enables a posteriori reconstruction of a consistent distributed system state following a failure using process rollback and message replay. Because there is no synchronization among computation, communication, and checkpointing, optimistic recovery can tolerate the failure of an arbitrary number of processors and yields better throughput and response time than other general recovery techniques whenever failures are infrequent.
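
To make the idea of causal dependency tracking more concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the algorithm as specified in the paper: the names Process, find_orphans, and demo are hypothetical, each delivered message is assumed to start a new "state interval", and message logging, incarnation numbers, and the replay step itself are left out. Each message carries a copy of the sender's dependency vector. After a failure, any process whose state depends on an interval the failed process cannot recover is treated as an orphan and rolled back.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical process model: one state interval per delivered message."""
    name: str
    interval: int = 0
    # Dependency vector: process name -> latest interval of that process
    # on which this process's current state causally depends.
    deps: dict = field(default_factory=dict)
    # Dependency vector recorded at the start of each interval (for rollback).
    history: list = field(default_factory=list)

    def __post_init__(self):
        self.deps = {self.name: 0}
        self.history = [dict(self.deps)]

    def send(self):
        # Piggyback a copy of the current dependency vector on the message.
        return dict(self.deps)

    def receive(self, piggybacked):
        # Delivering a message starts a new state interval and merges the
        # sender's dependencies into our own (element-wise maximum).
        self.interval += 1
        for proc, idx in piggybacked.items():
            self.deps[proc] = max(self.deps.get(proc, 0), idx)
        self.deps[self.name] = self.interval
        self.history.append(dict(self.deps))

    def is_orphan(self, failed, last_recoverable):
        # Orphan: depends on an interval of the failed process that was lost.
        return self.deps.get(failed, 0) > last_recoverable

    def roll_back(self, failed, last_recoverable):
        # Discard intervals until no dependency on a lost interval remains.
        while self.history[-1].get(failed, 0) > last_recoverable:
            self.history.pop()
        self.interval = len(self.history) - 1
        self.deps = dict(self.history[-1])


def find_orphans(processes, failed, last_recoverable):
    """Return the names of processes that must roll back, sorted."""
    return sorted(p.name for p in processes
                  if p.name != failed and p.is_orphan(failed, last_recoverable))


def demo():
    a, b, c, d = (Process(n) for n in "ABCD")
    a.receive({})            # A enters interval 1 (external input)
    d.receive(a.send())      # D depends on A's interval 1
    a.receive({})            # A enters interval 2
    b.receive(a.send())      # B depends on A's interval 2
    c.receive(b.send())      # C depends on it transitively through B
    # Assume A crashes and only its interval 1 can be reconstructed.
    orphans = find_orphans([a, b, c, d], failed="A", last_recoverable=1)
    for p in (b, c, d):
        p.roll_back("A", 1)
    return orphans, b.interval, c.interval, d.interval
```

In this scenario demo() reports B and C as orphans and rolls both back to interval 0, while D keeps its interval 1 because it depends only on a state of A that survives the failure. No process waited on any other during normal operation; the cost of coordination is paid only when a failure actually occurs.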

Related links: Optimistic Recovery in Distributed Systems (ACM Transactions on Computer Systems)
