EpTO: An epidemic total order algorithm for large-scale distributed systems

The ordering of events is a fundamental problem of distributed computing and has been extensively studied over several decades. From all the available orderings, total ordering is of particular interest as it provides a powerful abstraction for building reliable distributed applications. Unfortunate...

Full description

Bibliographic Details
Main Author: Matos, Miguel Ângelo Marques (author)
Other Authors: Mercier, Hugues (author), Felber, Pascal (author), Oliveira, Rui Carlos Mendes de (author), Pereira, José (author)
Format: conferencePaper
Language:eng
Published: 2015
Subjects:
Online Access:http://hdl.handle.net/1822/52880
Country:Portugal
Oai:oai:repositorium.sdum.uminho.pt:1822/52880
Description
Summary:The ordering of events is a fundamental problem of distributed computing and has been extensively studied over several decades. From all the available orderings, total ordering is of particular interest as it provides a powerful abstraction for building reliable distributed applications. Unfortunately, deterministic total order algorithms scale poorly and are therefore unfit for modern large-scale applications. The main contribution of this paper is EPTO, a total order algorithm with probabilistic agreement that scales both in the number of processes and events. EPTO provides deterministic safety and probabilistic liveness: integrity, total order and validity are always preserved, while agreement is achieved with arbitrarily high probability. We show that EPTO is well-suited for large-scale dynamic distributed systems: it does not require a global clock nor synchronized processes, and it is highly robust even when the network suffers from large delays and significant churn and message loss.