EpTO: An epidemic total order algorithm for large-scale distributed systems

The ordering of events is a fundamental problem of distributed computing and has been extensively studied over several decades. From all the available orderings, total ordering is of particular interest as it provides a powerful abstraction for building reliable distributed applications. Unfortunate...

ver descrição completa

Detalhes bibliográficos
Autor principal: Matos, Miguel Ângelo Marques (author)
Outros Autores: Mercier, Hugues (author), Felber, Pascal (author), Oliveira, Rui Carlos Mendes de (author), Pereira, José (author)
Formato: conferencePaper
Idioma:eng
Publicado em: 2015
Assuntos:
Texto completo:http://hdl.handle.net/1822/52880
País:Portugal
Oai:oai:repositorium.sdum.uminho.pt:1822/52880
Descrição
Resumo:The ordering of events is a fundamental problem of distributed computing and has been extensively studied over several decades. From all the available orderings, total ordering is of particular interest as it provides a powerful abstraction for building reliable distributed applications. Unfortunately, deterministic total order algorithms scale poorly and are therefore unfit for modern large-scale applications. The main contribution of this paper is EPTO, a total order algorithm with probabilistic agreement that scales both in the number of processes and events. EPTO provides deterministic safety and probabilistic liveness: integrity, total order and validity are always preserved, while agreement is achieved with arbitrarily high probability. We show that EPTO is well-suited for large-scale dynamic distributed systems: it does not require a global clock nor synchronized processes, and it is highly robust even when the network suffers from large delays and significant churn and message loss.