March 11, 2015 at 03:05PM
"at best you [can] provide a response latency no less than the round-trip time inside that system" #readingToday  

All of these different coordination systems—including Paxos, Viewstamped Replication, Zab/Zookeeper, and Raft—provide ways of defining an ordering of events across a distributed system even though physical time cannot safely be used for that purpose. These protocols, though, absolutely can be used for that purpose: to provide a unified timeline across a system. You can think of coordination as providing a logical surrogate for "now." When used in that way, however, these protocols have a cost, resulting from something they all fundamentally have in common: constant communication. For example, if you coordinate an ordering for all of the things that happen in your distributed system, then at best you are able to provide a response latency no less than the round-trip time (two sequential message deliveries) inside that system.

Depending on the details of your coordination system, there may be similar bounds on throughput as well. The designer of a system with aggressive performance demands may wish to do things right but find the cost of constant coordination to be prohibitive. This is especially the case when hosts or networks are expected to fail often, as many coordination systems provide only "eventual liveness" and can make progress only during times of minimal trouble. Even in those rare times when everything is working perfectly, however, the communication cost of constant coordination might simply be too high.

Distributed Computing · Download PDF version of this article. March 10, 2015. Volume 13, issue 3. There is No Now. Problems with simultaneity in distributed systems. Justin Sheehy. "Now." The time elapsed between when I wrote that word and when you read it was at least a couple of weeks.