Review of "Spanner: Google’s Globally-Distributed Database"

14 Oct 2015

Systems like Bigtable have drawn complaints that they are difficult to use for applications with complex, evolving schemas, or for those that require strong consistency in the presence of wide-area replication. Spanner is a database that shards data across many sets of Paxos state machines in data centers spread across the world. It automatically reshards data across machines as the amount of data or the number of servers changes.

Spanner provides externally consistent reads and writes, and globally consistent reads across the database at a timestamp. Both are enabled by assigning globally meaningful commit timestamps to distributed transactions. The key enabler of these timestamps is a new TrueTime API and its implementation. The API directly exposes clock uncertainty, and the guarantees on Spanner's timestamps depend on the bounds that the implementation provides. If the uncertainty is large, Spanner slows down to wait out the uncertainty. The uncertainty is typically less than 10ms, because the implementation uses GPS and atomic clocks as references.
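The idea of waiting out the uncertainty can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Spanner's implementation: the uncertainty bound epsilon is fixed here (in the real system it varies over time), and the commit_wait helper and its injectable clock and sleep parameters are hypothetical names introduced for this example. The method names now, after and before follow the TrueTime API as the paper describes it.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """An interval that is assumed to contain the true absolute time."""
    earliest: float
    latest: float


class TrueTime:
    """Toy TrueTime: a local clock plus a fixed uncertainty bound (assumption)."""

    def __init__(self, epsilon=0.005, clock=time.time):
        self.epsilon = epsilon  # assumed constant uncertainty, in seconds
        self.clock = clock      # injectable so the sketch can be tested

    def now(self):
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        """True only if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t):
        """True only if t has definitely not arrived yet."""
        return self.now().latest < t


def commit_wait(tt, apply_write, sleep=time.sleep):
    """Pick a commit timestamp, then wait until it is definitely in the past.

    Hypothetical helper: apply_write stands in for making the write visible.
    """
    s = tt.now().latest          # timestamp no earlier than the true time
    while not tt.after(s):       # wait out the uncertainty (about 2 * epsilon)
        sleep(tt.epsilon / 4)
    apply_write(s)               # only now may clients observe the write
    return s


if __name__ == "__main__":
    tt = TrueTime(epsilon=0.005)
    ts = commit_wait(tt, lambda s: print("committed at", s))
```

The wait costs roughly twice the uncertainty bound per commit, which is why a larger uncertainty directly slows Spanner down and why keeping it under 10ms matters.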

A Spanner deployment is called a universe. A universe is organized as a set of zones, each of which is a unit of administrative deployment. Each zone has one zonemaster and between one hundred and several thousand spanservers. The zonemaster assigns data to spanservers, and the spanservers serve data to clients. A universe also has a universemaster and a placement driver; the latter moves data across zones.

Data is organized into directories, or more accurately buckets. A directory is a set of contiguous keys that share a common prefix, and it is the unit of data placement. Spanner supports a SQL-like query language with some extensions to support protocol-buffer-valued fields. Every table is required to have an ordered set of one or more primary-key columns.
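To make the notion of a directory concrete, here is a small Python sketch that groups rows into buckets of contiguous keys sharing a prefix. It is a toy model, not Spanner's storage layout: the user and album keys, the split_into_directories function, and the choice of a one-column prefix are all assumptions made for illustration.

```python
from itertools import groupby


def split_into_directories(rows, prefix_len=1):
    """Group primary-key tuples into directories by their leading columns.

    rows maps a primary-key tuple to a value. Sorting the keys makes all
    keys with the same prefix contiguous, so each group is one key range.
    """
    ordered_keys = sorted(rows)
    return {
        prefix: list(keys)
        for prefix, keys in groupby(ordered_keys, key=lambda k: k[:prefix_len])
    }


# Hypothetical data: user rows and album rows whose keys start with the user id.
rows = {
    ("user2", "album1"): "beach",
    ("user1", "album2"): "city",
    ("user1",): "alice",
    ("user2",): "bob",
    ("user1", "album1"): "hills",
}

directories = split_into_directories(rows)
for prefix, keys in directories.items():
    print(prefix, keys)
```

Because each directory is a contiguous key range, moving one directory to another zone moves a user's row together with all of that user's albums, which is what makes it a convenient unit of placement.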

Will the paper be influential in 10 years? It proposes a very creative way to give the whole cluster a consistent view of the current time, and uses it to achieve transactional consistency. The guarantees and rich features it provides also make it possible for other systems like F1 to build on top of it.