Cassandra Architecture

High Level Features

Dataset partitioning using consistent hashing
Multi-master replication
Tuneable levels of replication and consistency
Distributed cluster management
Distributed failure detection
Incremental horizontal scaling on commodity hardware

Cassandra partitions (shards) data across nodes using consistent hashing. In naive data hashing, a key is assigned to a bucket by taking its hash modulo the number of buckets. Cassandra instead first hashes each node to one or more values on a continuous hash ring; these per-node hash values are called tokens. Once tokens are assigned, Cassandra maps data onto the same ring: for each row it receives, it hashes the partition key (the first component of the primary key) and places the resulting value on the ring. Finally, Cassandra assigns the row to a node by walking clockwise from that position to the first token it encounters; the node that owns that token stores the row.
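The mapping described above can be sketched in a few lines of Python. This is a toy model rather than Cassandra's implementation: the names TokenRing, ring_position, and owner_of_position are hypothetical, MD5 is used only because it ships with the standard library (Cassandra's default partitioner is Murmur3-based), and the ring size of 2**64 is an assumption of the sketch.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Hash a string to a position on a 2**64 ring (MD5 is illustrative only)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class TokenRing:
    """Toy hash ring: each node gets one or more tokens; keys are mapped clockwise."""

    def __init__(self, nodes, tokens_per_node=1):
        # Hash each node to one or more tokens and keep them sorted by ring position.
        pairs = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self.tokens = [token for token, _ in pairs]
        self.owners = [node for _, node in pairs]

    def owner_of_position(self, position: int) -> str:
        # First token at or after the position, wrapping around to the start of the ring.
        index = bisect.bisect_left(self.tokens, position)
        return self.owners[index % len(self.tokens)]

    def owner(self, partition_key: str) -> str:
        # Hash the partition key onto the ring, then walk clockwise to a token.
        return self.owner_of_position(ring_position(partition_key))


ring = TokenRing(["node-a", "node-b", "node-c"], tokens_per_node=4)
print(ring.owner("user:42"))  # prints the name of one of the three nodes
```

Keeping the tokens in a sorted list makes the clockwise walk a binary search, and the modulo on the index handles keys that hash past the last token by wrapping to the first one.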

The main difference between consistent hashing and naive data hashing is that when the number of nodes (buckets) changes, consistent hashing has to move only a small fraction of the keys.
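A small experiment can make this difference concrete. The sketch below is an illustration under stated assumptions, not a benchmark of Cassandra: the helper names (h, build_ring, ring_owner, moved_fraction), the node names, the key set, and the choice of 64 tokens per node are all hypothetical. It grows a cluster from four nodes to five and counts how many keys change owner under each scheme.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Illustrative 64-bit hash built from MD5."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def build_ring(nodes, tokens_per_node=64):
    """Return parallel sorted lists of tokens and the nodes that own them."""
    pairs = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(tokens_per_node))
    return [t for t, _ in pairs], [n for _, n in pairs]


def ring_owner(ring, key):
    """Owner of a key: first token clockwise from the key's hash."""
    tokens, owners = ring
    return owners[bisect.bisect_left(tokens, h(key)) % len(tokens)]


def moved_fraction(before, after, keys):
    """Fraction of keys whose owner differs between two placement functions."""
    return sum(before(k) != after(k) for k in keys) / len(keys)


keys = [f"key-{i}" for i in range(10_000)]
old_nodes = ["n1", "n2", "n3", "n4"]
new_nodes = old_nodes + ["n5"]

# Naive scheme: owner is hash modulo the number of nodes.
modulo_moved = moved_fraction(
    lambda k: old_nodes[h(k) % len(old_nodes)],
    lambda k: new_nodes[h(k) % len(new_nodes)],
    keys,
)

# Consistent hashing: existing tokens stay put, the new node only adds tokens.
old_ring, new_ring = build_ring(old_nodes), build_ring(new_nodes)
ring_moved = moved_fraction(
    lambda k: ring_owner(old_ring, k),
    lambda k: ring_owner(new_ring, k),
    keys,
)

print(f"modulo: {modulo_moved:.0%} moved, ring: {ring_moved:.0%} moved")
```

With uniformly distributed hashes, going from 4 to 5 buckets under the modulo scheme would be expected to relocate about four fifths of the keys, while the ring should relocate only the share taken over by the new node, roughly one fifth here. Every key that moves on the ring moves to the new node, because existing tokens never change position.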

Using consistent hashing for partitioning helps make Cassandra a scalable and available column-family store. Other features, such as virtual nodes, quorums, and compaction, work alongside the hashing scheme to mitigate potential issues with consistency.

Multi-master Replication: Versioned Data and Tuneable Consistency

Cassandra replicates every partition of data to multiple nodes across the cluster to maintain high availability and durability. When a mutation occurs, the coordinator hashes the partition key to determine the token range the data belongs to, and then replicates the mutation to the replicas of that data according to the replication strategy, which determines which physical nodes act as replicas for a given token range.

NetworkTopologyStrategy
SimpleStrategy
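To illustrate how a replication strategy turns a ring position into a set of replicas, here is a minimal sketch of topology-unaware placement in the spirit of SimpleStrategy: starting from the token that owns the key, it walks clockwise and collects distinct nodes until the replication factor is reached. The function name replicas_for and the sample tokens are hypothetical, and the real strategies are more involved; NetworkTopologyStrategy, for example, also takes datacenters and racks into account, which this sketch ignores.

```python
import bisect


def replicas_for(position, tokens, owners, replication_factor):
    """Walk clockwise from position, collecting distinct nodes as replicas.

    tokens must be sorted; owners[i] is the node that owns tokens[i].
    """
    start = bisect.bisect_left(tokens, position)
    replicas = []
    for step in range(len(tokens)):
        node = owners[(start + step) % len(tokens)]
        # A node may own several tokens; count it only once.
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


# Hypothetical ring: three nodes, two tokens each.
tokens = [10, 20, 30, 40, 50, 60]
owners = ["a", "b", "c", "a", "b", "c"]
print(replicas_for(35, tokens, owners, 2))  # ['a', 'b']
```

If the requested replication factor exceeds the number of distinct nodes, the sketch simply returns every node once.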

Data Versioning - LWW (Last Write Wins)

Cassandra uses mutation timestamp versioning to guarantee eventual consistency of data. Specifically, every mutation enters the system with a timestamp, provided either by a client clock or, absent a client-provided timestamp, by the coordinator node’s clock. Conflicting updates resolve according to the last-write-wins rule. Cassandra’s correctness therefore depends on these clocks, so make sure a proper time synchronisation process, such as NTP, is running.
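The last-write-wins rule can be sketched as a merge function over timestamped values. This is a simplified model, not Cassandra's reconciliation code: the Cell type and resolve function are hypothetical names, timestamps are modelled as integer microseconds, and the tie-break on equal timestamps (prefer the greater value) is an assumption chosen only to make the sketch deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp_micros: int  # supplied by the client or by the coordinator


def resolve(a: Cell, b: Cell) -> Cell:
    """Return the winning version of a cell under last write wins."""
    if a.timestamp_micros != b.timestamp_micros:
        return a if a.timestamp_micros > b.timestamp_micros else b
    # Tie-break (assumption of this sketch): prefer the greater value so the
    # result does not depend on the order in which replicas are compared.
    return a if a.value >= b.value else b


old = Cell("alice@old.example", 1_700_000_000_000_000)
new = Cell("alice@new.example", 1_700_000_000_000_500)
print(resolve(old, new).value)  # alice@new.example
```

Because only the timestamps are compared, a write issued later in real time but stamped by a lagging clock loses to an earlier write with a larger timestamp, which is why clock synchronisation matters.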