Apache Cassandra is a distributed NoSQL database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It offers multi-datacenter clustering with asynchronous masterless replication and near-linear scalability. [1]
The NoSQL design favors scalability and high availability over the ACID (Atomicity, Consistency, Isolation, Durability) [2] guarantees offered by more traditional RDBMS solutions. Cassandra follows the BASE (Basically Available, Soft state, Eventual consistency) [3] principles, which places it on the side of the CAP theorem [4] triangle between "available" and "partition tolerant", though such classifications are mostly meant to help answer the question "what is the default behaviour of the distributed system when a partition happens".
A Cassandra cluster is called a ring. Each node hosts multiple virtual nodes (vnodes), and each vnode is responsible for a single contiguous range of rows identified by token values (a token is a hash of the row key). Cassandra is a peer-to-peer system where data is distributed among all nodes in the ring. Each node exchanges state information with its peers every second using a gossip protocol. A partitioner determines how to distribute the data across the nodes in the cluster and which node to place the first copy of the data on.
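To make the ring idea more concrete, here is a minimal Python sketch of how a key could be mapped to nodes through tokens and vnodes. It is an illustration only, not Cassandra's code: MD5 stands in for the Murmur3 hash, vnode tokens are derived from made-up labels instead of being assigned randomly, and the `Ring` class and its method names are hypothetical.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a signed 64-bit token (MD5 is a stand-in for Murmur3)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class Ring:
    """Toy token ring: every node owns several vnode tokens."""

    def __init__(self, nodes, vnodes_per_node=8):
        self._owner = {}  # token -> node that owns the range ending at it
        for node in nodes:
            for i in range(vnodes_per_node):
                # Assumption: tokens come from a label; Cassandra picks them
                # randomly (or from configuration) instead.
                self._owner[token_for(f"{node}#vnode{i}")] = node
        self._tokens = sorted(self._owner)

    def replicas(self, key, replication_factor=1):
        """Walk clockwise from the key's token and collect distinct nodes."""
        start = bisect.bisect_left(self._tokens, token_for(key))
        found = []
        for step in range(len(self._tokens)):
            token = self._tokens[(start + step) % len(self._tokens)]
            node = self._owner[token]
            if node not in found:
                found.append(node)
            if len(found) == replication_factor:
                break
        return found


ring = Ring(["10.176.64.41", "10.176.65.71"])
print(ring.replicas("user:1"))                        # node holding the first copy
print(ring.replicas("user:1", replication_factor=2))  # first copy plus one replica
```

The first node returned plays the role of the node that receives the first copy of the data; the rest of the list roughly mimics what a simple replication strategy would do by continuing around the ring.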
When a client sends a write request, it can connect to any node in the ring. That node is called the coordinator node. The coordinator delegates the write request to the StorageProxy service, which determines which nodes are responsible for that data. It identifies the nodes using a mechanism called a Snitch, which defines the groups of machines that the replication strategy uses to place replicas of the data. Once the replica nodes are identified, the coordinator node sends a RowMutation message to them and waits for confirmation that the data was written. It only waits for some of the nodes to confirm, based on a pre-configured consistency level. If the nodes are in multiple datacenters, the message is sent to one replica in each datacenter with a special header telling it to forward the request to the other nodes in that datacenter. The nodes that receive the RowMutation message first append it to the commit log and then apply it to a MemTable; eventually the MemTable is flushed to disk in a structure called an SSTable. Periodically the SSTables are merged in a process called compaction.
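The write path above can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not the real implementation: the `Replica` class, `required_acks` and `coordinate_write` are hypothetical names, "disk" is just a Python list, the coordinator calls replicas synchronously, and only the ONE, QUORUM and ALL consistency levels are modelled.

```python
class Replica:
    """Toy replica: commit log -> MemTable -> SSTables -> compaction."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only log, written first for durability
        self.memtable = {}     # in-memory: key -> (value, timestamp)
        self.sstables = []     # immutable flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        self.commit_log.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[1]:
            self.memtable[key] = (value, timestamp)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        return True  # acknowledgement sent back to the coordinator

    def flush(self):
        """Write the MemTable out as a sorted, immutable SSTable."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # simplification: flushed data no longer needs the log

    def compact(self):
        """Merge all SSTables, keeping the newest cell for every key."""
        merged = {}
        for table in self.sstables:
            for key, cell in table.items():
                if key not in merged or cell[1] > merged[key][1]:
                    merged[key] = cell
        self.sstables = [dict(sorted(merged.items()))]


def required_acks(consistency_level, replication_factor):
    """How many replica confirmations the coordinator waits for."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def coordinate_write(replicas, key, value, timestamp, consistency_level="QUORUM"):
    """Send the mutation to every replica and check that enough of them confirmed."""
    needed = required_acks(consistency_level, len(replicas))
    acks = sum(1 for replica in replicas if replica.write(key, value, timestamp))
    return acks >= needed
```

Note that with a replication factor of 2, as in the cluster built later in this post, QUORUM needs both replicas to confirm, so it tolerates no replica being down.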
When a client needs to read data back, it again connects to any node. The StorageProxy gets a list of nodes containing the requested key based on the replication strategy, and the coordinator sorts the candidate nodes by proximity using the (configurable) Snitch. Once a node is selected, the read request is forwarded to it for execution. That node first attempts to read the data from its MemTable. If the data is not in memory, Cassandra looks in the SSTables on disk, using a bloom filter to skip SSTables that cannot contain the key. At the same time, other nodes responsible for the same data respond with just a digest, without the actual data. If the digests do not match on some of the nodes, a data repair process is started, and those nodes eventually receive the latest data and become consistent. For further information on the architecture of Cassandra refer to [5].
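The digest comparison and repair step can be illustrated with a small Python sketch. It assumes each replica is a plain dictionary mapping a key to a `(value, timestamp)` pair and that the list is already sorted by proximity; the function names are hypothetical and MD5 is simply a convenient digest for the example.

```python
import hashlib


def digest(cell):
    """Digest of a (value, timestamp) cell; None if the replica has no data."""
    if cell is None:
        return None
    value, timestamp = cell
    return hashlib.md5(f"{value}:{timestamp}".encode("utf-8")).hexdigest()


def coordinate_read(replicas, key):
    """Read from the closest replica and compare digests from the others.

    replicas: list of dicts (key -> (value, timestamp)), closest first.
    On a mismatch, the newest cell wins and is written back to every replica.
    """
    closest, others = replicas[0], replicas[1:]
    data = closest.get(key)
    expected = digest(data)
    if all(digest(replica.get(key)) == expected for replica in others):
        return data

    # Digest mismatch: fetch the full cells and repair the stale replicas.
    candidates = [replica[key] for replica in replicas if key in replica]
    newest = max(candidates, key=lambda cell: cell[1])
    for replica in replicas:
        replica[key] = newest
    return newest


node_a = {"user:1": ("old name", 100)}
node_b = {"user:1": ("new name", 200)}
print(coordinate_read([node_a, node_b], "user:1"))
print(node_a["user:1"] == node_b["user:1"])
```

Only the digests travel over the network in the common case, which keeps reads cheap; the full data is exchanged only when the replicas disagree.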
The data in Cassandra is stored in a nested hashmap (a hashmap containing another hashmap, where a hashmap is basically a data structure of key-value pairs), and it can be visualized as follows:
The keyspace is similar to a database and it stores the column families, along with other properties like the replication factor and replica placement strategies. The properties of the keyspace apply to all tables contained within the keyspace.
The column family is similar to a table and contains a collection of rows, where each row contains cells.
A cell is the smallest data unit: a triplet that holds data in the form of "key:value:time". The timestamp is used to resolve discrepancies between replicas when a data repair is triggered by inconsistent digests.
The row key uniquely identifies a row. Since each node in a ring contains only a subset of rows (the rows are distributed among the nodes) the row keys are sharded as well.
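As a rough illustration of that nested structure, the Python sketch below models keyspace -> column family -> row key -> cell using plain dictionaries, with the newest timestamp winning when the same cell is written twice. The `insert` helper and the sample names are my own and are not part of any Cassandra API.

```python
import time


def insert(store, keyspace, column_family, row_key, column, value, timestamp=None):
    """Store a cell as column -> (value, timestamp); the newest timestamp wins."""
    ts = time.time_ns() // 1000 if timestamp is None else timestamp  # microseconds
    row = (
        store.setdefault(keyspace, {})
        .setdefault(column_family, {})
        .setdefault(row_key, {})
    )
    current = row.get(column)
    if current is None or ts >= current[1]:
        row[column] = (value, ts)
    return row[column]


store = {}
insert(store, "test_keyspace", "test_table", 1, "name", "Konstantin", timestamp=10)
insert(store, "test_keyspace", "test_table", 1, "enabled", True, timestamp=10)
# An older write arriving late does not overwrite the newer cell.
insert(store, "test_keyspace", "test_table", 1, "name", "stale value", timestamp=5)
print(store)
```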
With all this in mind, let's deploy a two-node, single-DC cluster. First let's install the prerequisites:
File: gistfile1.txt
-------------------
[cassandra-nodes]$ lsb_release -d
Description: Ubuntu 14.04.4 LTS
[cassandra-nodes]$ gpg --keyserver pgp.mit.edu --recv-keys F758CE318D77295D
[cassandra-nodes]$ gpg --export --armor F758CE318D77295D | sudo apt-key add -
[cassandra-nodes]$ gpg --keyserver pgp.mit.edu --recv-keys 2B5C1B00
[cassandra-nodes]$ gpg --export --armor 2B5C1B00 | sudo apt-key add -
[cassandra-nodes]$ gpg --keyserver pgp.mit.edu --recv-keys 0353B12C
[cassandra-nodes]$ gpg --export --armor 0353B12C | sudo apt-key add -
[cassandra-nodes]$ add-apt-repository ppa:webupd8team/java
[cassandra-nodes]$ apt-get update
[cassandra-nodes]$ apt-get install oracle-java8-installer
[cassandra-nodes]$ apt-get install libjna-java
[cassandra-nodes]$ update-alternatives --list java
/usr/lib/jvm/java-8-oracle/jre/bin/java
[cassandra-nodes]$ java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
Then let's install Cassandra:
File: gistfile1.txt
-------------------
[cassandra-nodes]$ echo "deb http://www.apache.org/dist/cassandra/debian 35x main" > /etc/apt/sources.list.d/cassandra.list
[cassandra-nodes]$ apt-get update
[cassandra-nodes]$ apt-get install cassandra
The main configuration file is very well documented and most of the defaults are quite sensible. The few changes for the purpose of this blog are as follows:
File: gistfile1.txt
-------------------
[cassandra-nodes]$ cat /etc/cassandra/cassandra.yaml
cluster_name: 'TestCluster'
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
role_manager: CassandraRoleManager
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.176.64.41"
disk_optimization_strategy: spinning
listen_interface: eth1
rpc_interface: eth1
endpoint_snitch: GossipingPropertyFileSnitch
[cassandra-nodes]$ cat /etc/cassandra/cassandra-rackdc.properties
dc=iad3
rack=rack1
[cassandra-nodes]$ /etc/init.d/cassandra start
[cassandra-node-1]$ cat /var/log/cassandra/system.log
INFO [main] 2016-05-29 18:53:46,794 CassandraDaemon.java:428 - Par Survivor Space Heap memory: init = 20971520(20480K) used = 0(0K) committed = 20971520(20480K) max = 20971520(20480K)
INFO [main] 2016-05-29 18:53:46,795 CassandraDaemon.java:428 - CMS Old Gen Heap memory: init = 836763648(817152K) used = 0(0K) committed = 836763648(817152K) max = 836763648(817152K)
INFO [main] 2016-05-29 18:53:46,795 CassandraDaemon.java:430 - Classpath: /etc/cassandra:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/cassandra/hs_err_1464548024.log]
INFO [main] 2016-05-29 18:53:46,853 CLibrary.java:126 - JNA mlockall successful
WARN [main] 2016-05-29 18:53:46,854 StartupChecks.java:118 - jemalloc shared library could not be preloaded to speed up memory allocations
WARN [main] 2016-05-29 18:53:46,854 StartupChecks.java:150 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO [main] 2016-05-29 18:53:46,856 SigarLibrary.java:44 - Initializing SIGAR library
WARN [main] 2016-05-29 18:53:46,869 SigarLibrary.java:174 - Cassandra server running in degraded mode. Is swap disabled? : true, Address space adequate? : true, nofile limit adequate? : true, nproc limit adequate? : false
INFO [main] 2016-05-29 18:53:47,696 ColumnFamilyStore.java:395 - Initializing system.IndexInfo
INFO [main] 2016-05-29 18:53:49,147 ColumnFamilyStore.java:395 - Initializing system.batches
INFO [main] 2016-05-29 18:53:49,157 ColumnFamilyStore.java:395 - Initializing system.paxos
INFO [main] 2016-05-29 18:53:49,172 ColumnFamilyStore.java:395 - Initializing system.local
INFO [SSTableBatchOpen:2] 2016-05-29 18:53:49,194 BufferPool.java:226 - Global buffer pool is enabled, when pool is exahusted (max is 512 mb) it will allocate on heap
INFO [main] 2016-05-29 18:53:49,234 CacheService.java:113 - Initializing key cache with capacity of 48 MBs.
INFO [main] 2016-05-29 18:53:49,244 CacheService.java:135 - Initializing row cache with capacity of 0 MBs
INFO [main] 2016-05-29 18:53:49,246 CacheService.java:164 - Initializing counter cache with capacity of 24 MBs
INFO [main] 2016-05-29 18:53:49,247 CacheService.java:175 - Scheduling counter cache save to every 7200 seconds (going to save all keys).
INFO [main] 2016-05-29 18:53:49,272 ColumnFamilyStore.java:395 - Initializing system.peers
INFO [main] 2016-05-29 18:53:49,297 ColumnFamilyStore.java:395 - Initializing system.peer_events
INFO [main] 2016-05-29 18:53:49,308 ColumnFamilyStore.java:395 - Initializing system.range_xfers
INFO [main] 2016-05-29 18:53:49,316 ColumnFamilyStore.java:395 - Initializing system.compaction_history
INFO [main] 2016-05-29 18:53:49,331 ColumnFamilyStore.java:395 - Initializing system.sstable_activity
INFO [main] 2016-05-29 18:53:49,344 ColumnFamilyStore.java:395 - Initializing system.size_estimates
INFO [main] 2016-05-29 18:53:49,355 ColumnFamilyStore.java:395 - Initializing system.available_ranges
INFO [main] 2016-05-29 18:53:49,365 ColumnFamilyStore.java:395 - Initializing system.views_builds_in_progress
INFO [main] 2016-05-29 18:53:49,371 ColumnFamilyStore.java:395 - Initializing system.built_views
INFO [main] 2016-05-29 18:53:49,377 ColumnFamilyStore.java:395 - Initializing system.hints
INFO [main] 2016-05-29 18:53:49,384 ColumnFamilyStore.java:395 - Initializing system.batchlog
INFO [main] 2016-05-29 18:53:49,390 ColumnFamilyStore.java:395 - Initializing system.schema_keyspaces
INFO [main] 2016-05-29 18:53:49,395 ColumnFamilyStore.java:395 - Initializing system.schema_columnfamilies
INFO [main] 2016-05-29 18:53:49,401 ColumnFamilyStore.java:395 - Initializing system.schema_columns
INFO [main] 2016-05-29 18:53:49,406 ColumnFamilyStore.java:395 - Initializing system.schema_triggers
INFO [main] 2016-05-29 18:53:49,411 ColumnFamilyStore.java:395 - Initializing system.schema_usertypes
INFO [main] 2016-05-29 18:53:49,417 ColumnFamilyStore.java:395 - Initializing system.schema_functions
INFO [main] 2016-05-29 18:53:49,423 ColumnFamilyStore.java:395 - Initializing system.schema_aggregates
INFO [main] 2016-05-29 18:53:50,136 StorageService.java:600 - Populating token metadata from system tables
INFO [main] 2016-05-29 18:53:50,298 StorageService.java:607 - Token metadata: Normal Tokens:
/10.176.64.41:[-9093342176872828671, ...]
INFO [main] 2016-05-29 18:53:50,317 ColumnFamilyStore.java:395 - Initializing system_schema.keyspaces
INFO [main] 2016-05-29 18:53:50,367 ColumnFamilyStore.java:395 - Initializing system_schema.tables
INFO [main] 2016-05-29 18:53:50,416 ColumnFamilyStore.java:395 - Initializing system_schema.columns
INFO [main] 2016-05-29 18:53:50,449 ColumnFamilyStore.java:395 - Initializing system_schema.triggers
INFO [main] 2016-05-29 18:53:50,497 ColumnFamilyStore.java:395 - Initializing system_schema.dropped_columns
INFO [main] 2016-05-29 18:53:50,515 ColumnFamilyStore.java:395 - Initializing system_schema.views
INFO [main] 2016-05-29 18:53:50,534 ColumnFamilyStore.java:395 - Initializing system_schema.types
INFO [main] 2016-05-29 18:53:50,552 ColumnFamilyStore.java:395 - Initializing system_schema.functions
INFO [main] 2016-05-29 18:53:50,565 ColumnFamilyStore.java:395 - Initializing system_schema.aggregates
INFO [main] 2016-05-29 18:53:50,578 ColumnFamilyStore.java:395 - Initializing system_schema.indexes
INFO [main] 2016-05-29 18:53:50,788 ColumnFamilyStore.java:395 - Initializing system_distributed.parent_repair_history
INFO [main] 2016-05-29 18:53:50,793 ColumnFamilyStore.java:395 - Initializing system_distributed.repair_history
INFO [main] 2016-05-29 18:53:50,803 ColumnFamilyStore.java:395 - Initializing system_auth.resource_role_permissons_index
INFO [main] 2016-05-29 18:53:50,830 ColumnFamilyStore.java:395 - Initializing system_auth.role_members
INFO [main] 2016-05-29 18:53:50,838 ColumnFamilyStore.java:395 - Initializing system_auth.role_permissions
INFO [main] 2016-05-29 18:53:50,852 ColumnFamilyStore.java:395 - Initializing system_auth.roles
INFO [main] 2016-05-29 18:53:50,866 ColumnFamilyStore.java:395 - Initializing system_traces.events
INFO [main] 2016-05-29 18:53:50,870 ColumnFamilyStore.java:395 - Initializing system_traces.sessions
INFO [pool-2-thread-1] 2016-05-29 18:53:50,875 AutoSavingCache.java:189 - reading saved cache /var/lib/cassandra/saved_caches/KeyCache-d.db
INFO [pool-2-thread-1] 2016-05-29 18:53:50,890 AutoSavingCache.java:165 - Completed loading (17 ms; 25 keys) KeyCache cache
INFO [main] 2016-05-29 18:53:50,906 CommitLog.java:171 - Replaying /var/lib/cassandra/commitlog/CommitLog-6-1464374244502.log, /var/lib/cassandra/commitlog/CommitLog-6-1464374244503.log
INFO [ScheduledTasks:1] 2016-05-29 18:53:51,729 TokenMetadata.java:448 - Updating topology for all endpoints that have changed
INFO [main] 2016-05-29 18:53:52,088 CommitLog.java:173 - Log replay complete, 43 replayed mutations
INFO [main] 2016-05-29 18:53:52,093 StorageService.java:600 - Populating token metadata from system tables
INFO [main] 2016-05-29 18:53:52,114 StorageService.java:607 - Token metadata: Normal Tokens:
/10.176.64.41:[-9093342176872828671, ...]
INFO [main] 2016-05-29 18:53:52,197 StorageService.java:618 - Cassandra version: 3.5
INFO [main] 2016-05-29 18:53:52,198 StorageService.java:619 - Thrift API version: 20.1.0
INFO [main] 2016-05-29 18:53:52,199 StorageService.java:620 - CQL supported versions: 3.4.0 (default: 3.4.0)
INFO [main] 2016-05-29 18:53:52,274 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 48 MB and a resize interval of 60 minutes
INFO [main] 2016-05-29 18:53:52,276 StorageService.java:639 - Loading persisted ring state
INFO [main] 2016-05-29 18:53:52,291 StorageService.java:828 - Starting up server gossip
INFO [main] 2016-05-29 18:53:52,366 TokenMetadata.java:429 - Updating topology for /10.176.65.71
INFO [main] 2016-05-29 18:53:52,368 TokenMetadata.java:429 - Updating topology for /10.176.65.71
INFO [main] 2016-05-29 18:53:52,404 MessagingService.java:557 - Starting Messaging Service on /10.176.65.71:7000 (eth1)
INFO [MessagingService-Incoming-/10.176.64.41] 2016-05-29 18:53:52,459 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds
INFO [main] 2016-05-29 18:53:52,480 StorageService.java:1003 - Using saved tokens [-1101707182484276762, ...]
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 18:53:52,518 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41
INFO [GossipStage:1] 2016-05-29 18:53:52,577 Gossiper.java:1028 - Node /10.176.64.41 has restarted, now UP
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 18:53:52,580 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41
INFO [GossipStage:1] 2016-05-29 18:53:52,589 StorageService.java:2081 - Node /10.176.64.41 state jump to NORMAL
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,593 Gossiper.java:994 - InetAddress /10.176.64.41 is now UP
INFO [main] 2016-05-29 18:53:52,640 StorageService.java:2081 - Node /10.176.65.71 state jump to NORMAL
INFO [GossipStage:1] 2016-05-29 18:53:52,655 TokenMetadata.java:429 - Updating topology for /10.176.64.41
INFO [GossipStage:1] 2016-05-29 18:53:52,657 TokenMetadata.java:429 - Updating topology for /10.176.64.41
INFO [main] 2016-05-29 18:53:52,677 AuthCache.java:172 - (Re)initializing CredentialsCache (validity period/update interval/max entries) (2000/2000/1000)
INFO [main] 2016-05-29 18:53:52,682 CassandraDaemon.java:639 - Waiting for gossip to settle before accepting client requests...
WARN [GossipTasks:1] 2016-05-29 18:53:53,363 FailureDetector.java:287 - Not marking nodes down due to local pause of 5620946758 > 5000000000
INFO [main] 2016-05-29 18:54:00,685 CassandraDaemon.java:670 - No gossip backlog; proceeding
INFO [main] 2016-05-29 18:54:00,764 NativeTransportService.java:70 - Netty using native Epoll event loop
INFO [main] 2016-05-29 18:54:00,813 Server.java:161 - Using Netty Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, netty-codec=netty-codec-4.0.23.Final.208198c, netty-codec-http=netty-codec-http-4.0.23.Final.208198c, netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, netty-common=netty-common-4.0.23.Final.208198c, netty-handler=netty-handler-4.0.23.Final.208198c, netty-transport=netty-transport-4.0.23.Final.208198c, netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c]
INFO [main] 2016-05-29 18:54:00,813 Server.java:162 - Starting listening for CQL clients on /10.176.65.71:9042 (unencrypted)...
INFO [main] 2016-05-29 18:54:00,850 CassandraDaemon.java:471 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it
[cassandra-node-2]$ cat /var/log/cassandra/system.log
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 18:53:52,420 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71
INFO [GossipStage:1] 2016-05-29 18:53:52,562 Gossiper.java:1028 - Node /10.176.65.71 has restarted, now UP
INFO [GossipStage:1] 2016-05-29 18:53:52,576 TokenMetadata.java:429 - Updating topology for /10.176.65.71
INFO [GossipStage:1] 2016-05-29 18:53:52,577 TokenMetadata.java:429 - Updating topology for /10.176.65.71
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 18:53:52,589 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,626 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,679 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP
INFO [SharedPool-Worker-2] 2016-05-29 18:53:52,679 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,679 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP
INFO [GossipStage:1] 2016-05-29 18:53:52,864 StorageService.java:2081 - Node /10.176.65.71 state jump to NORMAL
The important options here are the cluster name, the authentication and authorization settings, the IP of the seed node, and the Snitch. In this example I am using the GossipingPropertyFileSnitch, in which you specify the DC and rack each node is in. As long as the nodes are configured with the same cluster name, they'll discover each other and form the ring using the gossip protocol.
Managing a Cassandra cluster can be accomplished with the nodetool utility. Here are a few examples that remove a live node and then add it back to the cluster:
File: gistfile1.txt
-------------------
[cassandra-node-2]$ nodetool -h localhost status
Datacenter: iad3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.176.65.71 283.62 KB 256 100.0% 50a36028-391b-4aad-a4b4-3cf2625f2211 rack1
UN 10.176.64.41 243.39 KB 256 100.0% 4b8377bf-4f85-415f-acae-575ee2cd69dd rack1
[cassandra-node-2]$ nodetool info
ID : 50a36028-391b-4aad-a4b4-3cf2625f2211
Gossip active : true
Thrift active : false
Native Transport active: true
Load : 253.09 KB
Generation No : 1464548032
Uptime (seconds) : 1445
Heap Memory (MB) : 156.59 / 978.00
Off Heap Memory (MB) : 0.00
Data Center : iad3
Rack : rack1
Exceptions : 0
Key Cache : entries 20, size 1.66 KB, capacity 48 MB, 81 hits, 115 requests, 0.704 recent hit rate, 14400 save period in seconds
Row Cache : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache : entries 0, size 0 bytes, capacity 24 MB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Token : (invoke with -T/--tokens to see all 256 tokens)
[cassandra-node-2]$ nodetool describecluster
Cluster Information:
Name: TestCluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
34762ade-095c-3c37-8a04-4b6546170c78: [10.176.65.71, 10.176.64.41]
[cassandra-node-2]$ nodetool -h localhost decommission
[cassandra-node-2]$ nodetool -h localhost status
Datacenter: iad3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.176.64.41 263.69 KB 256 100.0% 4b8377bf-4f85-415f-acae-575ee2cd69dd rack1
[cassandra-node-2]$ /etc/init.d/cassandra stop
[cassandra-node-2]$ rm -rf /var/lib/cassandra/data/*
[cassandra-node-2]$ vim /etc/cassandra/jvm.options
-Dcassandra.replace_address=10.176.65.71
[cassandra-node-2]$ /etc/init.d/cassandra start
[cassandra-node-1]$ tail -50 /var/log/cassandra/system.log
INFO [GossipStage:1] 2016-05-29 19:30:57,526 Gossiper.java:1009 - InetAddress /10.176.65.71 is now DOWN
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 19:30:57,531 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 19:30:58,126 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71
INFO [STREAM-INIT-/10.176.65.71:39575] 2016-05-29 19:31:29,617 StreamResultFuture.java:114 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb ID#0] Creating new streaming plan for Bootstrap
INFO [STREAM-INIT-/10.176.65.71:39575] 2016-05-29 19:31:29,618 StreamResultFuture.java:121 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb, ID#0] Received streaming plan for Bootstrap
INFO [STREAM-INIT-/10.176.65.71:57654] 2016-05-29 19:31:29,618 StreamResultFuture.java:121 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb, ID#0] Received streaming plan for Bootstrap
INFO [STREAM-IN-/10.176.65.71] 2016-05-29 19:31:29,702 StreamResultFuture.java:185 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Session with /10.176.65.71 is complete
INFO [STREAM-IN-/10.176.65.71] 2016-05-29 19:31:29,702 StreamResultFuture.java:217 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] All sessions completed
INFO [SharedPool-Worker-1] 2016-05-29 19:31:30,162 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP
[cassandra-node-2]$ tail -50 /var/log/cassandra/system.log
INFO [pool-2-thread-1] 2016-05-29 19:30:54,741 AutoSavingCache.java:165 - Completed loading (41 ms; 4 keys) KeyCache cache
INFO [main] 2016-05-29 19:30:54,754 CommitLog.java:171 - Replaying /var/lib/cassandra/commitlog/CommitLog-6-1464548029096.log, /var/lib/cassandra/commitlog/CommitLog-6-1464548029097.log
INFO [main] 2016-05-29 19:30:55,744 CommitLog.java:173 - Log replay complete, 168 replayed mutations
INFO [main] 2016-05-29 19:30:55,745 StorageService.java:600 - Populating token metadata from system tables
INFO [main] 2016-05-29 19:30:55,776 StorageService.java:607 - Token metadata: Normal Tokens:
/10.176.64.41:[-9093342176872828671, ... ]
INFO [main] 2016-05-29 19:30:55,877 StorageService.java:618 - Cassandra version: 3.5
INFO [main] 2016-05-29 19:30:55,894 StorageService.java:619 - Thrift API version: 20.1.0
INFO [main] 2016-05-29 19:30:55,894 StorageService.java:620 - CQL supported versions: 3.4.0 (default: 3.4.0)
INFO [main] 2016-05-29 19:30:55,967 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 48 MB and a resize interval of 60 minutes
INFO [main] 2016-05-29 19:30:55,968 StorageService.java:639 - Loading persisted ring state
INFO [main] 2016-05-29 19:30:55,990 StorageService.java:522 - Gathering node replacement information for /10.176.65.71
INFO [main] 2016-05-29 19:30:55,995 MessagingService.java:557 - Starting Messaging Service on /10.176.65.71:7000 (eth1)
INFO [MessagingService-Incoming-/10.176.64.41] 2016-05-29 19:30:56,041 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 19:30:56,053 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41
INFO [GossipStage:1] 2016-05-29 19:30:56,097 Gossiper.java:1028 - Node /10.176.64.41 has restarted, now UP
INFO [GossipStage:1] 2016-05-29 19:30:56,103 Gossiper.java:1009 - InetAddress /10.176.65.71 is now DOWN
INFO [SharedPool-Worker-1] 2016-05-29 19:30:56,107 Gossiper.java:994 - InetAddress /10.176.64.41 is now UP
INFO [ScheduledTasks:1] 2016-05-29 19:30:56,280 TokenMetadata.java:448 - Updating topology for all endpoints that have changed
INFO [main] 2016-05-29 19:30:57,052 StorageService.java:828 - Starting up server gossip
INFO [main] 2016-05-29 19:30:57,195 StorageService.java:1323 - JOINING: waiting for ring information
INFO [GossipStage:1] 2016-05-29 19:30:58,119 Gossiper.java:1030 - Node /10.176.64.41 is now part of the cluster
INFO [GossipStage:1] 2016-05-29 19:30:58,121 StorageService.java:2081 - Node /10.176.64.41 state jump to NORMAL
INFO [SharedPool-Worker-1] 2016-05-29 19:30:58,135 Gossiper.java:994 - InetAddress /10.176.64.41 is now UP
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 19:30:58,142 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41
INFO [GossipStage:1] 2016-05-29 19:30:58,146 TokenMetadata.java:429 - Updating topology for /10.176.64.41
INFO [GossipStage:1] 2016-05-29 19:30:58,147 TokenMetadata.java:429 - Updating topology for /10.176.64.41
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,510 ColumnFamilyStore.java:395 - Initializing system_traces.events
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,517 ColumnFamilyStore.java:395 - Initializing system_traces.sessions
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,534 ColumnFamilyStore.java:395 - Initializing system_distributed.parent_repair_history
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,540 ColumnFamilyStore.java:395 - Initializing system_distributed.repair_history
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,557 ColumnFamilyStore.java:395 - Initializing system_auth.resource_role_permissons_index
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,563 ColumnFamilyStore.java:395 - Initializing system_auth.role_members
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,570 ColumnFamilyStore.java:395 - Initializing system_auth.role_permissions
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,584 ColumnFamilyStore.java:395 - Initializing system_auth.roles
WARN [GossipTasks:1] 2016-05-29 19:30:59,118 FailureDetector.java:287 - Not marking nodes down due to local pause of 6834867440 > 5000000000
INFO [main] 2016-05-29 19:30:59,196 StorageService.java:1323 - JOINING: waiting for schema information to complete
INFO [main] 2016-05-29 19:30:59,197 StorageService.java:1323 - JOINING: schema complete, ready to bootstrap
INFO [main] 2016-05-29 19:30:59,197 StorageService.java:1323 - JOINING: waiting for pending range calculation
INFO [main] 2016-05-29 19:30:59,197 StorageService.java:1323 - JOINING: calculation complete, ready to bootstrap
INFO [main] 2016-05-29 19:31:29,198 StorageService.java:1323 - JOINING: Replacing a node with token(s): [-1101707182484276762, ... ]
INFO [main] 2016-05-29 19:31:29,236 StorageService.java:1323 - JOINING: Starting to bootstrap...
INFO [main] 2016-05-29 19:31:29,596 StreamResultFuture.java:88 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Executing streaming plan for Bootstrap
INFO [StreamConnectionEstablisher:1] 2016-05-29 19:31:29,607 StreamSession.java:237 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Starting streaming to /10.176.64.41
INFO [StreamConnectionEstablisher:1] 2016-05-29 19:31:29,619 StreamCoordinator.java:264 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb, ID#0] Beginning stream session with /10.176.64.41
INFO [STREAM-IN-/10.176.64.41] 2016-05-29 19:31:29,697 StreamResultFuture.java:185 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Session with /10.176.64.41 is complete
INFO [STREAM-IN-/10.176.64.41] 2016-05-29 19:31:29,726 StreamResultFuture.java:217 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] All sessions completed
INFO [STREAM-IN-/10.176.64.41] 2016-05-29 19:31:29,756 StorageService.java:1376 - Bootstrap completed! for the tokens [-1101707182484276762, ... ]
INFO [main] 2016-05-29 19:31:29,775 StorageService.java:2081 - Node /10.176.65.71 state jump to NORMAL
WARN [main] 2016-05-29 19:31:29,776 StorageService.java:2093 - Not updating token metadata for /10.176.65.71 because I am replacing it
INFO [main] 2016-05-29 19:31:29,788 AuthCache.java:172 - (Re)initializing CredentialsCache (validity period/update interval/max entries) (2000/2000/1000)
INFO [main] 2016-05-29 19:31:29,792 CassandraDaemon.java:639 - Waiting for gossip to settle before accepting client requests...
INFO [main] 2016-05-29 19:31:37,794 CassandraDaemon.java:670 - No gossip backlog; proceeding
INFO [main] 2016-05-29 19:31:37,855 NativeTransportService.java:70 - Netty using native Epoll event loop
INFO [main] 2016-05-29 19:31:37,903 Server.java:161 - Using Netty Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, netty-codec=netty-codec-4.0.23.Final.208198c, netty-codec-http=netty-codec-http-4.0.23.Final.208198c, netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, netty-common=netty-common-4.0.23.Final.208198c, netty-handler=netty-handler-4.0.23.Final.208198c, netty-transport=netty-transport-4.0.23.Final.208198c, netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c]
INFO [main] 2016-05-29 19:31:37,903 Server.java:162 - Starting listening for CQL clients on /10.176.65.71:9042 (unencrypted)...
INFO [main] 2016-05-29 19:31:37,943 CassandraDaemon.java:471 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it
[cassandra-node-2]$ nodetool -h localhost status
Datacenter: iad3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.176.65.71 176.59 KB 256 100.0% 50a36028-391b-4aad-a4b4-3cf2625f2211 rack1
UN 10.176.64.41 263.69 KB 256 100.0% 4b8377bf-4f85-415f-acae-575ee2cd69dd rack1
To take a snapshot of the data for backup, run the following:
File: gistfile1.txt
-------------------
[cassandra-node-2]$ nodetool -h localhost snapshot
Requested creating snapshot(s) for [all keyspaces] with snapshot name [1464548620814] and options {skipFlush=false}
Snapshot directory: 1464548620814
[cassandra-node-2]$ ls -la /var/lib/cassandra/data/system_schema/indexes-0feb57ac311f382fba6d9024d305702f/snapshots/
total 12
drwxr-xr-x 3 cassandra cassandra 4096 May 29 19:03 .
drwxr-xr-x 4 cassandra cassandra 4096 May 29 19:03 ..
drwxr-xr-x 2 cassandra cassandra 4096 May 29 19:03 1464548620814
[cassandra-node-2]$ nodetool -h localhost clearsnapshot
Requested clearing snapshot(s) for [all keyspaces]
[cassandra-node-2]$ ls -la /var/lib/cassandra/data/system_schema/types-5a8b1ca866023f77a0459273d308917a/snapshots/
ls: cannot access /var/lib/cassandra/data/system_schema/types-5a8b1ca866023f77a0459273d308917a/snapshots/: No such file or directory
Once the snapshot is complete, you can copy it to a secure location. To enable incremental backups, edit cassandra.yaml and set "incremental_backups: true". To restore the data, stop Cassandra, clear the commitlog directory, delete only the *.db files from the data directories, and copy the snapshot files over.
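Copying a snapshot off the node could be scripted along these lines. This is only a sketch: it assumes the `<data dir>/<keyspace>/<table>/snapshots/<snapshot name>` layout seen in the listing above, and the helper names `find_snapshot_dirs` and `copy_snapshot` as well as the backup destination are made up for the example.

```python
import shutil
from pathlib import Path


def find_snapshot_dirs(data_root, snapshot_name):
    """Return every <keyspace>/<table>/snapshots/<snapshot_name> directory."""
    root = Path(data_root)
    return sorted(
        path
        for path in root.glob(f"*/*/snapshots/{snapshot_name}")
        if path.is_dir()
    )


def copy_snapshot(data_root, snapshot_name, backup_root):
    """Copy a snapshot into backup_root, preserving the keyspace/table layout."""
    copied = []
    for source in find_snapshot_dirs(data_root, snapshot_name):
        destination = Path(backup_root) / source.relative_to(data_root)
        shutil.copytree(source, destination)  # creates missing parent directories
        copied.append(destination)
    return copied
```

For the snapshot taken above, a call such as `copy_snapshot("/var/lib/cassandra/data", "1464548620814", "/mnt/backup")` would be the idea, where `/mnt/backup` is a placeholder for wherever you keep backups. Run it before `nodetool clearsnapshot`, since that command removes the snapshot directories.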
And finally, let's manipulate some data by adding a new user, changing the default cassandra password, creating a keyspace and a table, and inserting records:
File: gistfile1.txt
-------------------
[cassandra-node-1]$ cqlsh 10.176.65.71 -ucassandra -pcassandra
Connected to TestCluster at 10.176.65.71:9042.
[cqlsh 5.0.1 | Cassandra 3.5 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh>
cassandra@cqlsh> list users;
 name      | super
-----------+-------
 cassandra |  True
(1 rows)
cassandra@cqlsh> CREATE USER root WITH PASSWORD 'supersecretpassword' SUPERUSER;
cassandra@cqlsh> list users;
 name      | super
-----------+-------
 cassandra |  True
      root |  True
(2 rows)
cassandra@cqlsh> ALTER USER cassandra WITH PASSWORD 'supersecretpassword';
cassandra@cqlsh> describe KEYSPACES ;
system_auth system system_distributed system_traces system_schema
cassandra@cqlsh> CREATE KEYSPACE IF NOT EXISTS test_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'iad3':2};
cassandra@cqlsh> use test_keyspace ;
cassandra@cqlsh:test_keyspace> CREATE TABLE test_table (id int PRIMARY KEY, name varchar, enabled boolean);
cassandra@cqlsh:test_keyspace> DESCRIBE table test_table;
CREATE TABLE test_keyspace.test_table (
    id int PRIMARY KEY,
    enabled boolean,
    name text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
cassandra@cqlsh:test_keyspace> INSERT INTO test_table (id, name, enabled) VALUES ( 1, 'Konstantin', true );
cassandra@cqlsh:test_keyspace> select * from test_table ;
 id | enabled | name
----+---------+------------
  1 |    True | Konstantin
(1 rows)
cassandra@cqlsh:test_keyspace> describe schema;
CREATE KEYSPACE test_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'iad3': '2'} AND durable_writes = true;
CREATE TABLE test_keyspace.test_table (
    id int PRIMARY KEY,
    enabled boolean,
    name text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
cassandra@cqlsh:test_keyspace> quit
[cassandra-node-1]$ ls -lah /var/lib/cassandra/data/test_keyspace/test_table-59271a3025d711e6935cd55c35b994bb/
total 12K
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 19:55 .
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 19:55 ..
drwxr-xr-x 2 cassandra cassandra 4.0K May 29 19:55 backups
[cassandra-node-1]$ nodetool flush
[cassandra-node-1]$ ls -lah /var/lib/cassandra/data/test_keyspace/test_table-59271a3025d711e6935cd55c35b994bb/
total 48K
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 20:07 .
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 19:55 ..
drwxr-xr-x 2 cassandra cassandra 4.0K May 29 19:55 backups
-rw-r--r-- 1 cassandra cassandra 43 May 29 20:07 ma-1-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 44 May 29 20:07 ma-1-big-Data.db
-rw-r--r-- 1 cassandra cassandra 9 May 29 20:07 ma-1-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra 16 May 29 20:07 ma-1-big-Filter.db
-rw-r--r-- 1 cassandra cassandra 8 May 29 20:07 ma-1-big-Index.db
-rw-r--r-- 1 cassandra cassandra 4.6K May 29 20:07 ma-1-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra 56 May 29 20:07 ma-1-big-Summary.db
-rw-r--r-- 1 cassandra cassandra 92 May 29 20:07 ma-1-big-TOC.txt
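The files that appear after the flush are the components of a single SSTable. In this Cassandra release their names appear to follow a <version>-<generation>-<format>-<component> pattern, so ma-1-big-Data.db is the Data component of generation 1, written in the "big" format with on-disk version "ma". The sketch below splits such a name into its parts; the function name describe_sstable_file is hypothetical, and the naming pattern is an assumption based on the listing above.

```shell
# Split an SSTable component file name such as ma-1-big-Data.db into
# version, generation, format and component.
# NOTE: describe_sstable_file is an illustrative helper; it assumes the
# <version>-<generation>-<format>-<component> naming seen in the listing.
describe_sstable_file() {
    name=${1##*/}                       # drop any leading directory
    version=${name%%-*}                 # text before the first "-"
    rest=${name#*-}
    generation=${rest%%-*}              # text before the second "-"
    rest=${rest#*-}
    format=${rest%%-*}                  # text before the third "-"
    component=${rest#*-}                # everything after it
    printf 'version=%s generation=%s format=%s component=%s\n' \
        "$version" "$generation" "$format" "$component"
}
```

For example, `describe_sstable_file ma-1-big-Data.db` prints `version=ma generation=1 format=big component=Data.db`.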
Resources:
[1]. http://cassandra.apache.org/
[2]. https://en.wikipedia.org/wiki/ACID
[3]. https://en.wikipedia.org/wiki/Eventual_consistency
[4]. https://en.wikipedia.org/wiki/CAP_theorem
[5]. https://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureIntro_c.html