** 0.5.1
Test Cluster:
Dell 2950, 1 x Intel Xeon 5310 CPU (4 cores)
5 nodes
1 node: 2GB heap for Cassandra JVM
4 nodes: 4GB heap for Cassandra JVM
Commit log and data are stored on the same disks.
25 client threads run on 5 nodes.
Data Model:
Keyspace Name = “Test”
Column Family Name = “ABC”
CompareWith for Column = LongType
Column Name = Timestamp (LongType), Value = 400 bytes binary
Billions of keys, thousands of columns.
Partitioner = dht.RandomPartitioner
MemtableSizeInMB = 64
ReplicationFactor = 3
Use Thrift Client Interface
Client.insert(..)
Consistency Level (write) = ONE (1)
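The column layout in the data model above can be sketched as follows. This is only an illustration of the shape of one column, not the code used in the test: the helper name `make_column` and the choice of millisecond timestamps are assumptions, and the Thrift `insert` call itself is left out. LongType column names are 8-byte longs, so the name is packed big-endian here.

```python
import os
import struct
import time

VALUE_SIZE = 400  # bytes of binary payload per column, as in the data model


def make_column(timestamp_ms=None):
    """Build one (name, value) pair: an 8-byte long name and a 400-byte value.

    make_column is a hypothetical helper; the millisecond unit is an assumption.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    # Pack the timestamp as a signed 8-byte big-endian long for a LongType name.
    name = struct.pack(">q", timestamp_ms)
    # Random bytes stand in for the 400-byte binary value.
    value = os.urandom(VALUE_SIZE)
    return name, value


name, value = make_column(1270000000000)
print(len(name), len(value))
```

For non-negative timestamps, the big-endian encoding also sorts byte-wise in the same order as the numbers, which makes the packed names easy to inspect.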
Total inserted 1,076,333,461 columns.
Disk use: 302GB + 283GB + 335GB + 186GB + 276GB = 1,382GB (expected raw payload: roughly 400 bytes * 1G columns = 400GB, * 3 replicas = 1,200GB)
While inserting: about 1,000 SSTables on each node. Query latency is about 1 to 3 seconds.
After a long quiet period: about 10 SSTables (very large files; for example, one SSTable data file is 144GB).
Query latency is then in the millisecond range.
Result: 18,000 columns/second
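A back-of-the-envelope check of the 0.5.1 numbers above, as a rough sketch. Several things here are assumptions, not statements from the test: that the reported GB figures are binary gigabytes (2^30 bytes), that 18,000 columns/second is the sustained aggregate rate for the whole cluster, and that load is spread evenly across the 5 nodes (the per-node disk figures show it was not perfectly even).

```python
# Figures taken from the notes above.
node_disk_gb = [302, 283, 335, 186, 276]
columns = 1_076_333_461
value_bytes = 400
replication_factor = 3
nodes = 5
rate_columns_per_s = 18_000  # assumed to be the cluster-wide sustained rate

# Disk: measured total versus raw replicated payload (values only).
total_gb = sum(node_disk_gb)
payload_gib = columns * value_bytes / 2**30        # assumes GB means 2^30 bytes
replicated_gib = payload_gib * replication_factor
overhead_ratio = total_gb / replicated_gib         # keys, names, indexes, etc.

# Throughput: run length and write bandwidth implied by the insert rate.
run_hours = columns / rate_columns_per_s / 3600
client_mb_per_s = rate_columns_per_s * value_bytes / 1e6
replica_mb_per_s = client_mb_per_s * replication_factor
per_node_mb_per_s = replica_mb_per_s / nodes       # assumes even distribution

print(f"disk total: {total_gb} GB, replicated payload: {replicated_gib:.0f} GiB")
print(f"disk overhead ratio: {overhead_ratio:.2f}")
print(f"run length: {run_hours:.1f} h")
print(f"payload written: {client_mb_per_s:.1f} MB/s from clients, "
      f"{per_node_mb_per_s:.2f} MB/s per node after replication")
```

Under these assumptions the measured disk use is about 15% above the raw replicated payload, and the whole load takes a bit under 17 hours. If the GB figures are decimal instead, the overhead ratio comes out lower.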
** 0.6.0
Only 4 nodes.
JVM GC is a problem with a big heap.
Memory and GC are always the bottleneck, and a big issue, for Java-based infrastructure software!
https://issues.apache.org/jira/browse/CASSANDRA-896 (LinkedBlockingQueue issue, fixed in jdk-6u19)
0.5.1 seems to have performed better.
0.6.0 uses more memory.
Slides: Cassandra 0.6.0 insert throughput (Schubert Zhang)