CompositeType is a powerful technique for creating indices using regular column families instead of super column families. But there is a dearth of information on how to use CompositeType in Cassandra. Introduced in 0.8.1 in May 2011, it is a relative newcomer to Cassandra. It doesn't help that it is not even in the "official" datatype documentation for Cassandra 1.0 and 0.8! This article pieces together various tidbits to bring you a complete how-to guide on programming with CompositeType. The code examples use Hector.
Let's say we want to define a column family as the following:
row key: string
column key: composite of an integer and a string
column value: string
We can define the following schema on the cli:
create column family MyCF with comparator = 'CompositeType(IntegerType,UTF8Type)' and key_validation_class = 'UTF8Type' and default_validation_class = 'UTF8Type';
We can also define the same schema programmatically in Hector:
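import java.util.Arrays;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.cassandra.service.ThriftKsDef;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.ComparatorType;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;
import org.apache.cassandra.db.marshal.UTF8Type;

// Step 1: Create a cluster
CassandraHostConfigurator chc = new CassandraHostConfigurator("localhost");
Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", chc);

// Step 2: Create the schema
ColumnFamilyDefinition myCfd = HFactory.createColumnFamilyDefinition(
    "MyKS", "MyCF", ComparatorType.COMPOSITETYPE);
// The alias tells Cassandra which component types make up the composite.
// Thanks to Shane Perry for this tip.
// http://groups.google.com/group/hector-users/browse_thread/thread/ffd0895a17c7b43e
myCfd.setComparatorTypeAlias("(IntegerType, UTF8Type)");
myCfd.setKeyValidationClass(UTF8Type.class.getName());
myCfd.setDefaultValidationClass(UTF8Type.class.getName());
KeyspaceDefinition myKs = HFactory.createKeyspaceDefinition(
    "MyKS", ThriftKsDef.DEF_STRATEGY_CLASS, 1, Arrays.asList(myCfd));

// Step 3: Add the schema to the cluster and wait for it to propagate
cluster.addKeyspace(myKs, true);
// createKeyspace takes the keyspace name, not the definition
Keyspace ks = HFactory.createKeyspace("MyKS", cluster);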
Now let's insert a single row with 2 columns:
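import me.prettyprint.cassandra.serializers.CompositeSerializer;
import me.prettyprint.cassandra.serializers.IntegerSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.beans.Composite;
import me.prettyprint.hector.api.mutation.Mutator;

String rowKey = "row1";

// First column key
Composite colKey1 = new Composite();
colKey1.addComponent(1, IntegerSerializer.get());
colKey1.addComponent("c1", StringSerializer.get());

// Second column key
Composite colKey2 = new Composite();
colKey2.addComponent(2, IntegerSerializer.get());
colKey2.addComponent("c2", StringSerializer.get());

// Insert both columns into row1 at once.
// The row key is a String, so the mutator needs a StringSerializer.
Mutator<String> m = HFactory.createMutator(ks, StringSerializer.get());
m.addInsertion(rowKey, "MyCF",
    HFactory.createColumn(colKey1, "foo", new CompositeSerializer(), StringSerializer.get()));
m.addInsertion(rowKey, "MyCF",
    HFactory.createColumn(colKey2, "bar", new CompositeSerializer(), StringSerializer.get()));
m.execute();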
After the insertion, the column family should look like this table:
row1 | {1, c1} | {2, c2} |
     | foo     | bar     |
Now let's retrieve the first column using a slice query on only the first (integer) component of the composite column key. Since Cassandra orders composite keys component by component, we can construct a search range from {0, "a"} to {1, "\uFFFF"}, which will include {1, "c1"} but not {2, "c2"}.
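To see why that range works, here is a minimal plain-Java sketch of the ordering argument. It does not touch Cassandra or Hector: the `Key` class, the `ORDER` comparator, and the `slice` helper are hypothetical stand-ins, and Java's `String` ordering is assumed to be a good enough approximation of UTF8Type's byte ordering for these simple keys.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class CompositeOrderSketch {

    /** Hypothetical stand-in for a two-component composite column key. */
    static final class Key {
        final int number;
        final String text;

        Key(int number, String text) {
            this.number = number;
            this.text = text;
        }

        @Override
        public String toString() {
            return "{" + number + ", " + text + "}";
        }
    }

    /** Compare the integer component first; use the string only to break ties. */
    static final Comparator<Key> ORDER =
            Comparator.<Key>comparingInt(k -> k.number).thenComparing(k -> k.text);

    /** The two columns of row1, kept sorted by composite key. */
    static TreeMap<Key, String> row1() {
        TreeMap<Key, String> columns = new TreeMap<>(ORDER);
        columns.put(new Key(1, "c1"), "foo");
        columns.put(new Key(2, "c2"), "bar");
        return columns;
    }

    /** Return the columns whose keys lie between start and finish, inclusive. */
    static Map<Key, String> slice(TreeMap<Key, String> columns, Key start, Key finish) {
        return columns.subMap(start, true, finish, true);
    }

    public static void main(String[] args) {
        Map<Key, String> hit = slice(row1(), new Key(0, "a"), new Key(1, "\uFFFF"));
        // Only {1, c1} falls inside the range; {2, c2} sorts after the finish key.
        System.out.println(hit);
    }
}
```

Because the integer component is compared first, {2, "c2"} sorts after {1, "\uFFFF"} no matter what its string component is. With that ordering in mind, the actual Hector slice query looks like this: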
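import java.util.List;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.SliceQuery;

SliceQuery<String, Composite, String> sq = HFactory.createSliceQuery(
    ks, StringSerializer.get(), new CompositeSerializer(), StringSerializer.get());
sq.setColumnFamily("MyCF");
sq.setKey("row1");

// Create a composite search range
Composite start = new Composite();
start.addComponent(0, IntegerSerializer.get());
start.addComponent("a", StringSerializer.get());
Composite finish = new Composite();
finish.addComponent(1, IntegerSerializer.get());
finish.addComponent(Character.toString(Character.MAX_VALUE), StringSerializer.get());
sq.setRange(start, finish, false, 100);

// Now search.
QueryResult<ColumnSlice<Composite, String>> result = sq.execute();

// Parse the result to get the first column
List<HColumn<Composite, String>> columns = result.get().getColumns();
if (!columns.isEmpty()) {
    HColumn<Composite, String> first = columns.get(0);
    Integer number = first.getName().get(0, IntegerSerializer.get());
    String text = first.getName().get(1, StringSerializer.get());
    System.out.println("{" + number + ", " + text + "} = " + first.getValue());
}

It is unfortunate that a JavaDoc typo in the Cassandra source code prevents tools like Eclipse from displaying documentation about CompositeType. But you can always view the source online to get the precise definition and encoding scheme of CompositeType. Reading source code has been, and still is, the best way of learning new features in Cassandra.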
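As a rough illustration of that encoding scheme, the sketch below builds the bytes for the column key {1, "c1"} in plain Java. It assumes the layout described in the CompositeType source comments: each component is written as a 2-byte length, the value bytes, and a 1-byte end-of-component marker, which is 0 for stored column names. It also assumes the integer is serialized as 4 big-endian bytes and the string as UTF-8. The class and method names are hypothetical, and this is not how Hector or Cassandra actually implement it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CompositeEncodingSketch {

    /** Append one component: 2-byte length, value bytes, end-of-component byte. */
    static void writeComponent(ByteArrayOutputStream out, byte[] value, int endOfComponent) {
        out.write((value.length >> 8) & 0xFF);
        out.write(value.length & 0xFF);
        out.write(value, 0, value.length);
        out.write(endOfComponent);
    }

    /** Encode {number, text}, assuming a 4-byte big-endian integer and UTF-8 text. */
    static byte[] encode(int number, String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeComponent(out, ByteBuffer.allocate(4).putInt(number).array(), 0);
        writeComponent(out, text.getBytes(StandardCharsets.UTF_8), 0);
        return out.toByteArray();
    }

    /** Render bytes as lowercase hex for inspection. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 000400000001000002633100
        System.out.println(hex(encode(1, "c1")));
    }
}
```

Reading the output left to right: `0004` is the length of the integer component, `00000001` is the value 1, `00` ends the component, then `0002` is the length of "c1", `6331` is its UTF-8 bytes, and the final `00` ends the second component.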