Randomized Sort
Tuesday, October 16, 2012
How to Quickly Evaluate Cloud Fitness
Clue #1: Are the addresses of all cluster nodes defined in a configuration file?
This is the first sign of trouble. An elastic virtual machine doesn’t have an IP address until it boots up.
Clue #2: Is shared storage required between cluster nodes?
There is no shared storage in the cloud.
Clue #3: Does the cluster use multicast?
Multicast strikes fear into the hearts of network admins. Multicast packets have trouble traversing subnets. A layer 2 network emulated on top of layer 3 only makes the problem worse. No wonder multicast is disabled in AWS.
Clue #4: Does the cluster rely on UDP to manage cluster membership?
Your “subnet” in the cloud actually runs on an overlay network. Ever heard of routers dropping UDP packets in time of congestion?
Armed with these four clues, one can quickly filter out a lot of vendor noise in the web sphere today.
Tuesday, July 3, 2012
Jersey Unit Testing with Guice and EasyMock
and here. Jersey has a little-known but powerful in-memory test framework. Using
an in-memory test container along with Guice allows you to unit-test resource
lookup without the expense of full HTTP protocol handling. This post shows you
how. First, a sample software stack in a Maven POM:
<properties>
  <guice.version>3.0</guice.version>
  <jersey.version>1.12</jersey.version>
  <easymock.version>3.1</easymock.version>
</properties>
<dependencies>
  <dependency>
    <groupId>com.sun.jersey</groupId>
    <artifactId>jersey-server</artifactId>
    <version>${jersey.version}</version>
  </dependency>
  <dependency>
    <groupId>com.sun.jersey.contribs</groupId>
    <artifactId>jersey-guice</artifactId>
    <version>${jersey.version}</version>
  </dependency>
  <dependency>
    <groupId>com.google.inject.extensions</groupId>
    <artifactId>guice-assistedinject</artifactId>
    <version>${guice.version}</version>
  </dependency>
  <dependency>
    <groupId>com.sun.jersey.jersey-test-framework</groupId>
    <artifactId>jersey-test-framework-inmemory</artifactId>
    <version>${jersey.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.easymock</groupId>
    <artifactId>easymock</artifactId>
    <version>${easymock.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies>
We will now work through unit-testing a sub-resource locator example from my
last post. The resource classes are reproduced here for convenience:
// In BarResource.java file
import javax.ws.rs.GET;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

public class BarResource {
    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Response get() {
        // Placeholder body; the real resource builds its own response
        return Response.ok().build();
    }
}

// In FooResource.java file
import javax.ws.rs.Path;
import com.google.inject.Inject;
import com.google.inject.Provider;

@Path("/foo")
public class FooResource {
    private final Provider<BarResource> barProvider;

    @Inject
    FooResource(final Provider<BarResource> barProvider) {
        this.barProvider = barProvider;
    }

    // Sub-resource locator: it carries @Path but no HTTP method annotation.
    // A client request for /foo/bar is handed over to the BarResource returned here.
    @Path("bar")
    public BarResource getBar() {
        return barProvider.get();
    }
}

Our goal is to mock the sub-resource BarResource so we can unit-test resource lookup in FooResource without launching a full-scale HTTP client and server. How do we do this? The trick lies in Jersey's InMemoryTestContainerFactory. Unfortunately, it is not obvious that you can provide your own IoC container with this factory. You only need to make a one-line change in the start() method.
Change:
webApp.initiate(resourceConfig);
To:
webApp.initiate(resourceConfig, new GuiceComponentProviderFactory(resourceConfig, injector));
We would have preferred InMemoryTestContainerFactory to be extensible so that we could just pass in our injector. For now we make do by creating our own GuiceInMemoryTestContainerFactory class, based on the InMemoryTestContainerFactory code with this one-line change. I will only show a skeleton implementation here to save space:
public final class GuiceInMemoryTestContainerFactory implements TestContainerFactory {
    private final Injector injector;

    public GuiceInMemoryTestContainerFactory(final Injector injector) {
        this.injector = injector;
    }

    @Override
    public Class<LowLevelAppDescriptor> supports() {
        return LowLevelAppDescriptor.class;
    }

    @Override
    public TestContainer create(final URI baseUri, final AppDescriptor ad) {
        if (!(ad instanceof LowLevelAppDescriptor)) {
            throw new IllegalArgumentException(
                "The application descriptor must be an instance of LowLevelAppDescriptor");
        }
        return new GuiceInMemoryTestContainer(baseUri, (LowLevelAppDescriptor) ad, injector);
    }

    /**
     * The class defines methods for starting/stopping an in-memory test container,
     * and for running tests on the container.
     */
    private static final class GuiceInMemoryTestContainer implements TestContainer {
        // Copy other fields from InMemoryTestContainer here.
        final Injector injector;

        /**
         * Creates an instance of {@link InMemoryTestContainer}
         * @param baseUri URI of the application
         * @param ad instance of {@link LowLevelAppDescriptor}
         */
        private GuiceInMemoryTestContainer(final URI baseUri,
                final LowLevelAppDescriptor ad, final Injector injector) {
            // Copy other statements from InMemoryTestContainer here
            this.injector = injector;
        }

        // Copy other methods from InMemoryTestContainer here

        @Override
        public void start() {
            if (!webApp.isInitiated()) {
                LOGGER.info("Starting low level InMemory test container");
                webApp.initiate(resourceConfig,
                    new GuiceComponentProviderFactory(resourceConfig, injector));
            }
        }
    }
}

Now we use Jersey's test framework JerseyTest to write our unit test for FooResource.
The key elements are:
- Statically initialize a Guice injector;
- Use GuiceInMemoryTestContainerFactory to initialize the test framework;
- Use JerseyServletModule to mock up dependencies.
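import static org.junit.Assert.assertNotNull;

import javax.ws.rs.core.Response;
import org.easymock.EasyMock;
import org.junit.BeforeClass;
import org.junit.Test;
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.google.inject.Provides;
import com.google.inject.Singleton;
import com.sun.jersey.api.client.ClientResponse;
import com.sun.jersey.api.client.WebResource;
import com.sun.jersey.guice.JerseyServletModule;
import com.sun.jersey.test.framework.JerseyTest;

public class FooResourceTest extends JerseyTest {
    private static Injector injector;

    @BeforeClass
    public static void init() {
        injector = Guice.createInjector(new MockServletModule());
    }

    public FooResourceTest() {
        super(new GuiceInMemoryTestContainerFactory(injector));
    }

    @Test
    public void testGetBar() {
        // Record the expected call on the mocked sub-resource
        BarResource barMock = injector.getInstance(BarResource.class);
        EasyMock.expect(barMock.get()).andStubReturn(Response.ok().build());
        EasyMock.replay(barMock);

        WebResource wr = resource();
        ClientResponse r = wr.path("/foo/bar").get(ClientResponse.class);
        assertNotNull(r.getStatus());
        EasyMock.verify(barMock);
    }

    private static class MockServletModule extends JerseyServletModule {
        @Override
        protected void configureServlets() {
            bind(FooResource.class);
        }

        // Singleton scope, so the test and FooResource see the same mock instance
        @Provides
        @Singleton
        BarResource providesBarResource() {
            return EasyMock.createMock(BarResource.class);
        }
    }
}

Run this test and you will be on your way to testing RESTful interactions in a
Guice-enabled POJO fashion.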
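One detail worth checking in a test like this is that the mock the test configures is the very instance the resource receives from its provider. The sketch below illustrates the difference using only the JDK. It is not Jersey, Guice, or EasyMock code: java.util.function.Supplier stands in for Guice's Provider, and RecordingBar, FooUnderTest, and memoize are hypothetical names made up for this illustration.

```java
import java.util.function.Supplier;

// Hypothetical hand-rolled stand-in for a mocked BarResource: it counts calls.
class RecordingBar {
    int calls = 0;

    String get() {
        calls++;
        return "stubbed";
    }
}

// Hypothetical stand-in for FooResource: asks its provider for a Bar per request.
class FooUnderTest {
    private final Supplier<RecordingBar> barProvider;

    FooUnderTest(Supplier<RecordingBar> barProvider) {
        this.barProvider = barProvider;
    }

    String handle() {
        return barProvider.get().get();
    }
}

class SharedMockSketch {
    // Wraps a factory so that every call returns the same instance (singleton scope).
    static <T> Supplier<T> memoize(final Supplier<T> factory) {
        return new Supplier<T>() {
            private T instance;

            @Override
            public synchronized T get() {
                if (instance == null) {
                    instance = factory.get();
                }
                return instance;
            }
        };
    }

    // Returns how many calls the test's own instance observed after one request.
    static int callsSeenByTest(Supplier<RecordingBar> provider) {
        RecordingBar seenByTest = provider.get();  // what the test would configure
        new FooUnderTest(provider).handle();       // what the resource actually uses
        return seenByTest.calls;
    }

    public static void main(String[] args) {
        System.out.println("fresh instance per call: "
            + callsSeenByTest(RecordingBar::new));
        System.out.println("shared instance: "
            + callsSeenByTest(memoize(RecordingBar::new)));
    }
}
```

With a provider that builds a new object on every call, the instance held by the test never sees the request. With the memoized provider it does, which is the behavior you want from a mocked dependency.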
Thursday, June 28, 2012
On-Demand Object Injection with Guice in Jersey
class Bar {
    void doSomething() {
        // Do the actual work here
    }
}

class Foo {
    void process() {
        boolean condition = false;
        // Do something and then set condition
        if (condition) {
            // Create a new Bar instance to do something
            // only when condition is true
            Bar bar = new Bar();
            bar.doSomething();
        }
    }
}

In the example above, a new Bar instance is created on-demand when condition is true. But how would you do this in Guice when you are using Guice to "inject", a.k.a. create, your objects? Guice is designed around the principle of eager dependency specification at the time of object construction. When an object Foo is created, all its dependencies should have been "injected" by Guice during the object construction phase.

This kind of question is typical for a "framework" like Guice. A framework codifies a practice. Guice codifies the Factory pattern. But a framework often obfuscates idioms outside the codified pattern. So does Guice.

How do you "new" an object Bar on-demand without first creating it in the constructor of the enclosing class Foo? It is actually quite easy in Guice. It is called "provider injection", i.e. injecting an object factory. Guice automatically creates a provider for every object class that it injects. So assuming both Bar and Foo are injected by Guice like this:
import com.google.inject.AbstractModule;

class GuiceModule extends AbstractModule {
    @Override
    protected final void configure() {
        bind(Bar.class);
        bind(Foo.class);
    }
}

You can then inject a provider of Bar into Foo so you can ask Guice for a new instance of Bar whenever you need it:
class Bar {
    void doSomething() {
        // Do the actual work here
    }
}

import com.google.inject.Inject;
import com.google.inject.Provider;

class Foo {
    private final Provider<Bar> barProvider;

    @Inject
    Foo(final Provider<Bar> barProvider) {
        this.barProvider = barProvider;
    }

    void process() {
        boolean condition = false;
        // Do something and then set condition
        if (condition) {
            // Ask Guice for a new Bar instance
            // only when condition is true
            Bar bar = barProvider.get();
            bar.doSomething();
        }
    }
}

This is the technique to use when you write sub-resource locators in Jersey with
Guice as the IoC container:
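import javax.ws.rs.GET;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

public class BarResource {
    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Response get() {
        // Placeholder body; the real resource builds its own response
        return Response.ok().build();
    }
}

import javax.ws.rs.Path;
import com.google.inject.Inject;
import com.google.inject.Provider;

@Path("/")
public class FooResource {
    private final Provider<BarResource> barProvider;

    @Inject
    FooResource(final Provider<BarResource> barProvider) {
        this.barProvider = barProvider;
    }

    // Sub-resource locator: it carries @Path but no HTTP method annotation.
    // A client request for /bar is handed over to the BarResource returned here.
    @Path("bar")
    public BarResource getBar() {
        return barProvider.get();
    }
}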
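If you want to try the provider idea without Guice on the classpath, the following minimal sketch mimics it with the JDK alone. It is only an analogy: java.util.function.Supplier plays the role of com.google.inject.Provider, and OnDemandBar, OnDemandFoo, CountingBarProvider, and barsCreated are hypothetical names invented here. The point it shows is that no Bar is constructed until process() actually needs one.

```java
import java.util.function.Supplier;

class OnDemandBar {
    String doSomething() {
        return "done";
    }
}

// Hypothetical provider that counts how many Bars it has handed out.
class CountingBarProvider implements Supplier<OnDemandBar> {
    int created = 0;

    @Override
    public OnDemandBar get() {
        created++;
        return new OnDemandBar();
    }
}

class OnDemandFoo {
    private final Supplier<OnDemandBar> barProvider;

    // The provider is injected eagerly; the Bar itself is not.
    OnDemandFoo(Supplier<OnDemandBar> barProvider) {
        this.barProvider = barProvider;
    }

    // Returns Bar's result, or null when no Bar was needed.
    String process(boolean condition) {
        if (condition) {
            OnDemandBar bar = barProvider.get();
            return bar.doSomething();
        }
        return null;
    }
}

class OnDemandSketch {
    // Runs process() once per flag and reports how many Bars were created overall.
    static int barsCreated(boolean... conditions) {
        CountingBarProvider provider = new CountingBarProvider();
        OnDemandFoo foo = new OnDemandFoo(provider);
        for (boolean condition : conditions) {
            foo.process(condition);
        }
        return provider.created;
    }

    public static void main(String[] args) {
        System.out.println("condition never true: " + barsCreated(false, false));
        System.out.println("condition true twice: " + barsCreated(true, false, true));
    }
}
```

In real Guice code you would inject Provider<Bar> exactly as in the Foo class above. The Supplier here only keeps the sketch dependency-free.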
Monday, June 25, 2012
Poor Man's Static IP for EC2 a.k.a. Elastic Network Interface
First, check IP address binding to each network interface.
$ sudo ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 02:26:69:f0:87:46 brd ff:ff:ff:ff:ff:ff
inet 10.3.1.190/24 brd 10.3.1.255 scope global eth0
inet6 fe80::26:69ff:fef0:8746/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 02:26:69:dc:cc:62 brd ff:ff:ff:ff:ff:ff
From the output, the current network interface assignment is as follows:
eth0: 10.3.1.190
eth1: none
Therefore, the first order of business is to assign the ENI (Elastic Network Interface) IP address to the interface eth1:
$ sudo ip address add 10.3.1.191/24 brd + dev eth1
Next, bring up the interface:
$ sudo ip link set dev eth1 up
Verify that eth1 is indeed up:
$ sudo ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 02:26:69:f0:87:46 brd ff:ff:ff:ff:ff:ff
inet 10.3.1.190/24 brd 10.3.1.255 scope global eth0
inet6 fe80::26:69ff:fef0:8746/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 02:26:69:dc:cc:62 brd ff:ff:ff:ff:ff:ff
inet 10.3.1.191/24 brd 10.3.1.255 scope global eth1
inet6 fe80::26:69ff:fedc:cc62/64 scope link
valid_lft forever preferred_lft forever
Next, find out the default gateway:
$ ip route show
default via 10.3.1.1 dev eth0
10.3.1.0/24 dev eth0 proto kernel scope link src 10.3.1.190
10.3.1.0/24 dev eth1 proto kernel scope link src 10.3.1.191
The default gateway is 10.3.1.1 in the output. It is bound to the virtual gateway associated with the VPC. Since it is currently only bound to eth0, any traffic from eth1 that is destined to IP addresses outside the 10.3.1.0/24 IP block will be dropped! We need to reconfigure IP routing on the elastic instance to allow IP packets leaving eth1 to be routed through the default gateway. Here is how you do it.
First, add a new routing table called "awsein":
$ echo "2 awsein" | sudo tee -a /etc/iproute2/rt_tables
It will add a table called "awsein" to rt_tables as entry 2:
$ cat /etc/iproute2/rt_tables
#
# reserved values
#
255 local
254 main
253 default
0 unspec
#
# local
#
#1 inr.ruhep
2 awsein
Now add a default route to the new table, using the same default gateway as the one used by eth0:
$ sudo ip route add default via 10.3.1.1 dev eth1 table awsein
$ sudo ip route flush cache
Confirm that the new route is indeed added:
$ ip route show table awsein
default via 10.3.1.1 dev eth1 metric 1000
Next, we need to create a new routing rule to trigger the default route on eth1 by its source IP. To do this, we first check the existing rules:
$ ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
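Note the priority 32766 of the rule for "main". Rules are consulted in ascending order of priority, so we will now add a new rule for "awsein" with a priority number smaller than the one for "main", which means it is consulted first.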
$ sudo ip rule add from 10.3.1.191 lookup awsein prio 1000
Finally, verify the new rule configuration:
$ ip rule
0: from all lookup local
1000: from 10.3.1.191 lookup awsein
32766: from all lookup main
32767: from all lookup default
Now you can ssh into the instance using the ENI IP 10.3.1.191! Happy hacking.
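To see why the source-based rule matters, here is a rough sketch of how the rules and tables above interact. It is a deliberately simplified model, not the kernel's implementation: rules are walked in ascending priority, a rule applies if its source selector matches, and the first table that has a route for the destination wins. The class and method names (PolicyRoutingSketch, egressDevice, and so on) are made up for this illustration, and the "local" table contents are an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PolicyRoutingSketch {
    // One routing table entry: a destination prefix and the egress device.
    static final class Route {
        final String prefix;
        final String dev;
        Route(String prefix, String dev) { this.prefix = prefix; this.dev = dev; }
    }

    // One "ip rule" entry; a null source means "from all".
    static final class Rule {
        final int priority;
        final String from;
        final String table;
        Rule(int priority, String from, String table) {
            this.priority = priority; this.from = from; this.table = table;
        }
    }

    static long toLong(String ip) {
        long value = 0;
        for (String part : ip.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }

    static int prefixLength(String cidr) {
        return Integer.parseInt(cidr.split("/")[1]);
    }

    static boolean inPrefix(String ip, String cidr) {
        int bits = prefixLength(cidr);
        long mask = bits == 0 ? 0L : (0xFFFFFFFFL << (32 - bits)) & 0xFFFFFFFFL;
        return (toLong(ip) & mask) == (toLong(cidr.split("/")[0]) & mask);
    }

    // Longest-prefix match within one table; null when no route matches.
    static String lookup(List<Route> table, String dest) {
        Route best = null;
        for (Route r : table) {
            if (inPrefix(dest, r.prefix)
                    && (best == null || prefixLength(r.prefix) > prefixLength(best.prefix))) {
                best = r;
            }
        }
        return best == null ? null : best.dev;
    }

    // Walks the rules in ascending priority and returns the first device found.
    static String egressDevice(List<Rule> rules, Map<String, List<Route>> tables,
                               String source, String dest) {
        List<Rule> ordered = new ArrayList<>(rules);
        ordered.sort(Comparator.comparingInt((Rule r) -> r.priority));
        for (Rule rule : ordered) {
            if (rule.from != null && !rule.from.equals(source)) continue;
            List<Route> table = tables.get(rule.table);
            if (table == null) continue;  // e.g. the empty "default" table
            String dev = lookup(table, dest);
            if (dev != null) return dev;
        }
        return null;
    }

    static Map<String, List<Route>> exampleTables() {
        Map<String, List<Route>> tables = new HashMap<>();
        tables.put("local", Arrays.asList(new Route("127.0.0.0/8", "lo"),
            new Route("10.3.1.190/32", "lo"), new Route("10.3.1.191/32", "lo")));
        tables.put("main", Arrays.asList(new Route("0.0.0.0/0", "eth0"),
            new Route("10.3.1.0/24", "eth0"), new Route("10.3.1.0/24", "eth1")));
        tables.put("awsein", Arrays.asList(new Route("0.0.0.0/0", "eth1")));
        return tables;
    }

    static List<Rule> exampleRules(boolean withAwseinRule) {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule(0, null, "local"));
        if (withAwseinRule) rules.add(new Rule(1000, "10.3.1.191", "awsein"));
        rules.add(new Rule(32766, null, "main"));
        rules.add(new Rule(32767, null, "default"));
        return rules;
    }

    public static void main(String[] args) {
        Map<String, List<Route>> tables = exampleTables();
        System.out.println("without rule 1000: "
            + egressDevice(exampleRules(false), tables, "10.3.1.191", "8.8.8.8"));
        System.out.println("with rule 1000: "
            + egressDevice(exampleRules(true), tables, "10.3.1.191", "8.8.8.8"));
    }
}
```

In this model, a packet sourced from 10.3.1.191 to an outside address leaves through eth0 until rule 1000 is added, after which the "awsein" table sends it out through eth1.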
Monday, January 9, 2012
ec2-bundle-vol Error "cert-ec2.pem: No such file or directory"
error reading certificate file /opt/aws/amitools/ec2/etc/ec2/amitools/cert-ec2.pem: No such file or directory - /opt/aws/amitools/ec2/etc/ec2/amitools/cert-ec2.pem
This looks like a bug in the EC2 tools shipped by Amazon. An easy but tedious workaround is to launch an instance off the original Amazon AMI, i.e., ami-4b814f22, and then copy over the cert-ec2.pem before running ec2-bundle-vol.
This problem has been reported here. Hope Amazon will devise a fix soon to save users from this misery.
Sunday, December 11, 2011
Fixing libssl and libcrypto Errors in Datastax OpsCenter Startup
The AWS Linux AMI I use has openssl 1.0.0 but DataStax OpsCenter 1.3.1 requires version 0.9.8 of libssl and libcrypto. Why didn't they say so in the docs?? The worst customer experience you can give to your user base is to let your software blow up at startup like this:
Failed to load application: libcrypto.so.0.9.8: cannot open shared object file: No such file or directory
This problem was apparently reported a month ago:
http://www.datastax.com/support-forums/topic/issue-starting-opscenterd-service
But no action has been taken to correct it...sigh...
Here is how we can fix it temporarily on our own before the Cassandra devs get their act together:
1) Install openssl 0.9.8
sudo yum install openssl098e-0.9.8e-17.7.amzn1.i686
2) Change to /usr/lib and manually create the following two symbolic links:
sudo ln -s libssl.so.0.9.8e libssl.so.0.9.8
sudo ln -s libcrypto.so.0.9.8e libcrypto.so.0.9.8
Now OpsCenter will start without the dreaded ssl error.
Monday, November 21, 2011
Cassandra Range Query Using CompositeType
CompositeType is a powerful technique for creating indices using regular column families instead of super column families. But there is a dearth of information on how to use CompositeType in Cassandra. Introduced in 0.8.1 in May 2011, it is a relative newcomer to Cassandra. It doesn't help that it is not even in the "official" datatype documentation for Cassandra 1.0 and 0.8! This article pieces together various tidbits to bring you a complete how-to guide on programming CompositeType. The code examples will use Hector.
Let's say we want to define a column family as the following:
row key: string
column key: composite of an integer and a string
column value: string
We can define the following schema on the cli:
create column family MyCF with comparator = 'CompositeType(IntegerType,UTF8Type)' and key_validation_class = 'UTF8Type' and default_validation_class = 'UTF8Type';
We can also define the same schema programmatically in Hector:
// Step 1: Create a cluster
CassandraHostConfigurator chc = new CassandraHostConfigurator("localhost");
Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", chc);

// Step 2: Create the schema
ColumnFamilyDefinition myCfd = HFactory.createColumnFamilyDefinition(
    "MyKS", "MyCF", ComparatorType.COMPOSITETYPE);
// Thanks to Shane Perry for this tip:
// http://groups.google.com/group/hector-users/browse_thread/thread/ffd0895a17c7b43e
myCfd.setComparatorTypeAlias("(IntegerType, UTF8Type)");
myCfd.setKeyValidationClass(UTF8Type.class.getName());
myCfd.setDefaultValidationClass(UTF8Type.class.getName());
KeyspaceDefinition myKs = HFactory.createKeyspaceDefinition(
    "MyKS", ThriftKsDef.DEF_STRATEGY_CLASS, 1, Arrays.asList(myCfd));

// Step 3: Add schema to the cluster
cluster.addKeyspace(myKs, true);
// createKeyspace takes the keyspace name, not the definition
Keyspace ks = HFactory.createKeyspace("MyKS", cluster);
Now let's insert a single row with 2 columns:
String rowKey = "row1";

// First column key
Composite colKey1 = new Composite();
colKey1.addComponent(1, IntegerSerializer.get());
colKey1.addComponent("c1", StringSerializer.get());

// Second column key
Composite colKey2 = new Composite();
colKey2.addComponent(2, IntegerSerializer.get());
colKey2.addComponent("c2", StringSerializer.get());

// Insert both columns into row1 at once.
// The row key is a string, so the mutator needs a StringSerializer.
Mutator<String> m = HFactory.createMutator(ks, StringSerializer.get());
m.addInsertion(rowKey, "MyCF",
    HFactory.createColumn(colKey1, "foo", new CompositeSerializer(), StringSerializer.get()));
m.addInsertion(rowKey, "MyCF",
    HFactory.createColumn(colKey2, "bar", new CompositeSerializer(), StringSerializer.get()));
m.execute();
After the insertion, the column family should look like this table:
row1 | {1, c1} | {2, c2} |
     | foo     | bar     |
Now let's retrieve the first column using a slice query on only the first integer component of the composite column key. Since Cassandra orders composite keys component by component, we can construct a search range from {0, "a"} to {1, "\uFFFF"} which will include {1, "c1"} but not {2, "c2"}.
SliceQuery<String, Composite, String> sq = HFactory.createSliceQuery(
    ks, StringSerializer.get(), new CompositeSerializer(), StringSerializer.get());
sq.setColumnFamily("MyCF");
sq.setKey("row1");

// Create a composite search range
Composite start = new Composite();
start.addComponent(0, IntegerSerializer.get());
start.addComponent("a", StringSerializer.get());
Composite finish = new Composite();
finish.addComponent(1, IntegerSerializer.get());
finish.addComponent(Character.toString(Character.MAX_VALUE), StringSerializer.get());
sq.setRange(start, finish, false, 100);

// Now search.
sq.execute();
// TODO: Parse the result to get the first column

It is unfortunate that a JavaDoc typo in the Cassandra source code prevents tools like Eclipse from displaying documentation about CompositeType. But you can always view the source online to get the precise definition and encoding scheme of CompositeType. Reading source code has been and is still the best way of learning new features in Cassandra.
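The component-by-component ordering behind the range query can be tried out without a Cassandra cluster. The sketch below is only a model built on the JDK's TreeMap: CompositeRangeSketch, Key, and slice are hypothetical names made up for this illustration, and Java's String ordering is assumed to be close enough to UTF8Type for these ASCII keys. It is not how Hector or Cassandra encode composites.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

class CompositeRangeSketch {
    // A two-component column key: an integer followed by a string.
    static final class Key {
        final int number;
        final String name;

        Key(int number, String name) {
            this.number = number;
            this.name = name;
        }

        @Override
        public String toString() {
            return "{" + number + ", " + name + "}";
        }
    }

    // Compare the first component; fall back to the second only on a tie.
    static final Comparator<Key> BY_COMPONENTS =
        Comparator.comparingInt((Key k) -> k.number)
                  .thenComparing((Key k) -> k.name);

    // The single row from the example: two columns sorted by composite key.
    static NavigableMap<Key, String> sampleRow() {
        NavigableMap<Key, String> row = new TreeMap<>(BY_COMPONENTS);
        row.put(new Key(1, "c1"), "foo");
        row.put(new Key(2, "c2"), "bar");
        return row;
    }

    // Returns the columns whose keys fall between start and finish, inclusive.
    static NavigableMap<Key, String> slice(NavigableMap<Key, String> row,
                                           Key start, Key finish) {
        return row.subMap(start, true, finish, true);
    }

    public static void main(String[] args) {
        Key start = new Key(0, "a");
        Key finish = new Key(1, String.valueOf(Character.MAX_VALUE));
        System.out.println(slice(sampleRow(), start, finish));
    }
}
```

Because {1, "c1"} sorts after {0, "a"} and before {1, "\uFFFF"}, it lands inside the range, while {2, "c2"} sorts after the end of the range and is left out.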