Category Archives: Inverse Rest

Architecture style which scales across nodes using the same principles REST uses to scale across users. Client cache and code on demand figure prevalently to make small data and code local to big data.

Game Over for VMs, what’s next?

That includes the Java VM.  Yes, you heard it.  I’ve been writing Java since 1996 and in 2016 I can officially say that all the reasons I supported Java for all these years no longer apply.  I accurately predicted the rise of Java when the technology was literally a laughing stock, and I have stuck with it for very good reasons until now.  What’s Changed?  And more importantly: What’s Next?

FYI: this post relates to mission critical enterprise software, not desktop software

What’s Changed?

  • Back in the day, there were legions of different processors, endian issues, and not even agreement on how big a “byte” was.  Now only three viable von Neumann architectures exist:  Intel, Intel, and Intel.  The von Neumann architecture itself is dead too.  We’ll get to that.
  • In yesteryear, networks were slow so you had to move the code around to avoid moving the data across the network.  Java Applets and Hadoop are good examples of this.  Java was ideal for this because of platform independence, dynamic class loading, and dynamic compilation to native code.  Now it is actually the software which is slowing the networks today, not the other way around.  We’ll get to that.
  • In the old days, operating systems vied for superiority, spreading FUD as they went.  No one knew who would win (nail-biter).  Now there are only three operating systems vying for dominance: and they are all flavors of Linux.

What’s Next?

Spinning up a Linux container has literally almost no overhead, and yet has enterprise class resource management, security, and robustness.  The industry currently focuses on microservices as a design pattern for client side applications, however this pattern applies equally to server-side applications as well.  New Linux flavors like CoreOS and Alpine build on this concept where everything except the kernel is a microservice operating in a container.  This allows very high levels of performance, security, and efficiency in the kernel that all the other services rely upon.  These new server-side microservice platforms provide all the enterprise class deployment, management, monitoring, security, and interoperability that the Java Platform delivered 21 years ago, without the need for a virtual machine of any kind.  Server-side microservices provide both resource isolation and maximum performance at the same time: at a level which neither the Java VM nor any VM can come close to matching.  And what would be the language of choice for implementing these world changing server-side microservices?

The OS itself is implemented in C, so naturally any server-side microservice not implemented in C will have a very hard time competing with those who are.  Note that the container model completely eliminates the normal native package management hell associated with C, even to the point where an “Apple Store” for containers was recently announced by Docker.

Marketplaces like the Docker Store allow purchasing an entire server cluster pre-configured with server-side microservices of your choice on any cloud platform or even in a local bare metal data center.  The same solution also solves the cloud vendor lock-in that many companies have been struggling with.  Like I said: GAME OVER

On a final note: von Neumann microprocessors no longer fit Moore’s Law and price / performance ratios been degrading for some time now.  The data volumes and low latency requirements of the Internet-of-Things will soon place unbearable pressures on the von Neumann microprocessor model.  Programmable hardware such as FPGA have traditionally required learning different languages and complete software re-write to take advantage of programmable processor architecture.  Seymour Cray founded a company 20 years ago with this exact realization, and the graphic below says it all.  If you want to leverage the power of the Saturn FPGA cartridge, you’ll be writing your code in C, thank you very much 🙂




Watching the adoption of Native Scala is like watching the faces of the Lemmings when…

They realize all their “little friends” have been leading them off the face of a cliff.  I know it’s a bit macabre, but *can* I watch?  The Oracle-hating (read success-hating) that birthed Scala is finally coming to a head.  lemmings

By their fruits you shall know them.  The final fraudulent claim: all the trash code produced by Scala’s over-complexity will somehow run faster natively, and that the saving grace is finally getting rid of that darn JVM.  As I have said previously FYI: creating a new language does’t make your writing better.  Java used to have ahead-of-time compilation too, but that died off because dynamic run time application redefinition is more valuable in a business sense then statically compiled code.  It is easy to criticize a mature, free, and broadly adopted platform like the JVM for the minor flaws it has, and very hard to create a new native platform that actually works.  More likely Scala will loose the tiny market share it has long before this native platform kills off the stragglers.

But seriously, is it too much to hope that *all* the talent-haters and success-haters will be killed off by this one Darwin Award qualifying mass extinction event?


Hadoop Developer Training by Matt

100% PowerPoint free!  Learn at your own pace with 10 hours of video training and 15+ hours of in-depth exercises created by Matt Pouttu-Clarke and available exclusively through Harvard Innovation Labs on the Experfy Training Platform.

  • Learn to develop industrial strength MapReduce applications hands-on with tricks of the trade you will find nowhere else.
  • Apply advanced concepts such as Monte Carlo Simulations, Intelligent Hashing, Push Predicates, and Partition Pruning.
  • Learn to produce truly reusable User Defined Functions (UDFs) which stand the test of time and work seamlessly with multiple Hadoop tools and distributions.
  • Learn the latest industry best practices on how to utilize Hadoop ecosystem tools such as Hive, Pig, Flume, Sqoop, and Oozie in an Enterprise context.

Click here for more info!

Oracle SWiS DAX APIs: instant classic

The Software In Silicon Data Analytic Accelerator (SWiS DAX) APIs released by Oracle this week signify a sea change for big data and fast data analytic processing.  SWiS-DAXNatively accelerated common analytic functions usable in C, Python, and Java have already shown a 6x lift for a Spark cube building applicationApache Flink and Apache Drill completely eclipse Spark performance so it will be very interesting to see upcoming benchmarks of these higher performing frameworks on SWiS DAX.  There is nothing to keep any vendor or group from bench marking with these APIs as they will work with any C, Python, or Java application.

I’m also looking forward to testing performance of SWiS DAX on non-partitionable data sets in a big memory SMP architecture as well.  The easy problems are partitionable, and true data discovery should allow any-to-any relations without injecting a-priori partitioning assumptions.

It seems that Oracle’s long standing commitment to developing Sun’s Sparc processors is about to pay off in a very big way for big data and fast data analysts.

FYI: creating a new language does’t make your writing better

Remember back in the the day when the some kids would invent their own language like Pig Latin and then talk it amougst themselves?  And if you didn’t talk to them in Pig Latin you just weren’t cool enough to be part of the conversation.  I was more like the “why bother” kid.  Not that I couldn’t learn Pig Latin, I just didn’t see the point.  Seriously, not that much changes.

Don’t get me wrong, I’m being hard on Berkley out of love.  Several of my relatives went there during the People’s Park era.  I get that having your own language like Swift or Go provides artificial culture creation and built-in revenue protection.  That’s fine if you want to program to the Google or Apple cool-aid culture and plug right in.  Not my thing, but it takes all kinds.  It’s just that I’m not getting this whole Scala thing.  Why it exists.

Ok so I get that for a lot of people the Java music died when Scott McNealy finally sold out to Oracle.  And the whole Bill Joy thing and how that was handled…  Shameful, all.  Oracle’s attempts to profiteer and build an empire off Java with things like the OpenOffice purchase: ridiculous.  But the funny thing is, Oracle has taken a beating from their industry customers (like me) and has actually realized that being a great Java stakeholder is the best chance they have of preserving market share.  Of course they would never admit that publicly but that’s what I love about Oracle: they spend a lot more time delivering than talking about it.  They’re kind of like the quiet doer kid who thinks he’s Iron Man.

The thing I don’t get is that if you want to go functional, why not go all the way?  Why create a Java hybrid, and then keep backing it even when the parent language already added functional constructs?  Lisp has been around forever.  Heck back in the 90s I did functional programming in C with void pointers.  And Java *gained* more than twice the market share than Scala ever had between 2015 and 2016.  Talk about a moving target.

So lets take a look at a shining example of how Scala is such a more elegant and maintainable language than Java:

Spaghetti alla Spark e Scala del primo

So a couple of points about the above, which sits at the core of the Spark tool which is driving most of this Scala noise:

  1. I’ve seen better spaghetti code written in C using goto statements
  2. No knowledge of and/or respect for Gang of Four patterns, Enterprise Integration Patterns, or any other principles self-respecting programmers follow
  3. Hard coding Hadoop dependencies in because that’s for sure how I’ll store data when everything is on Non-volatile RAM

It’s not really about the language in the end: it’s about who’s writing it and the quality and integrity of what is written.

So I just want to say it clearly and definitively for all to hear: Twitter is dead wrong, Nathan Marz is right, and the coolest kids are alive and well and speaking Clojure just because they love it.  Nothing personal.

Why I tried Apache Spark, and moved on..

I tried Apache Spark, and moved on.  Here’s why:

Resourcing Issuesspark-do-not-enter

Apache Spark, written in Scala, causes severe resourcing issues for customers due to the additional technical skill requirements:

  1. Scala ranks #30 with 0.5 % in market saturation, while Java ranks #1 with 21.5% of the market, a difference of 4300%:
  2. Introduction of native Functional Programming constructs into the Java language with release 1.8 practically eliminates the business case for Scala altogether:
  3. Scala works best with IntelliJ Idea IDE, which has licensing costs and is extremely unlikely to replace free Eclipse tooling at any large company
  4. Scala is among a crowd of strong contenders and faces a moving target as Java has gained 5% in market share between 2015 and 2016.  To put this in perspective, Scala has less market share than Lisp

Consistency and Integrity Issues

Trying to get Spark to meet rigorous standards of data consistency and integrity proves difficult.  Apache Spark’s design originates from companies who consider Data Consistency and Data Integrity secondary concerns, while most industries consider these primary concerns.  For example, achieving at-most-once and at-least-once consistency from Spark requires numerous workarounds and hacks:

Dependency Hell with a Vengeance

Apache Spark (and Scala) import a huge number of transitive dependencies compared to other alternative technologies.  Programmers must master all of those dependencies in order to master Spark.  No wonder very few true experts in Spark exist in the market today.

What’s the Alternative to Spark?

For real-time in-memory processing Use Case: data grids, once the purview of blue chip commercial vendors, now have very strong open source competition.  Primary contenders include Apache Ignite and Hazelcast.

For fast SQL analytics (OLAP) Use Case: Apache Drill provides similar performance to Spark SQL with a much simpler, more efficient, and more robust footprint.  Apache Kylin from eBay looks to become a major OLAP player very quickly, although I have not used it myself.

For stream processing Use Case: Apache Beam from Google looks likely to become the de-facto streaming workhorse, unseating both Apache Flink and Spark Streaming.  Major big data vendors have already contributed Apache Beam execution engines for both Flink and Spark, before Beam even officially hit incubation.

If you try these alternative technologies, and compare to Spark, I’m sure you’ll agree that Spark isn’t worth the headache.


Hermetic Lambdas: a Solution for Big Data Dependency Collisions

Dependency collisions furnish plenty of headaches and late nights while implementing Lambda Architecture.  Java class loading containers tend toward complexity and many frameworks such as Hadoop, JCascalog, and Storm provide tons of transitive dependencies but no dependency isolation for third party Lambda code deployed in the same JVM.  So currently to benefit from running in the same JVM, all the transitive dependencies come along for the ride.  This impacts:

  • Portability: Lambda logic shared in the batch layer and the speed layer must cope with at least two different sets of dependencies.
  • Manageability: framework version upgrades may cause hard failures or other incompatibilities in Lambda logic due to changing dependency versions.
  • Performance: solving the problem by forcing JVM isolation incurs overhead from IPC or network chatter.  Likewise, utilizing scripting or dynamic languages merely hides the problem rather than solving it, while imposing additional performance penalties.
  • Complexity: developing Lambdas requires knowledge and accommodation of all the dependencies of all frameworks targeted for deployment.

Many developers and architects rightfully shy away from deploying bloated dependency container frameworks such as Spring or OSGI inside of a big data framework.  Big data and fast data provide enough complexity already.  So lets look at how we can use a simple pattern, along with basic core Java libraries, to avoid “Dependency Hell“.

Signs you may be encountering this problem include these types of exceptions occurring after deployment:

  • java.lang.NoSuchMethodError: when a method no longer exists which your code depends on
  • java.lang.ClassNotFoundException: when a class your code uses cannot be found in the deployed class path
  • java.lang.IllegalAccessException: when conflicting versions of an API mark methods or fields private

Please reference my github project for a complete working implementation along with unit tests.

Shows how the Lambda executes within the big data framework in an isolated dependency context
Lambda Execution Model

Hermetic Classloading

The Java Service Loader framework introduced in JDK 1.6 provides a way to dynamically bind an implementation class to an interface while specifying an alternate class loader.  Loading a Lamda implementation with ServiceLoader forces all code called within the context of the Lamda to use the same class loader.  This allows creating a service which supports parent-first class loading, child-first class loading, or child-only (or hermetic) class loading.  In this case, the hermetic loader prevents any possible dependency collisions.  To create a hermetic loader, we need simply utilize the built-in URLClassLoader in Java.  Our implementation jar can reside on the local file system, on a web server, or in HDFS: anywhere a URL can point to.  For the parent class loader, we specify the Java bootstrap class loader.  So we can implement a hermetic class loading pattern in one line of code:

ServiceLoader<Map> loader = ServiceLoader.load(Map.class, 
    new URLClassLoader(urls, Map.class.getClassLoader()));

Note that we intentionally avoid calling ClassLoader.getSystemClassLoader() in order to prevent the calling context (such as Hadoop or Hive) form polluting the Lambda class path.  Core packages such as java.lang and java.util use the bootstrap class loader, which only carries core Java dependencies shipped as part of the JDK.  The diagram above shows how the LambdaBoot framework fits within a big data framework such as Hadoop.

Mapping the World

In the example above, we use a Map interface to interact with the Lambda.  This allows us to avoid having a separate jar containing a Service Provider Interface (SPI).  Instead, we can subclass the Map and provide any behavior desired by intercepting get and put calls to specific keys.  By optionally returning a Future from a get or put call on the Map, we get asynchronous interaction as well if desired.  In addition, the non-sensitive Map keys can provide metadata facilities.  In the example of linear regression implemented as a Map, we use a conflicting version of Apache Commons Math not yet supported by Hadoop to calculate the regression.

Shows how the LambdaBoot framework can work with deployment tools such as Jenkins, Puppet, and Git
Lambda Deployment Model


Using a very simple pattern, we can deploy Hermetic Lambdas to any Java environment without fear of dependency collisions.  The ServiceLoader also acts as a service registry, allowing us to browse metadata about available Lambdas, dynamically load new Lamdas, or update existing Lambdas.

Titan: Cassandra vs. Hazelcast persistence benchmark

10 node load test comparison using Amazon EC2 SSD-based instances.  1 billion vertices and 1 billion edges processed for each test run.  Used the titan-loadtest project to run each test.


Experiment maximizes data locality by co-locating load generation, Titan graph database, and Cassandra/Hazelcast within the same JVM instance while partitioning data across a cluster. Exploration of methods for tuning garbage collection, Titan, and Cassandra for the peer computing use case.

The following components were utilized during the experiment:

Technology Version
RHEL x64 HVM AMI 6.4
Oracle JDK x64 1.7_45
Apache Cassandra 1.2.9
Hazelcast 3.1.1
Titan 0.3.2

Each test iteration has 6 read ratio phases starting with 0% reads (100% writes) all the way up to 90% reads and 10% writes.  For all tests, the persistence implementation executes in the same JVM as Titan to avoid unnecessary context switching and serialization overhead.  Tests were conducted using an Amazon placement group to ensure instances resided on the same subnet.  The storage was formatted with 4K blocks and used the noop scheduler to improve latency.

For each phase, new vertices were added with one edge linking back to a previous vertex.  No tests of update or delete were conducted.

Please see the titan-loadtest project above for all Cassandra and Hazelcast settings and configurations used in the test.


Titan 10 node Summary
Titan 10 node Summary

Please note: the results are listed in rates of thousands of vertices per second and include the creation of an edge as well as a vertex. Also, the Hazelcast SSD x1 results used a custom flash storage module for Hazelcast developed privately so those results are not replicable without that module installed.


Hazelcast performed better than Cassandra for all tests and demonstrated one order of magnitude better performance on reads.  Surprisingly, Hazelcast slightly outperformed Cassandra for writes as well.

Large Java Heap with the G1 Collector – Part 1

Demonstration of efficient garbage collection on JVM heap sizes in excess of 128 gigabytes.  Garbage collection behavior analyzed via a mini-max optimization strategy.  Study measures maximum throughput versus minimum garbage collection pause to find optimization “sweet spot” across experimental phases.  Replicable results via well-defined open source experimental methods executable on Amazon EC2 hardware. 

Experimental Method


  • Demonstrate maximum feasible JVM size on current cloud hardware (specific to Amazon for now) using the G1 Garbage Collector (link).
  • Vary the JVM heap size (-Xmx) exponentially to find performance profile and breaking points.
  • Vary the ratio of new versus old generation objects exponentially.
  • Using in-memory workload to stress the JVM (avoids network or disk waits).
  • Produce replicable results on commodity hardware, open source operating systems, and open source tools.
  • Provide gap-free data for analysis, in spite of garbage collection pauses.

Not (Yet) in Scope

In followup to this study, subsequent efforts may include:

  • Vary the number of old objects up to maximum possible memory capacity.
  • Vary the number of processing threads to starve the concurrent G1 algorithm of CPU cycles.
  • Vary the object size.
  • Vary the field size within the object.
  • Vary the G1 JVM parameters.

Tools and Versions

  1. Amazon Linux AMI (link)
  2. One cr1.8xlarge instance (link): 244 GiB RAM, 2 x Intel Xeon E5-2670 (link)
  3. Oracle JDK 1.7_15

JVM Parameters

This experiment varied the JVM heap size but kept other parameters constant.

  • -Xmx16g, -Xmx32g, -Xmx64g, -Xmx128g, -Xmx212g
  • -XX:+UseG1GC
  • -XX:InitiatingHeapOccupancyPercent=0
  • -XX:MaxGCPauseMillis=200
  • -XX:MaxTenuringThreshold=25
  • -XX:ParallelGCThreads=32
  • -XX:ConcGCThreads=32
  • -XX:G1ReservePercent=10
  • -XX:G1HeapRegionSize=32m

Program Parameters

These parameters were kept constant for this study.

  • statInterval = 1000: the reporting interval for measures in milliseconds.
  • numThreads = 24: number of processing threads.
  • objectsPerThread = 8,000,000: number of objects to produce for each phase.
  • maxOld = 800,000: max number of old references to maintain, after which old references are overwritten with new values.

Experimental Phases

For each JVM heap size, the following phases are executed.  For each phase, after maxOld reached object references get overwritten and must be garbage collected.

  • RANDOM_WRITE_0: All of the objects produced are instant garbage, references not kept.
  • RANDOM_WRITE_1: All of the objects references kept.
  • RANDOM_WRITE_2: 1/2 of object references kept.
  • RANDOM_WRITE_4: 1/4 of object references kept.
  • RANDOM_WRITE_8: 1/8 of object references kept.

Experiment Results

Sum of Garbage Collection Pause
Sum of Garbage Collection Pause

The graph above shows the sum total of *all* garbage collection pauses across each experimental phase.

Min of Object Throughput
Min of Object Throughput (per second)

The graph above shows the *min* throughput for each second across each experimental phase.

Min of Throughput to Max of Garbage Collection Pause (per second)
Min of Throughput to Max of Garbage Collection Pause (per second)

The line chart above shows log scale throughput versus standard scale garbage collection pause across all seconds in each experimental phase.

Min of Throughput to Max of Garbage Collection Pause (per second)
Min of Throughput to Max of Garbage Collection Pause (per second)

The radar chart above shows log scale throughput versus standard scale garbage collection pause across all seconds in each experimental phase.


Java garbage collection test class:

import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;

public class GCTest {

    public static String configName = "Test";
    public static int statInterval = 1000;
    public static int numThreads = 4;
    public static int objectPerThread = 500000;
    public static int maxOld = 1000000;

    public static enum ExperimentPhase {

    public static int phaseIdx;
    public static ExperimentPhase phase;

    public static CyclicBarrier barrier;

    public static int[] oldToYoungRatios = new int[] { 0, 8, 4, 2, 1 };

    public static volatile boolean done = false;

    public static AtomicLong totalOfInterval = new AtomicLong();

    public static Object[] oldArr = new Object[maxOld];

    public static ConcurrentHashMap<Integer, Map<String, Object>> oldMap = new ConcurrentHashMap<Integer, Map<String, Object>>(

     * Generates maximum amount of objects with nested references. Saves some
     * references so they go to the old generation.
     * @author pouttum
    public static class GarbageThread extends Thread {

        protected int threadNo;

        public GarbageThread(int threadNo) {
            this.threadNo = threadNo;

        public void run() {


            // Incremental update phase
            for (int x = 0; x < maxOld; x++) {
                if (x % numThreads == threadNo) {
                    oldArr[x] = getDoc(x);


            // Incremental clear phase
            for (int x = 0; x < maxOld; x++) {
                if (x % numThreads == threadNo) {
                    oldArr[x] = null;

            // Random write / update phase
            for (int r = 0; r < oldToYoungRatios.length; r++) {
                for (int x = 0; x < objectPerThread; x++) {
                    Map<String, Object> doc = getDoc(x);
                    if (oldToYoungRatios[r] > 0
                            && (oldToYoungRatios[r] == 1 || (x
                                    % oldToYoungRatios[r] == 0))) {
                        int index = (int) (Math.ceil(random() * maxOld));
                        oldMap.put(index, doc);


            // Random read phase
            for (int x = 0; x < objectPerThread; x++) {
                int index = (int) (Math.ceil(random() * maxOld));

        protected void await() {
            try {
            } catch (Exception e) {

        protected HashMap<String, Object> getDoc(int x) {
            HashMap<String, Object> doc = new HashMap<String, Object>();
            doc.put("value1", "value1" + String.valueOf(x));
            doc.put("value2", "value2" + String.valueOf(x));
            doc.put("value3", "value3" + String.valueOf(x));
            doc.put("value4", "value4" + String.valueOf(x));
            doc.put("value5", "value5" + String.valueOf(x));
            doc.put("value6", "value6" + String.valueOf(x));
            doc.put("value7", "value7" + String.valueOf(x));
            doc.put("value8", "value8" + String.valueOf(x));
            doc.put("value9", "value9" + String.valueOf(x));
            return doc;

        protected double random() {
            return Math.random();


     * Calculates ongoing stats and keeps history on the stat interval.
     * @author pouttum
    public static class StatThread extends Thread {

        public void run() {
            Date previousDate = new Date();
            long adjStatInterval = statInterval;
            int intervalCount = 0;
            do {
                try {
                } catch (InterruptedException e) {
                    done = true;
                adjStatInterval = statInterval;
                long intervalTotal = totalOfInterval.getAndSet(0L);
                Date date = new Date();
                double intervalSeconds = (date.getTime() - previousDate
                        .getTime()) / 1000d;

                StringBuilder stats = new StringBuilder(1024);
                float statIntervalDouble = statInterval / 1000f;
                double gcPause = intervalSeconds - statIntervalDouble;
                if (intervalSeconds > statIntervalDouble * 2) {
                    double x = statIntervalDouble * 2;
                    for (; x < intervalSeconds; x += statIntervalDouble) {
                                configName, phase, ++intervalCount, 0,
                        gcPause -= statIntervalDouble;
                if (gcPause > 0.0d) { // Credit the next interval with some of
                                        // the count of this interval
                    adjStatInterval -= gcPause * 1000L;
                    long intervalTotalAdj = Math
                            .round((gcPause / statIntervalDouble)
                                    * intervalTotal);
                    intervalTotal -= intervalTotalAdj;
                        configName, phase, ++intervalCount, intervalTotal,
                        Math.max(gcPause, 0.0d)));
                previousDate = date;
            } while (!done);


     * * @param args
     * @throws InterruptedException
     * */
    public static void main(String[] args) throws Exception {
        if (args.length == 5) {
            configName = args[0];
            statInterval = Integer.parseInt(args[1]);
            numThreads = Integer.parseInt(args[2]);
            objectPerThread = Integer.parseInt(args[3]);
            maxOld = Integer.parseInt(args[4]);
        barrier = new CyclicBarrier(numThreads,
                new Runnable() {
                    public void run() {
                        phase = ExperimentPhase.values()[phaseIdx++];
        GarbageThread[] threads = new GarbageThread[numThreads];
        for (int x = 0; x < threads.length; x++) {
            threads[x] = new GarbageThread(x);
        StatThread statThread = new StatThread();
        for (int x = 0; x < threads.length; x++) {
            try {
            } catch (InterruptedException e) {
        done = true;


Compile the code:


Unix script

jdk1.7.0_15/bin/java -server -Xms16g -Xmx16g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 -XX:MaxGCPauseMillis=200 -XX:MaxTenuringThreshold=25 -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1ReservePercent=10 -XX:G1HeapRegionSize=32m GCTest mxg16 1000 24 8000000 800000 > baseline.csv
jdk1.7.0_15/bin/java -server -Xms32g -Xmx32g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 -XX:MaxGCPauseMillis=200 -XX:MaxTenuringThreshold=25 -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1ReservePercent=10 -XX:G1HeapRegionSize=32m GCTest mxg32 1000 24 8000000 800000 >> baseline.csv
jdk1.7.0_15/bin/java -server -Xms64g -Xmx64g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 -XX:MaxGCPauseMillis=200 -XX:MaxTenuringThreshold=25 -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1ReservePercent=10 -XX:G1HeapRegionSize=32m GCTest mxg64 1000 24 8000000 800000 >> baseline.csv
jdk1.7.0_15/bin/java -server -Xms128g -Xmx128g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 -XX:MaxGCPauseMillis=200 -XX:MaxTenuringThreshold=25 -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1ReservePercent=10 -XX:G1HeapRegionSize=32m GCTest mxg128 1000 24 8000000 800000 >> baseline.csv
jdk1.7.0_15/bin/java -server -Xms212g -Xmx212g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 -XX:MaxGCPauseMillis=200 -XX:MaxTenuringThreshold=25 -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1ReservePercent=10 -XX:G1HeapRegionSize=32m GCTest old_ratio_8 1000 24 8000000 800000 >> baseline.csv

CouchDB and Hadoop

CouchDB, a top level Apache project, utilizes a RESTful interface to serve up dynamic JSON data on an Internet scale.  CouchDB provides ACID guarantees with no locking using Multi-version Concurrency control, and also scales via Shared-nothing deployment using multi-master replication.  Data access leverages a map/reduce query model expressed in JavaScript, as detailed below.  This page provides a brief overview of CouchDB and potential utilization in a Hadoop Amazon Elastic MapReduce deployment.

CouchDB by Example

Looking at the Apache pages utilizes some useful examples of how to use CouchDB.  However, I found this cool Mu Dynamics simulator which shows how to query CouchDB by example.  I found that the simulator quickly clarifies how the Map/Reduce query processing works with concrete problems.  A much faster way to kick the tires than building and installing from scratch…

Scalability and Performance

The Apache wiki provides some pointers on CouchDB performance.  It seems that striping (RAID 0) provides the best performance improvements in the benchmarks I’ve seen, esp in environments with many users and a lot of random reads (see below).  CouchDB doesn’t support parallel query out of the box but opening multiple sockets to the same server would allow parallel access to data for a single client.

The map/reduce query capability produces materialized views automatically: featuring automated incremental refresh.  Therefore if the data doesn’t change in a view then each query does not re-query the base data.  By default the first query after a committed change to the base data triggers the incremental refresh of the view (this is configurable).

In terms of scalability, multi-master replication provides multiple data copies across concurrent machines and/or data centers.  Via the stateless RESTful interface load balancing doesn’t get any easier…  Also, sharding is not supported out of the box but solutions like Lounge and Pillow look promising for supporting larger data sets.

Cloud Deployment

Check out this blog post on how to spin up a CouchDB cluster on Amazon EC2 using Puppet. Seems an ideal candidate for Elastic Load Balancing: the ability to automatically spin up instances as demand increases should really make CouchDB live up to it’s name!

Potential Use Cases With Hadoop

Use of CouchDB to serve key/value reference data in an Inverse REST architecture seems an ideal use case.  Since we could maintain the CouchDB data using a separate GUI we could correct reference data or meta data dynamically as the job is running.  No need to wait around until a large job has crashed and burned to fix reference data problems.  Probably also we would want to have an LRU cache on the client side (in the Hadoop job) to minimize network hits.


Although it may seem counter-intuitive, leveraging the multi-layered REST caching and load balancing architecture to move small data close to big data could make a lot of sense, especially in a cloud environment designed to adapt dynamically to large spikes.