
Helping to Build the Big Data Ecosystem for Normals

Distributed systems are hard. They are hard to set up, hard to program, and hard to maintain. What is more, this has been the case for decades. And yet the incredible success of the shining stars of the "Big Data" movement (Google, Amazon and Facebook) has come about directly as a result of their ability to harness massive distributed systems infrastructure for data storage and analytics. For companies looking to follow these leaders, the Hadoop open source project and its associated ecosystem of subprojects have been an incredibly powerful source of leverage.

But if Hadoop forms the kernel of this new distributed operating system*, there is still a lot of work to do before the typical Fortune 500 company can take advantage of these benefits, both in working with the existing components and in supplying key functionality that enterprise workloads still lack.

This is why we are so excited to be investing in the sqrrl team. These guys created Apache Accumulo, a secure, extensible, and very scalable database that runs inside the Hadoop environment in much the same way that HBase does today. Accumulo has been in production use at the NSA (where the team is from) since 2008. It has handled a workload whose size is double that of the largest known Hadoop cluster and orders of magnitude larger than most of the current crop of Fortune 500 Hadoop clusters.** This team knows scale, and we're looking forward to seeing how this gets brought into the commercial domain.

Accumulo has joined a set of very meaningful Apache Hadoop projects, so it has big shoes to fill. For its security and scalability, I believe it will turn heads as developers look to robust infrastructure for mission-critical applications. But I am equally excited about some of the ways in which the architecture of the database (currently most visible in their server-side iterator framework) will allow the team to push the boundaries of how analytics happen as the data is streamed into the store.
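
To make the iterator idea a little more concrete, here is a minimal Python sketch of the general pattern: the store itself applies a combining function to the values written under each key, so a reader gets aggregated results without pulling raw entries back to the client. This is a toy model under my own assumptions, not Accumulo's API; the `ToyStore` class, its `write` and `scan` methods, and the `page:` keys are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


class ToyStore:
    """Toy in-memory key-value store (hypothetical, not Accumulo's API).

    A combiner function is registered with the store and applied
    inside the store when entries are scanned, which loosely mimics
    the idea of a server-side iterator.
    """

    def __init__(self, combiner: Callable[[Iterable[int]], int]) -> None:
        self._combiner = combiner
        self._entries: Dict[str, List[int]] = defaultdict(list)

    def write(self, key: str, value: int) -> None:
        # Ingest path: every write for a key is kept as a separate entry.
        self._entries[key].append(value)

    def scan(self) -> Iterator[Tuple[str, int]]:
        # Read path: keys come back in sorted order, and the combiner
        # collapses all entries for a key into a single value.
        for key in sorted(self._entries):
            yield key, self._combiner(self._entries[key])


# Count page views as they stream in; the client only sees the totals.
store = ToyStore(combiner=sum)
for key, value in [("page:a", 1), ("page:b", 1), ("page:a", 1)]:
    store.write(key, value)

totals = list(store.scan())
print(totals)
```

The design point is simply that the aggregation logic lives next to the data, so the analytics happen where the entries are stored instead of in a separate client-side pass.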

Finally, it is fantastic to see the commitment of the team in uprooting itself and moving to the home of Big Data here in Boston. My co-investor Chris Lynch, another former entrepreneur who knows this space, and I look forward to surrounding the sqrrl team with some of the most qualified talent on the planet when it comes to distributed systems and Big Data.


* http://www.computerworld.com/s/article/9222758/The_Grill_Doug_Cutting
** http://www.facebook.com/notes/paul-yang/moving-an-elephant-large-scale-hadoop-data-migration-at-facebook/10150246275318920