Monday, February 6, 2012

Solving the Data Problem in a Big Way

I recently joined Hortonworks as VP of Corporate Strategy, and I wanted to share what attracted me to the company.

For me, it’s important to 1) work with a top-notch team and 2) focus on unique market-changing business opportunities.

Hortonworks has a strong team of technical founders (Eric14, Alan, Arun, Devaraj, Mahadev, Owen, Sanjay, and Suresh) doing impressive work within the Apache Hadoop community. Hortonworks also has an impressive Board of Directors that includes folks like Peter Fenton, Mike Volpi, Jay Rossiter, and Rob Bearden, as well as our most recent board member, Paul Cormier (Red Hat’s President of Products and Technology).

So “top-notch team”? Check!

Regarding “unique market-changing business opportunities,” the top three technology areas right now are arguably Mobility, Cloud, and Big Data. Apache Hadoop is clearly a technology in the Big Data category, and it is enabling a new approach to data processing, both in its capabilities and in its economics.

I’ve spent the last few years in the Cloud space (at SpringSource and VMware), where I met with many customers who loved VMware’s Cloud Application Platform vision. One common question kept coming up, however:

“What are you going to do about the data problem?”

Traditional application architectures focus on moving structured data from backend datastores to the applications that need it. Elastic Caching Platforms such as VMware’s vFabric GemFire help address scalability and latency issues for these types of applications.

Hadoop turns that model around: rather than moving data to applications, it provides a platform that cost-effectively stores petabytes of data and enables application logic to execute directly on that data in a massively parallel manner.
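To make that concrete, here is a minimal sketch of the classic MapReduce word-count job written against Hadoop’s Java API (the class name and input/output paths here are illustrative, not anything Hortonworks-specific). The thing to notice is that the map and reduce logic is packaged up and shipped out to the nodes that hold the data blocks, rather than the data being hauled across the network to the application:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // The map logic runs on the nodes that hold the input blocks:
  // the code travels to the data, not the other way around.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);  // emit (word, 1) for each token
        }
      }
    }
  }

  // The reduce logic aggregates the partial counts emitted by the mappers.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);  // emit (word, total count)
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation on each node
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // e.g. an HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory (must not exist)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

You would run it with something like hadoop jar wordcount.jar WordCount <input dir> <output dir>, and the framework takes care of scheduling the map tasks close to the HDFS blocks they read.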

I believe Hadoop provides a very compelling solution to the “data problem” because it is explicitly designed to handle the volume, velocity, variety, and exponential scale of the unstructured and semi-structured data that businesses increasingly need to deal with. Moreover, Hadoop does this within an economic model, built on commodity servers and storage, that makes the platform useful for a wide range of problems.

While 2011 was the year when a critical mass of enterprise customers and vendors began to realize the size and scope of the opportunity and value behind the Apache Hadoop phenomenon, the wave is just getting started, and I’m excited to be a part of the fun!