Reports are circulating that Yahoo will finally announce its commercial Hadoop spinoff company to compete with Cloudera. In a nod to the competition, Yahoo is supposed to call the new company HortonWorks, after the elephant in the Dr. Seuss classic, Horton Hears a Who.

Hadoop Returns to Its Origins

As the amount of data being produced has grown from megabytes to zettabytes, the big data processing framework Hadoop has grown in popularity. That growth has created a new market for Hadoop services, products and distributions, with Cloudera being one of the largest providers. Yahoo hopes to position HortonWorks as the preferred vendor for production Hadoop distributions and to make the Apache distribution the preferred distribution for the core software. We reported back in May that Yahoo had stopped development on its own Hadoop distribution and was instead focusing its efforts on development of the Apache distribution.

Yahoo created Hadoop technology and has contributed 70% of the code base to the Apache project, although the project’s originator, Doug Cutting, now works for Cloudera. Yahoo’s entry into the market is a natural move for the company and should have a large impact on how the market evolves. The spinoff, HortonWorks, will consist of a small number of Yahoo engineers who will concentrate on creating a production-ready release based on Apache Hadoop. The HortonWorks Hadoop distribution will include new features on top of the core, such as management tools, aimed at making Hadoop easier to use. It is believed that the value-added tools developed by HortonWorks will be open source and that Yahoo will work closely with Apache as the product develops.

The Hadoop Market

The Hadoop market has grown substantially. There seems to be a constant stream of new entrants, such as DataStax and EMC, offering new services. Yahoo’s entry into the commercial Hadoop space will definitely increase the competition. Despite growing levels of adoption, the Hadoop market is still relatively small, but many project that it will grow into a multi-billion dollar opportunity, with growing levels of competition from new big data alternatives.

The announcement of the Yahoo spinoff should come today or tomorrow if reports are correct. How do you think the entry of Yahoo into the Hadoop space will affect the market? We would love to hear your thoughts.