When we last left Oracle’s big data plans, there was definitely a missing piece. Oracle’s Big Data Appliance as initially disclosed at last fall’s OpenWorld was a vague plan that appeared to be positioned primarily as an appliance that would accompany and feed data to Exadata. Oracle did specify some utilities, such as an enterprise version of the open source R statistical processing program that was designed for multithreaded execution, plus a distribution of a NoSQL database based on Oracle’s BerkeleyDB as an alternative to Apache Hive. But the emphasis appeared to be extraction and transformation of data for Exadata via Oracle’s own utilities that were optimized for its platform. With its announcement of general availability of the Big Data Appliance, Oracle is filling in the blanks. As such, Oracle’s plan for Hadoop was competition, not for Cloudera (or Horto... (more)

Big Moves in Big Data: EMC's Hadoop Strategy

To date, Big Storage has been locked out of Big Data. It’s been all about direct attached storage for several reasons. First, Advanced SQL players have typically optimized their architectures through data structure (columnar storage), unique compression algorithms, and liberal use of caching to juice response over hundreds of terabytes. For the NoSQL side, it’s been about cheap, cheap, cheap along the Internet data center model: have lots of commodity stuff and scale it out. Hadoop was engineered exactly for such an architecture; rather than speed, it was optimized for sheer linear scale.... (more)

Informatica's Stretch Goal

Informatica is within a year or two of becoming a $1 billion company, and CEO Sohaib Abbasi’s stretch goal is to get to $3 billion. Informatica has been on a decent tear. It’s had a string of roughly 30 consecutive growth quarters, growth over the last 6 years averaging 20%, and 2011 revenues nearing $800 million. Abbasi took charge back in 2004, lifting Informatica out of its midlife crisis by ditching an abortive foray into analytic applications, instead expanding from the company’s data transformation roots to data integration. Getting the company to its current level came largely through a seri... (more)

Another Vote for the Apache Hadoop Stack

As we’ve noted previously, the measure of success of an open source stack is the degree to which the target remains intact. That comes either through a captive open source project, where a vendor unilaterally open sources its code (typically hosting the project) to promote adoption, or through a community model, where a neutral industry body hosts the project and gains support from a diverse cross section of vendors and advanced developers. In that case, the goal is getting the formal standard to also become the de facto standard. The most successful open source projects are those t... (more)

Big Data Consolidation Enters Home Stretch as Teradata Buys Aster Data

At this point, probably at least 90 percent of analytic systems/data warehouses are easily contained within the SQL-based technologies that are commercially available today. We’ll take that argument a step further: Most enterprise data warehouses are less than 5 terabytes. So why then all the excitement about big data, and why are acquisitions in this field becoming almost a biweekly thing? To refresh the memory, barely a couple of weeks back, HP announced its intention to buy Vertica. And this morning came the news that Teradata is buying the other 89 percent of Aster Data ... (more)