Database vendors add Google's MapReduce
Greenplum and Aster Data Systems will support Google's programming technique, developed for parallel processing of large data sets across commodity hardware
Greenplum and Aster Data Systems, two startups involved in large-scale data analysis, announced this week that their products will support MapReduce, a programming technique originally developed by Google for parallel processing of large data sets across commodity hardware.
Software developers tend to be more comfortable with languages such as Java and C++ than with the database language SQL, said Mayank Bawa, co-founder and CEO of Aster, maker of a cluster database system that splits workloads into multiple discrete tiers.
"Most developers struggle with the nuances of making a database dance well to their directions," he wrote in a blog post. "Indeed, a SQL maestro is required to perform interesting queries for data transformations (during ETL processing or Extract-Load-Transform processing) or data mining (during analytics)."
Enter MapReduce, the goal of which was to provide a "trivially parallelizable framework so that even novice developers (a.k.a interns) could write programs in a variety of languages (Java/C/C++/Perl/Python) to analyze data independent of scale," Bawa wrote.
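To make the idea concrete, here is a minimal single-process sketch of the MapReduce pattern in Python, one of the languages Bawa lists. It is an illustration only, not Aster's, Greenplum's, or Google's implementation: the names map_words, reduce_counts, and map_reduce are hypothetical, and a real framework would run the map and reduce steps in parallel across many machines rather than in one loop.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: combine all values emitted for a single key."""
    return sum(counts)


def map_reduce(records, mapper, reducer):
    """Hypothetical in-memory driver; a real framework distributes this work."""
    groups = defaultdict(list)
    # Map phase plus "shuffle": group every emitted value under its key.
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: each key is independent, which is what makes it parallelizable.
    return {key: reducer(key, values) for key, values in groups.items()}


lines = ["the quick brown fox", "the lazy dog", "the fox"]
word_counts = map_reduce(lines, map_words, reduce_counts)
print(word_counts)
```

The developer writes only the two small functions; grouping, distribution, and scaling are left to the framework, which is the appeal Bawa describes.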
Meanwhile, Greenplum, maker of a database it says can scale to a petabyte of information, said this week that a MapReduce framework will be part of its dataflow engine as of September.
The twin announcements brought a nod of approval from one close observer of the database world.
"On its own, MapReduce can do a lot of important work in data manipulation and analysis. Integrating it with SQL should just increase its applicability and power," wrote Curt Monash of Monash Research, on the DBMS2 blog.
"MapReduce isn't needed for tabular data management. That's been efficiently parallelized in other ways," he added. "But if you want to build non-tabular structures such as text indexes or graphs, MapReduce turns out to be a big help."
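As an illustration of the kind of non-tabular structure Monash mentions, the following Python sketch builds a small text index (a map from each term to the documents that contain it) in the map-and-reduce style. It is a single-process sketch under stated assumptions: the function names and the sample documents are invented for this example and do not come from either vendor's product.

```python
from collections import defaultdict


def map_terms(doc_id, text):
    """Map step: emit (term, doc_id) once for each distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def reduce_postings(term, doc_ids):
    """Reduce step: turn the document ids collected for a term into a sorted list."""
    return sorted(doc_ids)


def build_index(docs):
    """Hypothetical driver that runs both steps in memory on one machine."""
    shuffled = defaultdict(list)
    for doc_id, text in docs.items():
        for term, posting in map_terms(doc_id, text):
            shuffled[term].append(posting)
    return {term: reduce_postings(term, ids) for term, ids in shuffled.items()}


# Made-up sample documents, keyed by document id.
docs = {
    1: "MapReduce on commodity hardware",
    2: "SQL and MapReduce",
    3: "commodity SQL databases",
}
index = build_index(docs)
print(index["mapreduce"])
```

The output is a nested, variable-length structure rather than a fixed set of columns, which hints at why such work is less natural to express in plain SQL.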