- SQL commands are not used as query API (Examples of APIs used include JSON, BSON, etc.)
- Doesn't guarantee atomic operations.
- Distributed and horizontally scalable.
- It doesn't have to predefine schemas. (Non-Schema)
- Non-tabular data storing (eg; key-value, object, graphs, etc).
Map-Reduce in Action
- The MapReduce engine may invoke reduce functions iteratively; thus; these functions must be idempotent. That is, the following must hold for your reduce function:
for all k,vals : reduce( k, [reduce(k,vals)] ) == reduce(k,vals)
- Currently, the return value from a reduce function cannot be an array (it's typically an object or a number)
- If you need to perform an operation only once, use a finalize function.
|Obama's Speech in 2009|
All code used in this article can be download here.
My next posts will be about performance evaluation on machine learning techniques. Wait for news!