MapReduce

Introduction

"MapReduce
 is 
a 
software 
framework 
for 
processing
(large1) 
data
sets 
in 
a
 distributed
 fashion
 over 

several
 machines. 

The 
core 
idea
 behind
 MapReduce 
is
 mapping
 your 
data
set 
into 
a 
collection 
of 

pairs 
and 
then 
reducing
 over
 all 
pairs
with 
the 
same
 key."

Diana 
MacLean 
for 
CS448G,
2011

Use Cases

  • Distributed
 sort

  • Distributed
 search

  • Web‐link 
graph
 traversal

  • Machine 
learning

Slides

slides - pdf

slides - pptx

Last updated