SC13 CSinParallel Workshop:
Using map-reduce to teach data-intensive scalable computing across the CS curriculum
Wednesday 1:30-5:00 pm, November 20, 2013
Dick Brown, St. Olaf College; Libby Shoop, Macalester College; Joel Adams, Calvin College
SC13 Map-Reduce Workshop Part 1 slides (PDF, 2.5 MB, Nov 20, 2013)
Map-reduce, the cornerstone computational framework for cloud computing applications, has star appeal to draw students to the study of parallelism. Participants will carry out hands-on exercises designed for students at CS1, intermediate, and advanced levels that introduce data-intensive scalable computing concepts, using WebMapReduce (WMR), a simplified open-source interface to the widely used Hadoop map-reduce programming environment, and using Hadoop itself. These hands-on exercises enable students to perform data-intensive scalable computations on the most widely deployed map-reduce framework, used by Facebook, Microsoft, Yahoo, and other companies. WMR supports programming in a choice of languages (including Java, Python, C++, C#, Scheme); participants will be able to try exercises with languages of their choice. The workshop includes a brief introduction to direct Hadoop programming and information about access to cluster resources supporting WMR. Workshop materials will reside on csinparallel.org, along with WMR software. Intended audience: CS instructors. Laptop required (Windows, Mac, or Linux).
Part 1 – Fundamentals; Introductory Courses
This first half of the workshop introduces map-reduce computing through the WebMapReduce (WMR) simplified interface to Hadoop, then shares our experience teaching map-reduce and related concepts of parallel and distributed computing to students in introductory sequences.
Introductory presentation (1:30pm)
Goals of the workshop.
Demonstration of WebMapReduce (WMR)
Teaching map-reduce with WMR in the introductory sequence:
materials, teaching with frameworks, strategies, and experience
Hands-on exercises (2:10pm)
Getting started with WMR
Resources: Intro to WMR module; see Using WMR, then Counting words with WMR (Python)
Data sets on HDFS: /shared/gutenberg/CompleteShakespeare.txt, AnnaKarenina.txt, WarAndPeace.txt; /shared/gutenberg/all/group8
Alternative explorations: WMR code examples in various languages
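The word-counting exercise named above follows the standard map-reduce pattern: a mapper emits a (word, 1) pair for every word, and a reducer sums the values collected for each word. The following is a minimal sketch in plain Python that runs on a laptop without a cluster. The names `mapper`, `reducer`, and `run_job` are assumptions made for this illustration and are not the WMR API; WMR provides its own way to emit pairs and lets Hadoop do the grouping that `run_job` imitates here.

```python
from collections import defaultdict


def mapper(line):
    """Map step: yield a (word, 1) pair for each word in one line of text."""
    for word in line.split():
        yield word.lower(), 1


def reducer(word, counts):
    """Reduce step: yield the word with the sum of all its counts."""
    yield word, sum(counts)


def run_job(lines):
    """Hypothetical local driver: imitates the grouping done by the framework."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)  # collect every value emitted for a key

    results = {}
    for key in sorted(groups):  # the reducer sees each key exactly once
        for out_key, out_value in reducer(key, groups[key]):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    sample = ["to be or not to be", "To thine own self be true"]
    for word, count in run_job(sample).items():
        print(word, count)
```

On a real system the input would be one of the HDFS data sets listed above rather than an in-memory list, and the mapper calls would run in parallel across the cluster.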
Note: An SSH client will be needed for the hands-on exercise in the second half.
- Macintosh and Linux users – included with standard setup
- Windows users – available applications include PuTTY and WinSCP.
Part 2 – Intermediate and Advanced Courses
This part of the workshop uses WMR to explore use of map-reduce computing in more advanced courses, and examines the relationship between the WMR interface and the Hadoop computations it performs.
More on WebMapReduce (3:30pm)
WebMapReduce and its architecture; obtaining and installing WMR
Examples: using WMR and map-reduce in upper-division (undergraduate) courses.
Resources: Module, Concurrency and Map-Reduce Strategies in Various Programming Languages
Using Hadoop directly (3:45pm)
Overview of the Hadoop implementation of map-reduce.
Examples of Hadoop code
Hands-on exercises (25 min)
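To make the relationship between a simplified interface and the underlying Hadoop computation more concrete, here is a rough sketch of the map, sort, and reduce phases in the style of Hadoop Streaming, where records travel between phases as tab-separated text lines and the reducer receives them sorted by key. This is not the workshop's own example code (which may well have been written in Java); the function names `map_lines`, `reduce_sorted`, and `simulate` are hypothetical, and the `sorted` call only stands in for the sort that Hadoop performs between the phases.

```python
from itertools import groupby


def map_lines(lines):
    """Map phase: turn each input line into 'word<TAB>1' text records."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_sorted(records):
    """Reduce phase: sum counts over runs of records that share a key.

    Assumes the records arrive sorted by key, so equal keys are adjacent.
    """
    parsed = (record.split("\t", 1) for record in records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def simulate(lines):
    """Hypothetical local stand-in for the framework: map, sort, reduce."""
    mapped = list(map_lines(lines))
    shuffled = sorted(mapped)  # imitates the sort between map and reduce
    return list(reduce_sorted(shuffled))


if __name__ == "__main__":
    for record in simulate(["b a", "a c a"]):
        print(record)
```

Unlike an interface that hands the reducer one key together with all of its values, the reducer in this sketch has to detect key boundaries in a sorted stream itself, which is the kind of detail a simplified interface can hide from introductory students.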
Use of map-reduce in undergraduate research projects: examples
"Big data": what is it? Map-reduce vs. databases; structured vs. unstructured data.
Discussion and feedback (4:35pm)
Please complete our own short survey for grant assessment purposes.