SC13 CSinParallel Workshop:

Using map-reduce to teach data-intensive scalable computing across the CS curriculum

Wednesday 1:30-5:00 pm, November 20, 2013

Dick Brown, St. Olaf College; Libby Shoop, Macalester College; Joel Adams, Calvin College

SC13 Map-Reduce Workshop Part 1 -- slides (Acrobat (PDF) 2.5MB Nov20 13)

Abstract

Map-reduce, the cornerstone computational framework for cloud computing applications, has star appeal to draw students to the study of parallelism. Participants will carry out hands-on exercises designed for students at CS1/intermediate/advanced levels that introduce data-intensive scalable computing concepts, using WebMapReduce (WMR), a simplified open-source interface to the widely used Hadoop map-reduce programming environment, and using Hadoop itself. These hands-on exercises enable students to perform data-intensive scalable computations carried out on the most widely deployed map-reduce framework, used by Facebook, Microsoft, Yahoo, and other companies. WMR supports programming in a choice of languages (including Java, Python, C++, C#, Scheme); participants will be able to try exercises with languages of their choice. Workshop includes brief introduction to direct Hadoop programming, and information about access to cluster resources supporting WMR. Workshop materials will reside on csinparallel.org, along with WMR software. Intended audience: CS instructors. Laptop required (Windows, Mac, or Linux).

Part 1 -- Fundamentals; Introductory Courses

This first half of the workshop introduces map-reduce computing through the WebMapReduce (WMR) simplified interface to Hadoop, then shares our experience teaching map-reduce and related concepts of parallel and distributed computing to students in introductory sequences.
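For readers new to the model, the following is a minimal sketch of the map-reduce pattern as a word count, written in plain Python. It does not use the WMR or Hadoop APIs: the names `mapper`, `reducer`, and `run_map_reduce` are illustrative assumptions, and the small in-memory driver only imitates the grouping step that a real framework performs across a cluster.

```python
from collections import defaultdict


def mapper(key, value):
    """Map step: emit a (word, 1) pair for each word in one line of text."""
    for word in value.lower().split():
        yield word, 1


def reducer(key, values):
    """Reduce step: sum all the counts collected for one word."""
    yield key, sum(values)


def run_map_reduce(records, mapper, reducer):
    """Hypothetical single-process driver (not part of WMR or Hadoop).

    It applies the mapper to every record, groups the emitted values by key
    (the "shuffle" a framework would do), then applies the reducer per key.
    """
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)

    results = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, groups[out_key]))
    return results


# Input records are (key, value) pairs; here the key is a line label.
lines = [("1", "the cat sat"), ("2", "the dog sat down")]
counts = run_map_reduce(lines, mapper, reducer)
print(counts)
```

The point of the split is that the mapper sees one record at a time and the reducer sees one key at a time, so a framework can, in principle, run many mappers and reducers in parallel on different machines.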

  1. Introductory presentation (1:30pm)

  2. Hands-on exercises (2:10pm)

  3. Break (3:00pm)

    Note: An SSH client will be needed for the hands-on exercise in the second half.
      • Macintosh and Linux users -- included with standard setup
      • Windows users -- available applications include PuTTY and WinSCP

Part 2 -- Intermediate and Advanced Courses

This part of the workshop uses WMR to explore use of map-reduce computing in more advanced courses, and examines the relationship between the WMR interface and the Hadoop computations it performs.

  1. More on WebMapReduce (3:30pm)

  2. Using Hadoop directly (3:45pm)
  3. Applications (4:20pm)
    • Use of map-reduce in undergraduate research projects -- examples
    • "Big data": What is it? Map-reduce vs. databases, structured vs. unstructured data.
  4. Discussion and feedback (4:35pm)

    Please complete our short survey for grant assessment purposes.