Have a module of your own?
Contribute to the site by submitting your own module. Your submission will be reviewed by CS In Parallel to determine what categories it should be listed under. After that process, it will become available to all viewers of this site.
The Module Collection
Possible Course Use: showing only High-Performance Computing
Results 1 - 14 of 14 matches
Multi-core programming with Intel's Manycore Testing Lab (using Threading Building Blocks)
Professor Richard Brown, St. Olaf College
Intel Corporation has set up a special remote system, called the Manycore Testing Lab (MTL), that allows faculty and students to work with computers that have many cores. In this lab, we will create a program that intentionally uses multi-core parallelism, upload and run it on the MTL, and explore the issues in parallelism and concurrency that arise.
Drug Design Exemplar
An important problem in the biological sciences is that of drug design: finding small molecules, called ligands, that are good candidates for use as drugs. We introduce the problem and provide several different parallel solutions, in the context of parallel program design patterns.
Elizabeth Shoop; Yu Zhao
In this module, we will learn how to create programs that intentionally use the GPU for execution. More specifically, we will learn how to solve parallel problems more efficiently by writing programs in the CUDA C programming language and then executing them on GPUs based on the CUDA architecture.
Distributed Computing Fundamentals
Message Passing Interface (MPI) is a programming model widely used for parallel programming on a cluster. Using MPI, programmers can divide large data into segments, perform the same computing task on each segment, and distribute those tasks to multiple processing units within the cluster. In this module, we will learn important and common MPI functions as well as techniques used in 'distributed memory' programming on clusters of networked computers.
Message Passing Interface (MPI) is a programming model widely used for parallel programming on a cluster. NVIDIA®'s CUDA, a parallel computing platform and programming model, uses GPUs for parallel computation problems. This module will explore ways to combine these two parallel computing platforms to make parallel computation more efficient.
Concept: Data Decomposition Pattern
This module consists of reading material and code examples that depict the data decomposition pattern in parallel programming, using a small example of vector addition (sometimes called the "Hello, World" of parallel programming).
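As a rough illustration of the pattern (not code taken from the module itself), the sketch below splits a vector addition into contiguous index blocks and hands each block to a worker thread. The function names `chunk_bounds`, `add_chunk`, and `parallel_vector_add` are hypothetical, and Python is used only for brevity. In CPython, threads are unlikely to speed up this arithmetic, so the example shows how the data is divided rather than any performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, workers):
    """Split range(n) into `workers` contiguous, nearly equal [start, stop) blocks."""
    base, extra = divmod(n, workers)
    bounds = []
    start = 0
    for w in range(workers):
        # The first `extra` workers each take one additional element.
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def add_chunk(a, b, c, start, stop):
    """Compute c[i] = a[i] + b[i] for one block of indices."""
    for i in range(start, stop):
        c[i] = a[i] + b[i]


def parallel_vector_add(a, b, workers=4):
    """Add two equal-length vectors, one index block per worker."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    c = [0] * len(a)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(add_chunk, a, b, c, start, stop)
            for start, stop in chunk_bounds(len(a), workers)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return c
```

Each worker writes to a disjoint slice of `c`, so no locking is needed. That independence is what makes vector addition a convenient first example of data decomposition.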
Visualize Numerical Integration
This is an activity with working code supplied that enables students to see how various forms of the data decomposition pattern map processing units to computations.
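To make "mapping processing units to computations" concrete, here is a minimal sketch, assuming a midpoint-rule integration of 4/(1+x²) over [0, 1], which converges to π. It is not the activity's supplied code. The helper names `block_indices` and `cyclic_indices` are hypothetical, and the workers are simulated by a sequential loop so that only the index assignment is in focus.

```python
import math


def f(x):
    """Integrand whose integral over [0, 1] equals pi."""
    return 4.0 / (1.0 + x * x)


def block_indices(n, workers, w):
    """Worker w gets one contiguous block of rectangle indices."""
    base, extra = divmod(n, workers)
    start = w * base + min(w, extra)
    stop = start + base + (1 if w < extra else 0)
    return range(start, stop)


def cyclic_indices(n, workers, w):
    """Worker w gets every `workers`-th rectangle, starting at index w."""
    return range(w, n, workers)


def partial_sum(indices, a, h):
    """Midpoint-rule area of the rectangles assigned to one worker."""
    return sum(f(a + (i + 0.5) * h) for i in indices) * h


def integrate(n, workers, mapping, a=0.0, b=1.0):
    """Sum the per-worker partial areas (workers simulated sequentially)."""
    h = (b - a) / n
    return sum(partial_sum(mapping(n, workers, w), a, h) for w in range(workers))


def error(n, workers, mapping):
    """Absolute difference between the estimate and math.pi."""
    return abs(integrate(n, workers, mapping) - math.pi)
```

With 10 rectangles and 3 workers, `block_indices` gives worker 0 the indices 0 to 3, while `cyclic_indices` gives it 0, 3, 6, 9. Both mappings cover every rectangle exactly once, so they produce the same estimate up to floating-point rounding.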
WMR Exemplar: UK Traffic Incidents
Using data published by the United Kingdom Department for Transport about traffic incidents, students can explore and perform analyses using map-reduce techniques.
WMR Exemplar: Flixster network data
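For readers new to the technique, the following is a minimal in-memory sketch of a map-reduce count, written in plain Python rather than against the WebMapReduce interface. The record layout (`id,severity,year`) and the sample rows in the usage note are invented for illustration and do not reflect the actual dataset.

```python
from collections import defaultdict


def mapper(line):
    """Emit (severity, 1) for one record; assumes a made-up 'id,severity,year' layout."""
    fields = line.split(",")
    yield fields[1], 1


def reducer(key, values):
    """Combine all counts emitted for one key."""
    return key, sum(values)


def map_reduce(lines):
    """Run map, group by key (the 'shuffle' step), then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))
```

Calling `map_reduce(["1,slight,2010", "2,serious,2010", "3,slight,2011"])` returns `{"serious": 1, "slight": 2}`. In a real Hadoop-style system the map and reduce calls would run on many machines, with the framework handling the grouping step.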
The exercises in this module use a network of friendships on the social movie recommendation site Flixster. Students will use it to learn how to analyze networks and chain jobs, using the WebMapReduce interface.
WMR Exemplar: LastFM million-song dataset
This module demonstrates how Hadoop and WMR can be used to analyze the LastFM million-song dataset. It incorporates several advanced Hadoop techniques, such as job chaining and multiple inputs.
Instructor Example: Optimizing CUDA for GPU Architecture
This module, designed for instructors to use as an example, explains how to take advantage of the CUDA GPU architecture to maximize the speedup of CUDA applications, using a Mandelbrot set generator as the case study.
Patternlets in Parallel Programming
Material originally created by Joel Adams, Calvin College. Compiled by Libby Shoop, Macalester College.
Short, simple C programming examples of basic shared memory programming patterns using OpenMP and basic distributed memory patterns using MPI.
Sequential and parallel versions of a Monte Carlo simulation of the spread of infectious disease are presented in detail. Students can run the code and examine performance of sequential and parallel versions.
Timing Operations in CUDA
Joel Adams, Calvin College, and Jeffrey Lyman, Macalester College
By completing vector addition, multiplication, square root, and squaring programs, students will gain an understanding of when the speedup from GPU execution outweighs the overhead of creating threads and copying memory.