Quantitative Data Analysis (GLY3455)

Scott T. Marshall
Appalachian State University

Summary

Modern Earth and environmental scientists deal with complex and often very large quantitative data sets that are typically not useful or understandable in raw form. The main goal of this course is to provide a computational and quantitative skill set relevant for processing, filtering, analyzing, and visualizing quantitative Earth science data efficiently and accurately.


Course URL: http://www.appstate.edu/~marshallst/GLY3455/
Course Size:
15-30

Course Format:
Integrated lecture and lab

Institution Type:
University with graduate programs, primarily masters programs

Course Context:

This is a mid- to upper-level undergraduate course with prerequisites of a sophomore-level geology class, calculus-based physics, and calculus. The class teaches programming skills, and no previous coding experience is required. All coding is done in MATLAB. The course is cross-listed in our environmental science program. The class is currently required for our "Quantitative Geoscience" degree track, and it is an elective for all of our other degree tracks.

Course Content:

The main goal of this course is to provide a computational and quantitative skill set relevant for processing, filtering, analyzing, and visualizing quantitative Earth science data efficiently and accurately. By completing this course, students will gain experience in the basics of computer programming, data visualization, and mathematical principles relevant to Earth sciences. Students will learn to make their own custom tools that automate computation and visualization tasks so that a problem need only be solved once. While much of the lecture content will focus on the programming basics necessary to utilize MATLAB, the lab will focus on Earth science applications of programming. The overarching goal is that the course will demonstrate the wide applicability of computation in the Earth sciences and provide students with the confidence to pursue quantitative research projects during their academic and professional careers.

Course Goals:

After this course, students should be able to:
-handle data, regardless of format or size.
-automate common data analysis tasks, such as visualizing and filtering data.
-recognize that coding is relevant and highly valuable to Earth scientists in all disciplines.

Course Features:

Lectures introduce programming syntax and style that are both correct and efficient.
Lab activities all use some type of geologic data set that students must write code to process. This is where students see the relevance of the coding skills.

Course Philosophy:

I just chose a design that I thought would be effective. The key is that this type of class needs a lab setting, so students can interact with the instructor, and the instructor can create code on the fly. This allows students to see that even

Assessment:

Students who do well in the class should be at least competent at basic data analysis. The more important measure is that we now have a handful of students, working with faculty from different subdisciplines, who use MATLAB extensively in their independent research. It has really changed the research capability of our students. The difference is dramatic.

Syllabus:

Course Syllabus (Acrobat (PDF) 571kB Oct12 15)

Teaching Materials:


References and Notes:

Text - "MATLAB: A Practical Introduction to Programming and Problem Solving" By: Stormy Attaway
I chose this text because it is well written and essentially acts as an introduction to writing basic algorithms in MATLAB. I have experimented with geology-specific texts, but they are too discipline specific and are better geared towards students who are already experienced coders. For example, one of the geology-specific texts I used in the past did a MATLAB intro in one chapter that essentially covered everything in my whole course. Students need to be walked through the basics at a slow pace, or they will get frustrated.
My advice: keep the coding very basic and use easy-to-understand but large datasets in your assignments.

I use a few peer-reviewed papers. They are listed on my website.
I use these papers when they present a dataset that we can also process ourselves to reach a result similar to that of the original authors.

I prefer open-source (and free) languages, but my colleagues encouraged me to stick with MATLAB since it is very widely used in Earth sciences and is relatively user friendly.
I do not use any other languages in this class; I do everything in MATLAB. I think it is bad to mix different languages in an intro coding class. In a second course, I could see the utility of using several different languages, but not in an intro class.