Using Autocorrelation and Cross-correlation to Explore Links Between River Discharge and Regional Climate

Peter N. Adams
University of Florida

This activity was selected for the Teaching Computation in the Sciences Using MATLAB Exemplary Teaching Collection

Resources in this collection must have scored Exemplary or Very Good in all five review categories, and must also rate as “Exemplary” in at least three of the five categories. The five categories included in the peer review process are:

  • Computational, Quantitative, and Scientific Accuracy
  • Alignment of Learning Goals, Activities, and Assessments
  • Pedagogic Effectiveness
  • Robustness (usability and dependability of all components)
  • Completeness of the ActivitySheet web page

For more information about the peer review process itself, please see https://serc.carleton.edu/teaching_computation/materials/activity_review.html.



This page first made public: Oct 12, 2015

Summary

Students conduct autocorrelation and cross-correlation analyses on river discharge and climate indices to test the hypothesis that coastal streams draining mountainous terrain are strong indicators of climatic phenomena. Students must load the data, conduct the analyses, and plot the results by writing an efficient MATLAB script, and must "publish" their code as a well-organized, well-commented .pdf document (or as .html on a website).

Learning Goals

From this activity, students will learn how to: (1) examine time series data, (2) write and call a function that conducts autocorrelation analysis, and (3) conduct a cross-correlation analysis and interpret its results. Basic MATLAB commands are used to plot the data, and some functions are used to calculate correlation coefficients and statistical significance. Students practice higher-order thinking skills when they evaluate the results of the cross-correlation analysis in a geologic context. The MATLAB publishing tool is used to organize and present the results.

Context for Use

This activity is targeted at upper-level undergraduate and introductory-level graduate students in Earth Sciences. It is expected to be started in class, to develop confidence with the problem, and then completed outside of class within one week of being assigned. Students practice writing and commenting a MATLAB script, and gain familiarity with bivariate statistical methods and an introduction to time series, by writing code that conducts autocorrelation and cross-correlation from scratch. They are then encouraged to learn the MATLAB commands that run such analyses and to interpret their results in a climate-landscape context. Students use functions contained in the Statistics Toolbox. This activity is assigned late in the course because it requires programming skills expected to be developed earlier in the semester.

Description and Teaching Materials

The detailed activity description is provided in the attached file "TimeSeriesCorrelActivity_RiversAndClimate.pdf". In this two-part activity, students first conduct an autocorrelation on time series data of river discharge (runoff), which demonstrates a quantitative method for illustrating seasonality in a streamflow data set. After building their confidence in the first part (Autocorrelation), they conduct a cross-correlation between two of three data sets to test the hypothesis that the discharge of small coastal rivers is strongly influenced by regional climate. For each part of the activity, students must: (1) import data and correctly establish the temporal format, (2) call a program to properly conduct the appropriate analysis (autocorrelation or cross-correlation), and (3) display and interpret the results. The students must present the analysis as a well-documented MATLAB script that has been published to a .pdf document, or published as .html and uploaded to a website for public access.
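As a rough illustration of how autocorrelation exposes seasonality, the following minimal sketch correlates a series with lagged copies of itself using only base MATLAB. The synthetic sine-wave "discharge" and all variable names are assumptions made for illustration; this is not the contents of the activity's data files or of ser_corr_fcn.m.

```matlab
% Hypothetical sketch: lagged autocorrelation of a synthetic monthly series.
% The data and variable names are illustrative assumptions only.

nMonths = 240;                     % 20 years of monthly values
t = (1:nMonths)';                  % time index in months
q = 10 + 5*sin(2*pi*t/12);         % synthetic "discharge" with an annual cycle

maxLag = 36;                       % longest lag to evaluate, in months
r = zeros(maxLag + 1, 1);          % r(k+1) holds the correlation at lag k

for k = 0:maxLag
    x = q(1:end-k);                % series with the last k values dropped
    y = q(1+k:end);                % same series shifted forward by k months
    c = corrcoef(x, y);            % 2-by-2 correlation matrix (base MATLAB)
    r(k + 1) = c(1, 2);            % off-diagonal entry is Pearson's r
end
```

Plotting `r` against `0:maxLag` gives a correlogram. For a series with an annual cycle like this one, the coefficient should peak near lags of 12, 24, and 36 months and reach its most negative values near lags of 6, 18, and 30 months.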

Student Handout for Time Series Correlation Activity (Acrobat (PDF) 443kB Oct14 15)

Data files for Time Series Correlation Activity (Zip Archive 14kB Oct13 15)

ser_corr_fcn.m file for Time Series Correlation Activity (Matlab File 820bytes Oct13 15)

Teaching Notes and Tips

The main challenge in this assignment is to have the students successfully program a script, or better yet a function, that includes a loop to calculate the cross-correlation between two time series (Part B). To give them guidance, the students are shown an example of how this is done in Part A, via Gerry Middleton's code (ser_corr_fcn.m). To further help the students, it's recommended that the instructor review the concept of correlation from the "ground up", so the students can clearly see what the correlation code is calculating. MATLAB provides functions that perform these calculations (for example, xcorr in the Signal Processing Toolbox, and autocorr and crosscorr in the Econometrics Toolbox), but those toolboxes may not be accessible to all students.
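One possible shape for such a loop is sketched below, using only base MATLAB. It is an assumption-laden illustration rather than the handout's solution or Middleton's code: the two synthetic series, the built-in 3-sample delay, and every variable name are hypothetical.

```matlab
% Hypothetical sketch of a lagged cross-correlation loop in base MATLAB.
% The synthetic series and all variable names are illustrative assumptions.

n = 240;
t = (1:n)';
shift = 3;                                  % y is x delayed by 3 samples (by construction)
x = sin(2*pi*t/50) + 0.5*sin(2*pi*t/17);    % stands in for a climate index
y = sin(2*pi*(t - shift)/50) + 0.5*sin(2*pi*(t - shift)/17);  % stands in for discharge

maxLag = 12;
lags = (-maxLag:maxLag)';                   % negative and positive lags
r = zeros(size(lags));

for i = 1:numel(lags)
    k = lags(i);
    if k >= 0
        xs = x(1:end-k);                    % pairs x(t) with y(t+k)
        ys = y(1+k:end);
    else
        xs = x(1-k:end);                    % same pairing when k is negative
        ys = y(1:end+k);
    end
    c = corrcoef(xs, ys);                   % correlation over the overlapping samples
    r(i) = c(1, 2);
end

[rMax, iMax] = max(r);
bestLag = lags(iMax);                       % lag at which the correlation peaks
```

Because `y` was built as a delayed copy of `x`, the peak should fall at the known delay, which makes this a convenient check before students run the loop on real discharge and climate-index data. The loop body could equally be wrapped in a function that takes the two series and `maxLag` as inputs.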

Another major struggle for the students will be using statistical significance as a criterion for judging whether their analytical results are meaningful. This concept should be reviewed in the classroom.
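One simple criterion that could be used for classroom discussion is the large-sample 95% bound on a correlation coefficient, sketched below. This is a common approximation rather than necessarily the test specified in the handout, and it assumes the null hypothesis of uncorrelated, normally distributed series; the sample size and the example coefficient are hypothetical.

```matlab
% Hypothetical sketch: approximate 95% significance bounds for lagged correlations.
% Assumes uncorrelated, normally distributed series under the null hypothesis.
% Strongly autocorrelated data have fewer effective samples and need wider bounds.

n = 240;                          % assumed number of samples in each series
lags = (0:36)';                   % lags at which correlations were computed
nPairs = n - lags;                % overlapping pairs shrink as the lag grows
bound = 1.96 ./ sqrt(nPairs);     % large-sample two-sided 95% bound on |r|

rExample = 0.20;                            % illustrative coefficient at lag 12
isSignificant = abs(rExample) > bound(13);  % bound(13) corresponds to lag 12
```

Coefficients that fall inside plus or minus `bound` are not distinguishable from zero at roughly the 95% level under these assumptions, so drawing the bounds on the correlogram gives students a visual criterion for which lags merit interpretation.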

Lastly, I've found that having the students publish to .html is empowering, as it provides them with a mechanism for producing web-friendly content. It also provides a tidier way for the instructor to grade the assignment, by simply clicking on a link that the student has submitted electronically. However, I recommend working out the kinks of URL accessibility ahead of time; we've wasted a lot of time solving "permissions" issues, where a student might be the only user to whom the web content is visible.

Assessment

If the student produces code that: (1) runs cleanly, (2) produces figures that allow interpretation, and (3) is well-documented (commented), the student receives a passing grade. As with all MATLAB-type activities, there is not "one and only one correct way" of solving this assignment. The goal is to develop programming skills, so I encourage students to meet with me so that I can show them the process of debugging their code to converge on something that runs.

References and Resources

Four data sets are used in this assignment. The first is provided, after being obtained from the supporting materials for the book "Statistics and Data Analysis in Geology" by John C. Davis. This data set is freely available from the publisher on the following site:

http://www.kgs.ku.edu/Mathgeo/Books/Stat/index.html

The monthly data from the San Lorenzo River near Big Trees, CA (USGS Stn. 11160500) are available on the course website, but originally came from: http://waterdata.usgs.gov/nwis/nwisman/?site_no=11160500&agency_cd=USGS

The time series for the Multivariate ENSO Index (MEI) is available on the course website, but originally came from: http://www.cdc.noaa.gov/people/klaus.wolter/MEI/mei.html

The time series for the PDO Index Monthly Values is available on the course website, but originally came from: http://jisao.washington.edu/pdo/

The hypothesis to be tested was derived from the following paper by Milliman and Syvitski:
Milliman, J.D., and Syvitski, J.P.M. (1992) Geomorphic/tectonic control of sediment discharge to the ocean: The importance of small mountainous rivers. Journal of Geology, v. 100(5), p. 525-544.
