Time Series Modeling and Prediction of Environmental Data

This module was developed by Lofton, M.E., M.R. Hipsey, K. Kurucz, and C.C. Carey. 8 August 2025. Macrosystems EDDIE: Time Series Modeling and Prediction of Environmental Data. Macrosystems EDDIE Module 11, Version 1. https://serc.carleton.edu/eddie/teaching_materials/modules/module11.html. Module development was supported by NSF grants EF 2318861, DEB 2213550, OISE 2330211.

Initial Publication Date: August 8, 2025

Summary

Advances in environmental sensor technology in recent decades are enabling the collection of environmental data at high temporal frequencies (e.g., every 10 minutes) across many ecosystems. These time series of environmental data can be used to gain information about previous and current conditions of ecosystems, as well as to make predictions about ecosystem conditions in the future. Researchers and managers commonly apply a range of time series models to understand the complex patterns that can occur in high-frequency environmental time series data. These models use statistical and machine learning methods to identify signals in the data.

In this module, students will apply several different time series and machine learning models to explore, analyze, and interpret environmental data. They will explore data from an environmental case study of their choice, choose which environmental variables to use to fit a time series model, assess the model, and apply the model to make predictions of future ecosystem conditions. Then, students will process a new dataset into a standardized format, upload it into the module's R Shiny App, and fit several other models to compare predictive performance across models. Students will also evaluate the ecological understanding that can be drawn from each model (e.g., which driver variables are important for explaining the dynamics of the target environmental variable being predicted).

Module focal question: How can we use time series models to understand and predict ecosystem conditions?

The overarching goal of this module is for students to learn fundamental concepts of time series modeling and to fit and assess time series models using environmental data. Students will work with an R Shiny App interface to visualize data, fit a model, assess the model, and then fit and compare several additional environmental models. The A-B-C structure of this module makes it flexible and adaptable to a range of student levels and course structures.


Learning Goals

By the end of this module, students will be able to:

- Explore relationships between ecological variables from environmental case studies.
- Understand the structure of four time series models (including machine learning models) that are commonly applied in environmental science.
- Fit time series models using environmental data and assess the importance of ecological driver variables in making model predictions.
- Process environmental datasets into standardized formats suitable for training and assessing time series models.
- Compare multiple time series models to assess their performance on out-of-sample predictions.

Context for Use

This entire module can be completed in one 3-hour lab period or three 60-minute lecture periods for introductory undergraduate students in Ecology, Environmental Science, Ecological Modeling, and Quantitative Ecology classes. After the introductory 30-minute presentation, we found that students completed Activity A in about 1 hour, with Activities B and C taking about 20-30 minutes each. However, if the instructor chooses to have students work with a dataset that is not provided within the module for Activity B, the instructor should allocate adequate additional time for students to learn about and process this dataset into a standardized format.

Description and Teaching Materials

Quick overview of the activities in this module

See the instructor manual, provided below, for a step-by-step guide for carrying out this module. A student handout describing Activities A, B, and C and an instructor PowerPoint are also provided.

  • Activity A: Students visualize data from a selected environmental data case study, and fit and assess a time series model.
  • Activity B: Students choose a new environmental dataset or upload their own dataset and fit and assess a time series model.
  • Activity C: Students fit additional time series models to the environmental dataset used in Activity B and compare performance across models.

Workflow for this module:

  1. Instructor chooses a method for accessing the Shiny app (regardless of which method you pick, all module activities are the same!):
    1. In any internet browser, go to: https://macrosystemseddie.shinyapps.io/module11/
      1. This option works well if there are not too many simultaneous users (<20)
      2. The app generally takes several seconds to load and requires consistent internet access
      3. Remind students to save their work as they go, because this webpage will time out after 15 minutes of inactivity. It is frustrating for students to lose their progress, so a good rule of thumb is to have them save their progress after completing each Shiny App activity objective
      4. Note that for this module, progress-saving is only available for Activity A, because students upload their own data to use for Activities B and C, and we are unable to store users' data within the app. Students and instructors should plan to allocate 45-60 minutes to complete Activities B and C or be prepared to re-upload their data and regenerate plots within the app.
    2. The most stable option for large classes is to download the app and run it locally; see instructions at: https://github.com/MacrosystemsEDDIE/module11
      1. Once the app is downloaded and installed (which requires an internet connection), the app can be run offline locally on students' computers
      2. This step requires R and RStudio to be installed on a student's computer, which may be challenging if a student does not have much R experience (but an instructor could do this prior to instruction in a shared computer lab)
      3. If you are teaching the module to a large class and/or have unstable internet, this is the best option
  2. Give students their module report ahead of time so they can read it before class (or ask them to download it from the Introduction page of the app at https://macrosystemseddie.shinyapps.io/module11/), or distribute reports when students arrive to class. The report includes optional pre-class readings and a short pre-class activity introducing some current examples of ecological forecasts. Students will also use the module report to answer the questions embedded throughout the R Shiny App, and the completed report can be submitted to the instructor for grading if desired.
  3. Instructor gives a brief PowerPoint presentation that introduces time series modeling and the environmental case studies which students can explore during the module (~30 mins).
  4. After the presentation, the students divide into pairs. Each pair selects their own environmental case study and visualizes their case study's data, which is used to fit and assess a time series model (Activity A). The two students within a pair each build their own model with unique inputs and parameters to compare the performance of two different models for the same ecosystem. For virtual instruction, we recommend putting two pairs together (n=4 students) into separate Zoom breakout rooms during this activity so the two pairs can compare results.
  5. The instructor then introduces Activities B and C, potentially revisiting some of the slides from the introductory presentation as a reminder to students about the next steps. For virtual instruction, this would entail having the students come back to the main Zoom room for a short check-in.
  6. The students work in their pairs to process and upload data in a standardized format and fit a new time series model to their uploaded data (Activity B). Note: the amount of time required for data wrangling may vary widely depending on the datasets the instructor chooses to use for the module. The datasets provided within the module require minimal data wrangling that can be done in Microsoft Excel (~5-10 minutes depending on student experience). However, instructors should plan to allocate additional time if they are asking students to do more extensive data wrangling prior to uploading their own data in Activity B. For virtual instruction, we recommend putting the two pairs back into the same Zoom breakout rooms. Optionally, instructors may bring the class back together at the end of Activity B to discuss performance of students' time series models before beginning Activity C.
  7. Student pairs then fit additional time series models to their uploaded datasets and compare performance across models on out-of-sample predictions (Activity C). The students work together in groups to present the results from their dataset and different models to the rest of the class. The class may discuss why the models perform similarly or differently among the different sites, as well as the skill of more complex models (with environmental drivers) vs. simple models at various sites.
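
As a rough illustration of the kind of data wrangling described in step 6, the sketch below tidies a small raw table using only the Python standard library. Everything specific here is an assumption made for illustration: the raw column names, the day/month/year date format, the output column names (`datetime`, `water_temp_c`, `chla_ugl`), and the choice to drop incomplete rows. The format actually required by the R Shiny App is defined in the module materials and may differ; students can do the equivalent steps in Microsoft Excel or R.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw export: day/month/year dates, unsorted rows, one missing value.
RAW_CSV = """Date,Water Temp (C),Chl-a (ug/L)
03/06/2024,19.1,6.0
01/06/2024,18.2,5.1
02/06/2024,,5.4
"""

# Hypothetical standardized column names (not taken from the module).
COLUMNS = ["datetime", "water_temp_c", "chla_ugl"]


def standardize(raw_text):
    """Return rows with ISO dates, numeric values, and simple column names."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        temp = record["Water Temp (C)"].strip()
        chla = record["Chl-a (ug/L)"].strip()
        if not temp or not chla:
            # Drop incomplete rows; interpolation is another reasonable choice.
            continue
        date = datetime.strptime(record["Date"], "%d/%m/%Y").date()
        rows.append({
            "datetime": date.isoformat(),
            "water_temp_c": float(temp),
            "chla_ugl": float(chla),
        })
    # ISO date strings sort chronologically when sorted as text.
    rows.sort(key=lambda row: row["datetime"])
    return rows


def to_csv(rows):
    """Write the standardized rows back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


standardized = standardize(RAW_CSV)
print(to_csv(standardized))
```

The point of the sketch is the sequence of checks (consistent date format, chronological order, one column per variable, explicit handling of missing values), which tends to be what takes students the most time when they bring their own dataset.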

Teaching Materials:

Teaching Notes and Tips

Important Note to Instructors:

The R Shiny App used in this module is continually being updated, so these module instructions will periodically evolve to account for changes in the code. If you have any questions or have other feedback about this module, please contact the module developers (see "We'd love your feedback" below).

We highly recommend that instructors familiarize themselves with the R Shiny App prior to the lesson so that they are prepared to answer questions about the app's functionality.


Assessment

Students download a Word document from the Introduction page of the R Shiny App, which they use to answer questions embedded throughout the module. This report can be submitted to the instructor for assessment.

  • Activity A: Students assess the performance of the time series model they have trained using environmental data.
  • Activity B: Students process data (either one of the provided datasets or their own dataset) into a standardized format in order to successfully upload it into the R Shiny App, then assess a second time series model trained on their new dataset.
  • Activity C: Students assess and compare several different time series models on the data used in Activity B on out-of-sample predictions.
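
For instructors who want a concrete picture of what comparing models on out-of-sample predictions involves, here is a minimal sketch in plain Python. It is not the app's implementation and does not use the module's models: the data are synthetic, and the two models (a persistence model that predicts the previous observation, and a linear regression on a single hypothetical driver variable) are stand-ins chosen for brevity. The idea it illustrates is that models are fit on an earlier portion of the time series and scored on a later portion that was withheld from fitting.

```python
import math

# Synthetic data (assumption): a seasonal driver and a target that depends on it,
# plus a small alternating perturbation standing in for observation noise.
driver = [10 + 5 * math.sin(2 * math.pi * t / 20) for t in range(60)]
target = [2.0 * d + 1.0 + (0.5 if t % 2 == 0 else -0.5)
          for t, d in enumerate(driver)]

# Chronological split: the first 40 time steps train, the last 20 test.
split = 40


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b


# Fit the driver model using training data only.
a, b = fit_line(driver[:split], target[:split])

test_obs = target[split:]
# Persistence: each prediction is the observation from the previous time step.
persistence_pred = target[split - 1:-1]
# Driver regression: predictions come from the fitted line and the test-period driver.
driver_pred = [a + b * d for d in driver[split:]]

scores = {
    "persistence": rmse(test_obs, persistence_pred),
    "driver_regression": rmse(test_obs, driver_pred),
}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.2f}")
```

In this synthetic case the driver-based model scores better because the target was constructed from the driver. With real environmental data the ranking can go either way, which is the comparison students are asked to interpret in Activity C.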

References and Resources

Optional pre-class readings:

- Tredennick, A. T., Hooker, G., Ellner, S. P., & Adler, P. B. (2021). A practical guide to selecting models for exploration, inference, and prediction in ecology. Ecology, 102(6), e03336. https://doi.org/10.1002/ecy.3336

- Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science (2nd ed.), Chapter 5: Data Tidying. https://r4ds.hadley.nz/data-tidy.html

Module authorship contributions: MEL led the development of all module materials (R Shiny App and associated code, PowerPoint presentations, instructor manual, student handout). The module was conceptualized by MEL, CCC, and MRH. KK provided dataset assistance and all coauthors provided feedback on the module content and materials. Funding was provided by CCC.