High-Frequency Sensor Data Quality Control
Summary
The availability of high-frequency environmental data has increased substantially in recent years. Working with the rapidly growing volume of data from high-frequency sensors requires that ecologists develop a new set of computational skills. In this module, students learn to prepare aquatic high-frequency sensor data for analysis by identifying, correcting, and visualizing common sensor data issues.
Learning Goals
- Discuss the added value of high-frequency sensors.
- Prepare data from high-frequency sensors for analysis and visualization.
- Explain the difference between data quality assurance (QA) and quality control (QC), and when each process is typically performed.
- Explain common quality control issues with sensors.
- Apply standard QC protocols to correct common mistakes.
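To make the last two goals concrete, here is a minimal sketch of two checks that QC workflows commonly include: a range check and a spike check. It is only an illustration, not the module's own code (the module's scripts are written in R). The function names, thresholds, and sample readings are all hypothetical.

```python
def flag_range(values, low, high):
    """Flag readings that are missing or fall outside [low, high]."""
    return [v is None or not (low <= v <= high) for v in values]


def flag_spikes(values, max_step):
    """Flag single-point spikes: readings that differ from BOTH neighbours
    by more than max_step, in the same direction. Endpoints are never flagged."""
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if prev is None or cur is None or nxt is None:
            continue
        d_prev, d_next = cur - prev, cur - nxt
        if abs(d_prev) > max_step and abs(d_next) > max_step and d_prev * d_next > 0:
            flags[i] = True
    return flags


def apply_flags(values, *flag_lists):
    """Replace any reading flagged by at least one check with None (missing)."""
    return [None if any(flags) else v for v, *flags in zip(values, *flag_lists)]


# Hypothetical water temperature readings (deg C) with one spike and one
# sentinel value of the kind some loggers write on sensor failure.
temps = [12.1, 12.2, 25.0, 12.3, 12.4, -999.0, 12.5]

range_flags = flag_range(temps, low=0.0, high=40.0)   # assumed plausible range
spike_flags = flag_spikes(temps, max_step=5.0)        # assumed maximum step
cleaned = apply_flags(temps, range_flags, spike_flags)
```

Keeping the flags separate from the cleaned series makes it possible to plot the data before and after correction and to record why each value was removed.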
Context for Use
This module can be completed in one 3-hour lab period or two 90-minute lecture periods by intermediate or advanced students. Students should be familiar with the R language and environment, including fundamentals such as installing packages. Code for Activities A and B is provided to students. If students are less familiar with R, the instructor may instead run Activities A and B as a code-along with the entire class.
Description and Teaching Materials
Quick overview of the activities in this module:
Activity A: Identify and fix common data formatting issues from automated sensors.
Activity B: Plot, identify, and correct common sensor data quality control issues.
Activity C: Apply proper data QC protocols to a new data set.
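The kind of formatting cleanup described in Activity A might look roughly like the sketch below: parsing timestamps written in more than one format, converting missing-value codes to a true missing value, and dropping duplicated timestamps. This is an illustration in Python using only the standard library, not the module's R script. The column names, timestamp formats, missing-value codes, and inline sample data are hypothetical and do not come from the provided CSV files.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw logger export with mixed timestamp formats,
# a duplicated row, and two different missing-value codes.
RAW = """timestamp,temp_c
2023-05-01 00:00:00,12.1
2023-05-01 00:10:00,NA
05/01/2023 00:20,12.3
2023-05-01 00:10:00,NA
2023-05-01 00:30:00,-9999
"""

TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M")  # assumed formats
MISSING_CODES = {"", "NA", "NaN", "-9999"}               # assumed codes


def parse_time(text):
    """Try each known timestamp format in turn."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {text!r}")


def parse_value(text):
    """Convert a reading to float, mapping missing-value codes to None."""
    text = text.strip()
    return None if text in MISSING_CODES else float(text)


def tidy(raw_text):
    """Return (timestamp, value) pairs sorted by time, keeping the first
    row seen for any duplicated timestamp."""
    seen = {}
    for row in csv.DictReader(io.StringIO(raw_text)):
        seen.setdefault(parse_time(row["timestamp"]), parse_value(row["temp_c"]))
    return sorted(seen.items())


records = tidy(RAW)
for when, value in records:
    print(when.isoformat(sep=" "), value)
```

Raising an error on an unrecognised timestamp, rather than silently skipping the row, is a deliberate choice here: formatting problems are easier to fix once they are visible.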
Workflow of this module:
1. Students should complete any readings before class.
2. Instructor will begin with a brief overview of environmental sensors using the provided PowerPoint.
3. Students can then work through the provided document to complete the activities.
Teaching Materials
sensor_qc_instructor.pptx (PowerPoint 2007 (.pptx) 4.2MB Jul29 25)
sensor_qc_module_student_without_answers.docx (Microsoft Word 2007 (.docx) 86kB Jul29 25)
JP_TS_001_2023-05-01_2023-06-06_2.csv (Comma Separated Values 1.5MB Jul29 25)
TI_VPdata_ProjectEddie.csv (Comma Separated Values 2MB Jul29 25)
sensor_qc_instructors.docx (Microsoft Word 2007 (.docx) 81kB Jul29 25)
ProjectEddie_Read_WestBrook2023.R (R script 11kB Jul29 25)
Teaching Notes and Tips
We recommend instructors review the sensor_qc_instructors document before teaching this module.
Note that R script files and answer sheets are provided.
There are no required readings, but the following book chapter may be useful:
Rose, Kevin C., Christopher G. McBride, and Vincent W. Moriarty. "Creating and Managing Data From High-Frequency Environmental Sensors." (2022). Encyclopedia of Inland Waters (Second Edition) Volume 4, Pages 549-569. DOI: 10.1016/B978-0-12-819166-8.00197-3
Online at: https://www.sciencedirect.com/science/article/abs/pii/B9780128191668001973?via%3Dihub
The instructor may choose to add their own readings.
Assessment
| Phase | Functions | Examples from this module |
| Engagement | Introduce the value of high-frequency sensors and the added challenges when using this data. | Introduction of topics via lecture by instructor. |
| Exploration | Activity A: Engage students in the structure of high-frequency data and investigate common formatting errors in a real data set. | Students will use R to identify, fix, and visualize common data formatting and structure issues that arise with sensor data, using hints and code provided in the student Word document. |
| Explanation | Activity B: Engage students in the sensor data correction process. | Students will use hints and code from the student document to correct faulty sensor data. Students will create a "before and after" plot to visualize how they corrected the data. |
| Expansion | Activity C: Demonstrate ability to correct sensor data with a new data set on their own. | Students will use what they learned in Activities A and B to correct a different high-frequency data set. Background information is provided in the PowerPoint. Students will then visually identify when Lake George turned over using dissolved oxygen and temperature sensor data. |
| Evaluation | Assess students' understanding. | Students will submit their Word document for evaluation by the instructor. |
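In Activity C, students identify turnover visually from plots. As a possible numerical cross-check, the sketch below looks for the first time the difference between surface and bottom temperature stays small for several consecutive readings, relying on the fact that a mixed water column is close to isothermal. This is not the module's method. The 1.0 °C threshold, the run length, the function name, and the sample values are assumptions chosen only for illustration.

```python
def first_mixed_index(surface, bottom, threshold=1.0, run_length=3):
    """Return the index where |surface - bottom| first stays below
    `threshold` for `run_length` consecutive readings, or None if it
    never does. Missing readings (None) break a run."""
    run = 0
    for i, (s, b) in enumerate(zip(surface, bottom)):
        if s is not None and b is not None and abs(s - b) < threshold:
            run += 1
            if run == run_length:
                return i - run_length + 1
        else:
            run = 0
    return None


# Hypothetical surface and bottom temperatures (deg C) during autumn cooling.
surface = [18.0, 16.5, 14.0, 11.2, 10.6, 10.1, 9.8]
bottom = [9.0, 9.1, 9.2, 10.5, 10.0, 9.7, 9.5]

onset = first_mixed_index(surface, bottom)
print("first reading of the mixed period:", onset)
```

Requiring several consecutive readings below the threshold guards against a single noisy or uncorrected value being mistaken for turnover, which is one reason QC should come before this kind of analysis.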
References and Resources
This module was supported by US National Science Foundation grants 2048031 and 1754265 and the Jefferson Project. The Jefferson Project is a multi-disciplinary research effort founded by Rensselaer Polytechnic Institute, IBM, and the Lake George Association. The Jefferson Project was supported by US National Science Foundation grant 1625044 and a New York State Higher Education Capital grant (#7290).
Data used in the following examples comes from the Jefferson Project.
The Jefferson Project operates a network of sensors and sensor platforms on Lake George, NY, USA. Sensor platforms include vertical profilers, which raise and lower sensors through the water column of the lake approximately once per hour, stream monitoring systems, and stand-alone sensors in other locations.
Data in this module has been modified for teaching purposes. For any research investigation, obtain the original data from the link below.
Click here to see a list of publicly available data from The Jefferson Project