Data Management
Initial Publication Date: July 27, 2016
Introduction
Students will organize data in a format that allows for data manipulation and analysis using a programming language, such as Python or R.
Conceptual Outcomes
Students will understand basic data management principles, such as file organization and consistent data structure.
Practical Outcomes
Students will be able to use Excel to organize data.
Time Required
30 minutes
Computing/Data Inputs
Computing/Data Outputs
Hardware/Software Required
Excel
Instructions
Although we downloaded data from several locations along the river, we're going to treat the data as having been collected from three river reaches: estuarine, transitional, and upstream.
- Estuarine = jacksonville_fl_32226_usa and st._johns_river_at_jacksonville_fl
- Transitional = palatka_fl_32177_usa
- Upstream = fort_mccoy_fl_32134_usa and st._johns_river_at_astor_f
- The default file names may be too long to open in Excel. Shorten the file names without removing any useful information. For example, change nwisuv-st_johns_r_dames_point_bridge_at_jacksonville__fl-dissolved_oxygen__water__unfiltered__milligrams_per_liter.csv to estuarine-DO_mgL.csv. Do this for all of the files.
- So that we all have the same directory structure, please rearrange your files so that they mirror the organization shown below:
[Image: directory structure for the renamed data files]
- For this analysis, we only need (1) data measurements and (2) corresponding time of collection, but a lot of extra information is included within each file. Open each file in Excel, and delete all file contents EXCEPT the values ("Value") and time of collection ("LocalTimestamp"). Do this for all 9 data files. Once cleaned, the content of each file should be structured as follows:
[Image: example of a cleaned file containing only the "LocalTimestamp" and "Value" columns]
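For readers who later want to script these steps instead of repeating them by hand in Excel, the renaming and cleaning can be sketched in Python. This is a minimal sketch, not part of the original activity. It assumes that each downloaded CSV has its column header on the first line and that the header contains columns named exactly "LocalTimestamp" and "Value". The names REACH_BY_SITE, short_name, and keep_time_and_value are hypothetical, and the site identifiers are copied exactly as they appear in the list above. Writing the timestamp column first is also an assumption.

```python
import csv
from pathlib import Path

# Reach assignments copied from the list above (identifiers kept exactly as written there).
REACH_BY_SITE = {
    "jacksonville_fl_32226_usa": "estuarine",
    "st._johns_river_at_jacksonville_fl": "estuarine",
    "palatka_fl_32177_usa": "transitional",
    "fort_mccoy_fl_32134_usa": "upstream",
    "st._johns_river_at_astor_f": "upstream",
}

# The only two columns the analysis needs (column order here is an assumption).
KEEP_COLUMNS = ["LocalTimestamp", "Value"]


def short_name(site: str, variable: str) -> str:
    """Build a short file name such as 'estuarine-DO_mgL.csv'.

    Two sites share some reaches, so check for name collisions before renaming.
    """
    return f"{REACH_BY_SITE[site]}-{variable}.csv"


def keep_time_and_value(src: Path, dst: Path) -> int:
    """Copy only the LocalTimestamp and Value columns from src to dst.

    Returns the number of data rows written.
    """
    with src.open(newline="") as fin, dst.open("w", newline="") as fout:
        reader = csv.DictReader(fin)
        missing = [c for c in KEEP_COLUMNS if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"{src} is missing columns: {missing}")
        # extrasaction="ignore" drops every column not listed in KEEP_COLUMNS.
        writer = csv.DictWriter(fout, fieldnames=KEEP_COLUMNS, extrasaction="ignore")
        writer.writeheader()
        rows = 0
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

A call such as keep_time_and_value(Path("raw.csv"), Path(short_name("palatka_fl_32177_usa", "DO_mgL"))) would write transitional-DO_mgL.csv, where "raw.csv" stands in for one of the downloaded files. Writing to a new file leaves the original download untouched, so a cleaning mistake can be undone.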