Data Management

Natalie Nelson, University of Florida

Introduction

Students will organize data in a format that allows for data manipulation and analysis using a programming language, such as Python or R.

Conceptual Outcomes

Students will understand basic data management principles, such as file organization and consistent data structure.

Practical Outcomes

Students will be able to use Excel to organize data.

Time Required

30 minutes

Computing/Data Inputs



Computing/Data Outputs

Hardware/Software Required

Excel

Instructions

Although we downloaded data from several locations along the river, we're going to treat the data as having been collected from three river reaches: estuarine, transitional, and upstream.

  • Estuarine = jacksonville_fl_32226_usa and st._johns_river_at_jacksonville_fl
  • Transitional = palatka_fl_32177_usa
  • Upstream = fort_mccoy_fl_32134_usa and st._johns_river_at_astor_f
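
The grouping above can also be written down as a small lookup table, which may be handy once the data are read into Python. This is only an illustrative sketch: the names REACH_BY_SITE, reach_for, and sites_in are hypothetical, and the site names are copied exactly as they appear in the list above.

```python
# Site names are copied exactly as they appear in the list above.
REACH_BY_SITE = {
    "jacksonville_fl_32226_usa": "estuarine",
    "st._johns_river_at_jacksonville_fl": "estuarine",
    "palatka_fl_32177_usa": "transitional",
    "fort_mccoy_fl_32134_usa": "upstream",
    "st._johns_river_at_astor_f": "upstream",
}


def reach_for(site):
    """Return the reach label assigned to a site name."""
    try:
        return REACH_BY_SITE[site]
    except KeyError:
        raise KeyError(f"no reach assigned to site {site!r}") from None


def sites_in(reach):
    """Return the site names grouped under a reach, in alphabetical order."""
    return sorted(site for site, r in REACH_BY_SITE.items() if r == reach)
```

For instance, reach_for("palatka_fl_32177_usa") looks up the reach for the Palatka site, and sites_in("estuarine") lists the two Jacksonville sites.
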
Let's get to organizing!
  1. The default file names may be too long to open in Excel. Shorten each file name while keeping the useful information: the river reach, the variable measured, and its units. For example, rename nwisuv-st_johns_r_dames_point_bridge_at_jacksonville__fl-dissolved_oxygen__water__unfiltered__milligrams_per_liter.csv to estuarine-DO_mgL.csv. Do this for all of the files.
  2. So that we all have the same directory structure, please rearrange your files so that they mirror the organization shown below:
  3. For this analysis, we need only (1) the measurements and (2) the time each one was collected, but each file includes a lot of extra information. Open each file in Excel and delete everything EXCEPT the values ("Value") and the time of collection ("LocalTimestamp"). Do this for all 9 data files. Once cleaned, the content of each file should be structured as follows:
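
As a scripted alternative to doing steps 1 and 3 by hand in Excel, the following is a minimal Python sketch; it is not part of the original exercise. It assumes each downloaded file is a plain CSV whose first row is a header containing the columns "LocalTimestamp" and "Value"; any lines above the header would need to be removed first. The function names short_name and keep_time_and_value are hypothetical.

```python
import csv
from pathlib import Path

# Column names quoted in step 3; every other column is dropped.
KEEP = ["LocalTimestamp", "Value"]


def short_name(reach, measurement):
    """Build a short file name such as 'estuarine-DO_mgL.csv'."""
    return f"{reach}-{measurement}.csv"


def keep_time_and_value(src, dst):
    """Copy the CSV at src to dst, keeping only the KEEP columns.

    Assumes the first row of src is the header row.
    Returns the number of data rows written.
    """
    src, dst = Path(src), Path(dst)
    # utf-8-sig tolerates the byte-order mark that some exports add.
    with src.open(newline="", encoding="utf-8-sig") as fin:
        reader = csv.DictReader(fin)
        missing = [c for c in KEEP if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"{src.name} is missing columns: {missing}")
        with dst.open("w", newline="", encoding="utf-8") as fout:
            # extrasaction="ignore" silently skips the unwanted columns.
            writer = csv.DictWriter(fout, fieldnames=KEEP, extrasaction="ignore")
            writer.writeheader()
            n = 0
            for row in reader:
                writer.writerow(row)
                n += 1
    return n
```

Writing to a new file named with short_name("estuarine", "DO_mgL") keeps the original download untouched, so the clean-up can be repeated if something goes wrong.
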

Additional Activities and Variants

Related Steps