Analyzing GPS time series data in Central California
Summary
Learning Goals
Learning goals:
-Search, download, import and format time series data
-Visually inspect data and give a preliminary interpretation
-Perform a linear regression
-Analyze the spectral content of the signal using Matlab's Signal Processing toolbox
-Interpret the spectral content in a geologic context.
Higher-order skills: computation, data analysis
Context for Use
Description and Teaching Materials
In-class activity: Time series analysis with GNSS data
This activity is designed for advanced undergraduate Geology students in a Data Analysis course, to introduce them to time series analysis. No programming experience is assumed, but students are expected to have completed the Matlab Onramp and Core Matlab Skills online modules. They have also learned the basics of probability distributions, confidence intervals, and linear regression.
For this exercise you will look at cumulative vertical displacement at a GPS station in the Central Valley of California, where rapid groundwater withdrawal is known to cause subsidence of the surface. The goal is to (a) estimate the average subsidence velocity based on the cumulative displacement time series using linear regression and (b) analyze cyclic behavior in the displacement signal using autospectral analysis. The activity can be adapted to any type of time series data that has both cyclic behavior and secular trends.
Step 1: locate, download, import and plot the data
1. Go to geodesy.unr.edu
Under the "quick links" you can click on "browse GPS stations" and look for the station near Mendota, California (it should be station P304). Alternatively, you can click the link "All stations processed" and scroll down to P304.
2. Click on station P304 (in Mendota, California). This will show you a snapshot of displacement vs time in East, North and Up coordinates, relative to the reference frame listed at the top (in this case IGS 14). Click on the map, and it will take you to a data download page. You can check out the different readme files to see the options of format available for downloading. Use the "tenv3" format, which is a space-delimited ascii file that includes motion in the "East, North, Up" coordinate system. Download the readme file for tenv3 format as a reference, so you know what each column contains. Since it is just an ascii file in matrix format, you can change its extension to .txt so Matlab can recognize it.
3. Import the data to Matlab.
There are a number of ways to do this, depending on the structure of the file and data you want to import. Use the "readtable" function, which will import mixed data types in a table format (the file includes strings as well as numbers). Import your data and give it a name
data_file = 'your-data-file-name.txt'
p304_data = readtable(data_file)
4. Display the first row of your data: p304_data(1, :). The columns should have headers, which you can refer to to index your data. . If you type p304_data.yyyy_yyyy(1:3), you will get the first 3 values of date in decimal years.
5. Using the readme file as a reference, select the variables you are going to use in order to analyze your "up" motion time series, and assign them names.
You will need: the column with "decimal year", and the ones with "integer part of vertical" and "fractional part of vertical(m)". (Note: you should only really need the decimal parts, but keep the integer ones just in case).
Example: I'm going to take the column named "yyyy_yyyy" in my data and call it "X", and the decimal part of my cumulative displacement (in mm) "Y"
X = p304_data.yyyy_yyyy;
Y = p304_data.x__up;
6. Make a clear, labelled plot of your data, with title.
Step 2: Find the long-term trend in the data, and remove it from your signal
1. Use Matlab's "regress" function to find the best-fit linear trend in the data. Plot the trend on the same plot as the data.
[b,bint] = regress(y,X) returns a vector b of regression coefficients, and a matrix bint of 95% confidence intervals for the regression coefficients.
2. Using the regression coefficients, subtract the trend line from the data, and plot the residuals.
(Note that you can also obtain the residuals directly from the regress function using
[b,bint, r] = regress(Y,X). r is a vector of residuals)
3. Calculate the average velocity of subsidence at this station over the measurement period of the station.
Step 3: Analyze the cyclic behavior of the displacement time series
In the residuals, you should be able to see some cyclic components of the data. You will now apply a Welsh spectral analysis using the Matlab function periodogram.
First specify the arguments you will use for the periodogram, including the window and number of points to use in the discrete Fourier transform:
window = []; (default window: a rectangular window)
nfft = length(X);
fs = 1; %sampling frequency
[Pxx, f] = periodogram(Y, window, nfft, fs)
magnitude = abs(Pxx);
plot(f, magnitude)
xlabel('Frequency')
ylabel('Power')
title('Power Spectral Density Estimate')
4. Discuss your results: Which frequencies are visible above the noise floor? What periodic processes may explain the cyclic behavior of surface deformation at each of these frequencies?