Connecting Physical and Biological Ocean Parameters: WOCE data analysis using Matlab

Felix Jose, Florida Gulf Coast University, Marine & Earth Sciences

Author Profile
Initial Publication Date: October 28, 2025

Summary

Ocean is a coupled physical-biological system. The physical structure of the water column, particularly its vertical stratification, acts as a primary control on biological activity. Strong stratification (the "pycnocline") serves as both a barrier to vertical mixing and a critical interface for life. It isolates the sunlit euphotic zone from the nutrient-rich deep water, and in doing so, creates distinct vertical niches for phytoplankton and shapes the distribution of dissolved oxygen.

This project will leverage the high-quality, standardized NetCDF dataset from the World Ocean Circulation Experiment (WOCE) and conduct all analysis within the MATLAB environment. We will move beyond simple profile analysis to quantify and correlate the physical drivers of stratification with the resulting biological signatures.

Share your modifications and improvements to this activity through the Community Contribution Tool »

Learning Goals

Core Research Questions

  1. Physical Structure: How do water column density and static stability (stratification) vary vertically and horizontally along the WOCE cruise track?
  2. Biological Features: What is the vertical and horizontal distribution of key biological indicators (Chlorophyll-a and Dissolved Oxygen)?
  3. The Physical-Biological Link: What is the quantitative relationship between the depth and strength of the pycnocline and the depth and magnitude of the Deep Chlorophyll Maximum (DCM) and the oxygen minimum/maximum layers?

Learning Goals

  • Proficiency in Data handling: Confidently load, parse, and manage complex, multi-dimensional scientific datasets in NetCDF format within the MATLAB environment.
  • Apply Oceanographic Theory: Apply first principles of physical oceanography to process raw data (T, S, P) into advanced derived variables (e.g., Potential Density, Buoyancy Frequency) using the standardized TEOS-10 GSW toolbox.
  • Develop Computational Skills: Develop and implement custom MATLAB scripts to programmatically analyze data, including algorithms to automatically detect and extract key oceanographic features (e.g., pycnocline depth, DCM depth) from vertical profiles.
  • Visualize & Analyze Data: Construct and interpret complex 2D visualizations (e.g., contourf plots) to analyze the spatio-temporal structure of oceanographic transects.
  • Synthesize & Evaluate: Move beyond simple description to statistically correlate physical and biological datasets, allowing for a quantitative evaluation of the core research hypothesis (i.e., that physical stratification controls biological feature depth).
  • Communicate Scientific Findings: Clearly articulate a complex physical-biological relationship using a combination of data-driven figures, statistical evidence, and written analysis in a formal scientific report.

Context for Use

This project is highly relevant as it moves beyond a simple inventory of ocean properties to explore the complex connection between the physical and biological realms. Students will investigate precisely how water column stability acts as the ocean's physical "architecture," creating distinct layers and barriers. This physical architecture directly governs the biogeochemical landscape, dictating the depth of the Deep Chlorophyll Maximum and the distribution of dissolved oxygen. The pedagogical context of this project is therefore twofold: first, to explore this critical scientific link, and second, to demonstrate MATLAB's power in unraveling this complexity. By leveraging MATLAB, students learn to manage and analyze large-scale, real-world WOCE NetCDF data, moving from raw measurements to derived physical variables. Ultimately, they will use the platform to computationally test their hypotheses and create powerful visualizations that make the invisible link between stratification and life clearly visible.

Description and Teaching Materials

5. Methodology & Analysis Plan

This project will be executed in three phases, entirely within the MATLAB environment.

Phase 1: Data Preparation & Physical Analysis

The goal of this phase is to define the physical "template" of the ocean.

  1. NetCDF Data Ingestion:
    • Use MATLAB's built-in functions to inspect and load the data.
    • ncinfo('filename.nc'): To inspect the file structure, variable names, and dimensions.
    • ncread('filename.nc', 'variable_name'): To extract variables like temp, psal, pres, etc., into MATLAB matrices.
  2. Derived Variable Calculation:
    • This must be done using the MATLAB version of the TEOS-10 GSW Oceanographic Toolbox (which must be downloaded and added to the MATLAB path).
    • Key functions:
      • gsw_SA_from_SP: To calculate Absolute Salinity from Practical Salinity.
      • gsw_CT_from_t: To calculate Conservative Temperature from in-situ temp.
      • gsw_sigma0: To calculate Potential Density Anomaly (referenced to 0 dbar).
      • gsw_Nsquared: To calculate Buoyancy Frequency ($N^2$), the direct measure of stability.
  3. Vertical Profile Analysis:
    • For 2-3 representative casts, create a 4-panel "stacked plot" using MATLAB's subplot and plot functions to visualize the vertical structure of $\sigma_0$, $N^2$, chla, and doxy vs. pres.
  4. Transect Analysis:
    • Generate 2D matrices of each variable (e.g., temp_transect, N2_transect), where columns represent individual casts and rows represent pressure levels.
    • Create 2D contour plots using contourf (filled contour) or pcolor. The x-axis will be Longitude (or distance), and the y-axis will be Pressure.
    • Example: contourf(longitude, pressure, N2_transect, 'LevelStep', 1e-5)
    • Self-Correction: Use set(gca, 'YDir', 'reverse') to ensure pressure (depth) increases downwards.

Phase 2: Biological Feature Analysis

The goal of this phase is to map the biological response onto the physical template.

  1. Transect Analysis:
    • Create 2D contour plots for chla and doxy using contourf as in Phase 1.
    • Visual Analysis: Overlay the main pycnocline (e.g., the 27.5 $\sigma_0$ isopycnal) onto the biological plots using hold on and the contour function (not contourf).
  2. Feature Extraction:
    • Write a MATLAB script (or function) that loops through every cast (i.e., each column of your data matrices).
    • Inside the loop, use the max function to find the peak value and its index:
      • [max_N2, N2_index] = max(N2_profile);
      • [max_Chl, Chl_index] = max(Chl_profile);
    • Use the returned index to find the corresponding pressure of that peak:
      • pycnocline_depth = pressure(N2_index);
      • DCM_depth = pressure(Chl_index);
    • Store these depths in a new vector for correlation.

Phase 3: Synthesis & Interpretation

This phase directly links the physics and biology using the data from 2.2.

  1. Co-location Plots (Scatter Plots):
    • Use MATLAB's scatter function to plot:
      • scatter(pycnocline_depth_vector, DCM_depth_vector)
    • Add a 1:1 line with hold on; plot(xlim, xlim, 'r--'); to test for co-location.
    • Plot Depth of DO Max vs. Depth of DCM.
  2. Property-Property Plots:
    • Use scatter to plot Chlorophyll vs. Buoyancy Frequency ($N^2$) for all data points. This will show if high Chl values are only found in highly stable (high $N^2$) water.
    • Plot Dissolved Oxygen vs. Potential Density ($\sigma_0$) to show how DO content relates to specific density layers.
  3. Final Interpretation:
    • Synthesize the plots to answer the research questions, explaining the physical-biological coupling.





Teaching Notes and Tips

Instructors can assign matlab onramp modules on intro to Matlab as homework prior to assigning this class project.
Cruise data from temperate and sub-Arctic geographic locations are recommended as the stability of the water column depends heavily on salinity distribution, rather than temperature, which is the case for tropics and sub-tropics.


Assessment

Project Assessment

This project is a major component of the course grade and is designed to assess your ability to integrate scientific concepts with technical skills. Your grade will be based on three components: the functionality and quality of your MATLAB code, the scientific rigor and clarity of your final report, and your ability to communicate your findings in an oral presentation. Each component will be graded on a 10-point scale and weighted as follows.

Grading Distribution:

  • Final Project Report: 50%
  • MATLAB Scripts (Functionality, Clarity, and Commenting): 30%
  • Final Presentation: 20%

 

Assessment Rubric (10-Point Scale)

 

Criterion Unsatisfactory / Developing (1-5) Proficient / Target (6-8) Exemplary (9-10)
MATLAB Implementation & Code Quality (1-3) Code is incomplete, non-functional, has significant errors, or is extremely difficult to follow. (4-5) Code runs, but is inefficient, poorly structured, has minimal comments, or uses the GSW toolbox incorrectly. (6-8) Code is complete, correct, and achieves all project objectives. GSW toolbox is used properly. Commenting is sufficient to follow the logic. This is the target for a successful project. (9-10) Code is efficient, elegant, and well-structured (e.g., using functions). Commenting is thorough. Demonstrates a mastery of MATLAB beyond the basics.
Data Analysis & Visualization (in Report) (1-3) Plots are incorrect, uninterpretable, or largely absent. (4-5) Some plots are missing, inappropriate for the data, or poorly labeled (e.g., missing units, no titles, unreadable fonts). (6-8) All required plots (profiles, transects, scatter) are present, correct, and clearly labeled with units, axes, and titles. They effectively support the report's main arguments. (9-10) All visualizations are professional, clear, and insightful. Overlays and plot choices (e.g., color maps) strongly enhance the scientific story.
Scientific Interpretation & Synthesis (1-3) Interpretation is incorrect or demonstrates a fundamental misunderstanding of stability. (4-5) States the relationship (e.g., "DCM is at the pycnocline") but fails to explain the why using physical principles. Interpretation is superficial. (6-8) Correctly identifies and describes the physical-biological relationship. Links the $N^2$ peak to the DCM and DO features. Explanation of the mechanisms is correct but may be general. (9-10) Provides a deep, insightful analysis of the mechanisms (light/nutrient trade-off, respiration). Critically and quantitatively links $N^2$ to the biological features.
Final Report (Structure & Writing) (1-3) Fails to follow format; writing is incomprehensible. (4-5) Missing key sections (e.g., Methods, Discussion). Writing is unclear. Figures are included as "data dumps" with no in-text discussion. (6-8) Follows a professional scientific format (Abstract, Intro, etc.). Writing is clear and all figures are referenced in the text. Minor grammatical or formatting issues may be present. (9-10) Follows all formatting guidelines perfectly. Writing is clear, concise, and professional. All figures are referenced and integrated seamlessly into a compelling narrative.
Oral Presentation (1-3) Unprepared, missing key project components, or unable to answer basic questions. (4-5) Disorganized, unclear, or reads directly from slides. Figures are difficult to read (e.g., "data-dump" plots). Struggles to answer questions. (6-8) Clear, organized, and well-timed presentation that covers all required topics. Figures are legible. Answers audience questions correctly and confidently. (9-10) Polished, professional, and engaging. Uses clear visuals to tell a compelling scientific story. Answers audience questions with depth and insight.

References and Resources

A code for students  to customize and adapt for their need

%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%  PURPOSE: This script loads a WOCE NetCDF file, calculates derived
%           physical variables (sigma-0, N^2), and analyzes the
%           relationship between physical stratification (pycnocline) and
%           biological features (DCM, Oxygen).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

clear; clc; close all;

%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  PHASE 1: SETUP & DATA LOADING
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% --- 1a. Add GSW Toolbox Path ---
% You MUST change this to the directory where you unzipped the GSW toolbox
gsw_toolbox_path = 'C:\Users\YourName\Documents\MATLAB\gsw_matlab_v3_06_16';
addpath(genpath(gsw_toolbox_path));

% --- 1b. Check for GSW Toolbox ---
if ~exist('gsw_SA_from_SP', 'file')
   error(['GSW Toolbox not found. Please edit "gsw_toolbox_path" ' ...
          'in this script to point to your GSW toolbox directory.']);
end

% --- 1c. Define File and Load Data ---
netcdf_file = 'woce_data.nc'; % CHANGE THIS to your NetCDF file name

% --- 1d. Inspect the file (run this once to learn your variable names) ---
% ncinfo(netcdf_file)

% --- 1e. Load Core Variables ---
% ASSUMPTION: 'pres' is 1D (n_levels x 1)
% ASSUMPTION: 'lat' and 'lon' are 1D (1 x n_casts)
% ASSUMPTION: 'temp', 'psal', etc. are 2D (n_levels x n_casts)
fprintf('Loading data from %s...\n', netcdf_file);
try
   pres = ncread(netcdf_file, 'pres'); % Pressure (dbar)
   lat = ncread(netcdf_file, 'lat');   % Latitude
   lon = ncread(netcdf_file, 'lon');   % Longitude
   
   psal = ncread(netcdf_file, 'psal'); % Practical Salinity (psu)
   temp = ncread(netcdf_file, 'temp'); % In-situ Temperature (degC)
   doxy = ncread(netcdf_file, 'doxy'); % Dissolved Oxygen (umol/kg)
   chla = ncread(netcdf_file, 'chla'); % Chlorophyll (mg/m^3)
catch ME
   disp('ERROR: Could not read NetCDF variables.');
   disp('Please check the "netcdf_file" path and variable names inside the script.');
   rethrow(ME);
end

% Get dimensions for loops
[num_levels, num_casts] = size(psal);
fprintf('Data loaded: %d casts and %d pressure levels.\n', num_casts, num_levels);

% --- 1f. Create 2D Coordinate Matrices for GSW functions ---
% GSW functions need lat/lon for *every* data point (i.e., a 2D matrix)
[lon_2d, pres_2d] = meshgrid(lon, pres);
[lat_2d, ~]       = meshgrid(lat, pres);


%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  PHASE 2: CALCULATE DERIVED PHYSICAL VARIABLES
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
fprintf('Calculating derived variables...\n');

% --- 2a. Calculate Absolute Salinity (SA) and Conservative Temp (CT) ---
SA = gsw_SA_from_SP(psal, pres_2d, lon_2d, lat_2d);
CT = gsw_CT_from_t(SA, temp, pres_2d);

% --- 2b. Calculate Potential Density Anomaly (sigma-0) ---
% This is the density referenced to the sea surface (0 dbar)
sigma0 = gsw_sigma0(SA, CT);

% --- 2c. Calculate Buoyancy Frequency (N^2) ---
% This is the key variable for stratification!
% N^2 is calculated at the midpoints between pressure levels.
[N2, p_mid] = gsw_Nsquared(SA, CT, pres_2d, lat_2d);
% N2 will have one less level (row) than sigma0


%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  PHASE 3: PLOTTING - STACKED VERTICAL PROFILES
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Plot a single representative cast to see the vertical structure
fprintf('Generating profile plots...\n');

cast_to_plot = round(num_casts / 2); % Plot the middle cast

figure('Name', ['Vertical Profiles for Cast ' num2str(cast_to_plot)]);

% Panel 1: Sigma-0 (Density)
subplot(1, 4, 1);
plot(sigma0(:, cast_to_plot), pres);
set(gca, 'YDir', 'reverse'); % Puts 0 dbar (surface) at the top
xlabel('Potential Density (\sigma_0, kg/m^3)');
ylabel('Pressure (dbar)');
title(['Cast ' num2str(cast_to_plot)]);
grid on;

% Panel 2: N^2 (Stability)
subplot(1, 4, 2);
plot(N2(:, cast_to_plot), p_mid); % Note: use p_mid for N2
set(gca, 'YDir', 'reverse');
xlabel('Buoyancy Freq. (N^2, s^{-2})');
title('Stability (Pycnocline)');
grid on;

% Panel 3: Chlorophyll
subplot(1, 4, 3);
plot(chla(:, cast_to_plot), pres);
set(gca, 'YDir', 'reverse');
xlabel('Chlorophyll (mg/m^3)');
title('Chlorophyll (DCM)');
grid on;

% Panel 4: Dissolved Oxygen
subplot(1, 4, 4);
plot(doxy(:, cast_to_plot), pres);
set(gca, 'YDir', 'reverse');
xlabel('Dissolved Oxygen (\mu mol/kg)');
title('Oxygen');
grid on;


%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  PHASE 4: PLOTTING - 2D TRANSECT CONTOUR PLOTS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Visualize the entire cruise track
fprintf('Generating 2D transect plots...\n');

% --- 4a. Plot Sigma-0 Transect ---
figure('Name', 'Transect: Potential Density');
contourf(lon, pres, sigma0, 20, 'LineStyle', 'none'); % 20 contour levels
set(gca, 'YDir', 'reverse');
datetick('x', 'keeplimits'); % Use if lon is datenum, otherwise just use xlabel
xlabel('Longitude');
ylabel('Pressure (dbar)');
title('Potential Density (\sigma_0) Transect');
colorbar; % Adds the color legend
cmocean('dense'); % Optional: for better colormaps (download from File Exchange)


% --- 4b. Plot N^2 (Stability) Transect ---
figure('Name', 'Transect: Buoyancy Frequency (N^2)');
% We plot N2 vs p_mid. We also limit the color scale with 'clim'
% because N^2 can have extreme surface values.
h = pcolor(lon, p_mid, N2);
set(h, 'EdgeColor', 'none'); % Removes grid lines from pcolor
set(gca, 'YDir', 'reverse');
xlabel('Longitude');
ylabel('Pressure (dbar)');
title('Buoyancy Frequency (N^2) Transect');
colorbar;
cmocean('balance');
clim([-1e-4 1e-4]); % Adjust this range to see features


% --- 4c. Plot Chlorophyll Transect (with Density Overlay) ---
figure('Name', 'Transect: Chlorophyll & Density');
contourf(lon, pres, chla, 20, 'LineStyle', 'none');
set(gca, 'YDir', 'reverse');
xlabel('Longitude');
ylabel('Pressure (dbar)');
title('Chlorophyll (color) & Density (lines) Transect');
h_bar = colorbar;
ylabel(h_bar, 'Chlorophyll (mg/m^3)');
cmocean('algae');
clim([0 1.5]); % Adjust this color limit to highlight the DCM

% Hold on to overlay density contours
hold on;
% Plot isopycnals (lines of constant density)
isopycnals = 26:0.2:28; % Define which density lines to draw
[C, h] = contour(lon, pres, sigma0, isopycnals, 'k-'); % 'k-' is black solid line
clabel(C, h, 'Color', 'w'); % Label the contour lines in white
hold off;


% --- 4d. Plot Dissolved Oxygen Transect ---
figure('Name', 'Transect: Dissolved Oxygen');
contourf(lon, pres, doxy, 20, 'LineStyle', 'none');
set(gca, 'YDir', 'reverse');
xlabel('Longitude');
ylabel('Pressure (dbar)');
title('Dissolved Oxygen (\mu mol/kg) Transect');
colorbar;
cmocean('oxy');


%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  PHASE 5: FEATURE EXTRACTION (PYCNOCLINE & DCM DEPTHS)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This is the core analysis: numerically find the depths of the main
% features for every single cast.
fprintf('Extracting feature depths...\n');

% Pre-allocate vectors to store the depths
pycnocline_depth = NaN(1, num_casts);
DCM_depth        = NaN(1, num_casts);

for i = 1:num_casts
   % Get the profile for this one cast
   N2_profile = N2(:, i);
   Chl_profile = chla(:, i);
   
   % Find the depth of the pycnocline (max N^2)
   % We limit the search to below 10m to avoid surface noise
   valid_N2_indices = find(p_mid > 10);
   if ~isempty(valid_N2_indices)
       [~, max_N2_idx] = max(N2_profile(valid_N2_indices));
       % Store the pressure (depth) of that max value
       pycnocline_depth(i) = p_mid(valid_N2_indices(max_N2_idx));
   end
   
   % Find the depth of the DCM (max Chlorophyll)
   % We limit the search to below 10m
   valid_Chl_indices = find(pres > 10);
    if ~isempty(valid_Chl_indices)
       [~, max_Chl_idx] = max(Chl_profile(valid_Chl_indices));
       % Store the pressure (depth) of that max value
       DCM_depth(i) = pres(valid_Chl_indices(max_Chl_idx));
    end
end


%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  PHASE 6: SYNTHESIS & INTERPRETATION PLOTS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
fprintf('Generating synthesis plots...\n');

% --- 6a. Correlation Plot: Pycnocline Depth vs. DCM Depth ---
figure('Name', 'Synthesis: Pycnocline vs. DCM Depth');
scatter(pycnocline_depth, DCM_depth, 'filled');
hold on;

% Add a 1:1 line
max_depth = max([pycnocline_depth, DCM_depth]);
plot([0 max_depth], [0 max_depth], 'r--', 'LineWidth', 2); % Red dashed line

xlabel('Pycnocline Depth (dbar)');
ylabel('DCM Depth (dbar)');
title('Correlation of Feature Depths');
legend('CTD Casts', '1:1 Line', 'Location', 'northwest');
grid on;
axis equal; % Makes the 1:1 line visually correct


% --- 6b. Property-Property Plot: Chl vs. Stability (N^2) ---
figure('Name', 'Synthesis: Chlorophyll vs. Stability');

% Use (:) to "unroll" the 2D matrices into 1D vectors for plotting
scatter(N2(:), chla(:), 10, 'filled', 'MarkerFaceAlpha', 0.1);

xlabel('Buoyancy Freq. (N^2, s^{-2})');
ylabel('Chlorophyll (mg/m^3)');
title('Chlorophyll vs. Water Column Stability');
% Set x-axis limit to see the main cluster (ignore extreme surface N^2)
xlim([-1e-4 5e-4]); 
grid on;