Linear Regression - Practice Problems

Solving Earth science problems with regression


Working with ecological data

Ecology explores relationships between organisms and other living (biotic) things or nonliving (abiotic) components in their environment. Certain factors may impact the abundance, distribution, or physiology of organisms, including abiotic factors such as temperature, moisture, or sunlight, and biotic factors such as the presence of predators or competitors. Linear regressions can be used to quantify these relationships and predict organism responses to various levels of the abiotic or biotic factor.

Problem 1: Ecologists tested the grazing pressure of green crabs on clams. They constructed `1 m^2` cages and planted 300 clams in each cage. Two days later they counted the number of remaining clams and recorded the data in the table below.

Number of Crabs    Number of Clams Remaining
2                  137
4                  70
2                  184
5                  0
4                  35
0                  297
3                  122
5                  1
1                  253
3                  150

Problem 1A: First, perform a linear regression using the step-by-step instructions for calculating `m` (slope) and `b` (intercept) of the regression line. What is the full equation for the regression line that you calculated?

Problem 1B: Next, run the linear regression statistics using Excel's Data Analysis Toolpak. Do the values given for `m` and `b` match the values that you calculated in Problem 1A?
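A short script can serve as a third check alongside the hand calculation and the Excel output. The sketch below assumes the step-by-step instructions use the standard least-squares sum formulas, m = (n Σxy - Σx Σy) / (n Σx² - (Σx)²) and b = (Σy - m Σx) / n. The function name `fit_line` is illustrative and not part of the original exercise.

```python
def fit_line(x, y):
    """Return (m, b) for the least-squares line y = m*x + b.

    Assumes the standard sum formulas; x and y are equal-length lists.
    """
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


# Data from the Problem 1 table
crabs = [2, 4, 2, 5, 4, 0, 3, 5, 1, 3]
clams = [137, 70, 184, 0, 35, 297, 122, 1, 253, 150]

m, b = fit_line(crabs, clams)
print(f"clams remaining = {m:.2f} * crabs + {b:.2f}")
```

The slope is in clams per crab, and the intercept is the predicted count in a cage with no crabs. As a sanity check, the intercept should land close to the 300 clams planted in each cage.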

Determining a standard (calibration) curve

Earth scientists often need to measure concentrations of chemicals in waters, soils, sediments, rocks, or organisms. Often what is measured must be compared to known samples, called standards. This is accomplished by creating a standard (calibration) curve. Standard curves are not usually curves! They are graphs of instrument measurements (on the y-axis) plotted against the known concentrations of a chemical in the standards (on the x-axis). The concentrations and measurements will ideally have a linear relationship, which can be determined by a linear regression. The line equation can then be used to figure out the concentrations of unknown samples that are analyzed by your instrument.

Problem 2A: You want to analyze some stream water samples for copper to see if an active mine is affecting the water quality. You can use an instrument called an atomic absorption spectrometer (AAS) with a light wavelength of 420 nm for this. You create 4 standards with known amounts of copper in them. The AAS then measures how much light is absorbed by each standard. Beer's Law states that the amount of light absorbed (absorbance) is linearly related to the concentration of copper in each standard and sample.

The table below shows the data from your AAS. Analyze the data with a linear regression to determine the line equation for your standard curve.

Concentration (mg/L)    Absorbance
0                       0.003
0.2                     0.033
0.4                     0.065
0.6                     0.098
0.8                     0.125


Problem 2B: You analyzed two water samples for copper with your AAS. The absorbance for the Rabbit Run stream water is 0.114 and the absorbance for the Mill Creek stream water is 0.078. What are the copper concentrations in these samples?
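The two steps of Problem 2 (fit the standard curve, then solve the line equation for concentration) can be sketched in a few lines. This is a minimal illustration that assumes an ordinary least-squares fit of absorbance against concentration. The names `fit_line` and `concentration` are hypothetical helpers, not part of the exercise.

```python
def fit_line(x, y):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


# Standards from the Problem 2A table
conc = [0.0, 0.2, 0.4, 0.6, 0.8]                  # mg/L
absorbance = [0.003, 0.033, 0.065, 0.098, 0.125]

# Standard curve: absorbance = m * concentration + b
m, b = fit_line(conc, absorbance)


def concentration(a):
    """Solve absorbance = m*conc + b for conc (mg/L)."""
    return (a - b) / m


# Unknown samples from Problem 2B
samples = {"Rabbit Run": 0.114, "Mill Creek": 0.078}
results = {name: concentration(a) for name, a in samples.items()}

print(f"absorbance = {m:.4f} * concentration + {b:.4f}")
for name, c in results.items():
    print(f"{name}: {c:.2f} mg/L")
```

Both sample absorbances fall between the lowest and highest standard, so the concentrations are interpolated within the calibrated range rather than extrapolated beyond it.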

Geochemical variation diagrams

Harker diagrams are geochemical variation diagrams commonly used in Earth science to represent the chemical constituents in a rock as a proportion of silica `(SiO_(2))`. Some of these relationships are linear and can be represented with linear regression.

Problem 3: The figure to the right shows some examples of Harker diagrams from Montserrat, Lesser Antilles volcanic arc. You are given some of the data for `CaO` and `SiO_(2)` in the table below. Units are weight percent (wt %).

SiO2 (wt %)    CaO (wt %)
52             8
69             2
56             7
53             8
62             5
74             1
60             5
53             9
47             12
55             9

Problem 3A: First, perform a linear regression using the step-by-step instructions for calculating `m` (slope) and `b` (intercept) of the regression line. What is the full equation for the regression line that you calculated?

Problem 3B: Next, run the linear regression statistics using Excel's Data Analysis Toolpak. Do the values given for `m` and `b` match the values that you calculated in Problem 3A?
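For Problem 3, a script can check the slope and intercept and also report how well the line fits. The sketch below uses the deviation-from-the-mean form of least squares, which is algebraically equivalent to the sum formulas, and computes the coefficient of determination as Sxy² / (Sxx · Syy). Excel's regression output lists this value as "R Square". The function name `fit_line_with_r2` is an assumption made for illustration.

```python
def fit_line_with_r2(x, y):
    """Return (m, b, r2) for the least-squares line y = m*x + b."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sums of squared deviations and cross-products about the means
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    r2 = sxy ** 2 / (sxx * syy)
    return m, b, r2


# Data from the Problem 3 table (wt %)
sio2 = [52, 69, 56, 53, 62, 74, 60, 53, 47, 55]
cao = [8, 2, 7, 8, 5, 1, 5, 9, 12, 9]

m, b, r2 = fit_line_with_r2(sio2, cao)
print(f"CaO = {m:.4f} * SiO2 + {b:.2f}   (R^2 = {r2:.3f})")
```

A negative slope here means `CaO` decreases as `SiO_(2)` increases, which matches the trend visible in the table.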

Changes over time

Many things change over time: trees get taller, Earth's plates move, water evaporates. But are these changes linear? If these things change by a constant amount with each time period, then there's a linear correlation with time as the independent variable. If the quantity increases consistently over time, that's a positive correlation; if it decreases consistently over time, that's a negative correlation. Linear regression is used to model these changes over time and to help us make predictions about past or future time points.

Problem 4A: When glaciers retreat, they leave behind bare land with little to no soil left. Often loose, unconsolidated material called till is left behind and it will slowly become soil. About 11,000 years ago, glaciers retreated from Wisconsin, and soil has been forming since then. In the year 2000 in one remote area, the soil thickness was measured to be 32 inches. Scientists used carbon-14 dating to estimate that the soil thickness after 5000 years was 13 inches and increased to 30 inches after 10,000 years. Using these four time points (call the glacial retreat time 0 and today 11,000 years), determine the line equation that relates time and soil thickness.

Problem 4B: How long does this model predict it would take to form a new inch of soil in this area of Wisconsin?
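Problem 4 can be sketched the same way. The slope of the fitted line is a soil formation rate in inches per year, so its reciprocal is the number of years needed to form one inch. The sketch below assumes a soil thickness of 0 inches at time 0 (the problem states there was little to no soil when the glacier retreated, but gives no measured value), and `fit_line` is an illustrative helper name.

```python
def fit_line(x, y):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


# Years since glacial retreat, and soil thickness in inches.
# ASSUMPTION: thickness is taken as 0 inches at time 0.
years = [0, 5000, 10000, 11000]
thickness = [0, 13, 30, 32]

m, b = fit_line(years, thickness)   # m is in inches per year
years_per_inch = 1 / m              # time to add one inch of soil

print(f"thickness = {m:.5f} * years + {b:.3f}")
print(f"about {years_per_inch:.0f} years per inch of soil")
```

If a different starting thickness is used for time 0, the slope and the years-per-inch estimate will shift accordingly.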


Next steps

If you feel comfortable with this topic, you can go on to the assessment.

Or you can go back to the Linear Regression explanation page.