# How do I use probability to predict geologic events?

*Probability in the Earth sciences*

## An introduction to probability

What is the likelihood that there will be a major earthquake in my town during my life?

What are the chances that my home will be impacted by a category 5 hurricane this year?

To answer these types of questions, we need probability. We use probability to quantify the likelihood that an event will occur. This method is particularly useful for quantifying the likelihood of hazards such as earthquakes and floods, or for determining the likelihood of success when probing the Earth for natural resources. This module will introduce the mathematical procedures needed to calculate the probability of geoscientific events.

## Types of problems in this module

- Determine the **probability of occurrence.** This type of problem is used when you want to determine the probability that an event will occur, such as the probability of having a high-category hurricane in a given year, or the probability that a mine will yield valuable minerals. Based on the historical record, what's the probability of having a major flood?
- Make predictions about the **probability of occurrence over an interval.** This type of problem is used when you want to determine the likelihood that there will be a major earthquake over a 30-year period, or that at least one of 20 boreholes will strike oil. For example, what's the probability of having at least one category 5 hurricane in the next 10 years?
- Determine the **interval needed to achieve a certain probability.** This type of problem is used when you want to know what length of interval will yield a certain probability or risk level. For example, over what time period can you expect to have a 99% chance of experiencing a major drought? How many rocks do you need to analyze to have a 50% chance of finding a meteorite?

## How do I determine the **probability of occurrence** for geologic events from data?

What is the likelihood of a major earthquake, flood, or meteorite impact, or of finding an economic mineral deposit to fuel the energy transition? To make predictions about geological events such as earthquakes, storms, landslides, and finding valuable minerals, we must first use existing observations to find the probability of these events. To do so, we need a set of observations and an outcome of interest, which we define. The set of observations could be the magnitude of the largest earthquake each year in a region or the compositions of rocks recovered from mining boreholes. The outcome of interest could be having a damaging earthquake (magnitude $\geq$7) in a given year, or finding gold in a given borehole. We then divide the number of outcomes of interest by the total number of observations:

$\text{Probability of occurrence} = \frac{\text{number of outcomes of interest}}{\text{total number of observations}}$

The probability of the outcome of interest is thus the proportion of observations that result in the outcome of interest. For example, in the last 150 years, there have been 23 years with a magnitude $\geq$7 earthquake in California. Using the equation above, we can determine the probability of having a magnitude $\geq$7 earthquake in a given year using the following steps:

**Step 1.** Determine the type of problem.

**Step 2.** Determine the probability of occurrence of the outcome of interest.
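As a worked version of these two steps for the California example: the question asks about a single year, so it is a probability of occurrence problem, and the counts come directly from the record quoted above (23 years with a magnitude $\geq$7 earthquake out of 150 years of observations).

$P(\text{M }\geq\text{7 earthquake in one year}) = \frac{23}{150} \approx 0.15$

This gives roughly a 15% chance of a magnitude $\geq$7 earthquake in any given year, on the assumption that the 150-year record is representative of future years.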

*How to identify probability of occurrence problems: You can identify this type of problem by the fact that it asks about a single observation. For example, the probability of an earthquake in a single year, or the probability of finding gold in a single borehole.*

**A few notes about probability:** Probability values range from 0 to 1. We can express a probability in terms of percent, which is intuitive for most people. For example, a probability of 0.5 is equivalent to 50%. However, when we perform calculations that involve probability, it is important to use the standard probability value between 0 and 1. If we use percentages, the calculations will not work.

Probabilities are often expressed using the notation $P(\text{some specific event or outcome})$, which can be read, "the probability of some specific event or outcome." For example, $P(\text{M }\geq\text{7 earthquake in one year})$ can be read, "the probability of a magnitude $\geq$7 earthquake in one year."

## How do I make predictions about the **probability of occurrence over an interval**?

In the Earth sciences, we often want to determine the likelihood that an event will occur within some given time interval. For example, what are the chances of having a magnitude $\geq$7 earthquake in the next 30 years? We can use calculated probabilities, together with several mathematical techniques, to make these sorts of predictions accurately.

### Two important probability tools

To make these predictions, we will make use of two rules of probability: the complement rule and the exponent rule. These rules serve as tools that will help us answer the questions above.

The **complement rule** states that the probabilities of all possible outcomes must sum to 1. This means that for an outcome of interest:

$P(\text{non-occurrence outcome}) = 1 - P(\text{outcome of interest})$

For the earthquake example above, where the probability of a magnitude $\geq$7 earthquake in a given year is 0.15, the probability of not having a magnitude $\geq$7 earthquake is 1 - 0.15 or 0.85.

The **exponent rule** relates to the probability of multiple independent events of equal probability. Events are independent when one event occurring does not affect the probability of the others. We assume this holds for the Earth science problems we consider here. The probability of multiple independent events is the product of the probabilities of the individual events. Therefore, if each event has the same probability, the total probability equals that probability raised to the power of the number of events, $N$:

$P(\text{outcome in each of N observations}) = P(\text{outcome in one observation})^{N}$

For the earthquake example, we can use the exponent rule to find the probability of having magnitude $\geq$7 earthquakes in multiple years. For example, where the probability of a magnitude $\geq$7 earthquake in a given year is 0.15, the probability of having a magnitude $\geq$7 earthquake in each of the next 3 years would be $0.15^{3} \approx 0.0034$. If we wanted to compute the probability of having a large earthquake every year over a 10-year period, we would calculate $0.15^{10} \approx 5.8 \times 10^{-9}$.
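The two rules can also be checked numerically. The short Python sketch below is only an illustration: the function names `complement` and `all_occurrences` are made up for this example, and the code assumes the annual probability of 0.15 used in the text and independence between years.

```python
def complement(p):
    """Probability that an outcome does NOT occur, given probability p that it does."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1, not a percentage")
    return 1.0 - p


def all_occurrences(p, n):
    """Probability that an outcome occurs in every one of n independent observations."""
    return p ** n


p_quake = 0.15  # annual probability of an M >= 7 earthquake, taken from the text

print(complement(p_quake))           # no M >= 7 earthquake in a given year
print(all_occurrences(p_quake, 3))   # an M >= 7 earthquake in each of the next 3 years
print(all_occurrences(p_quake, 10))  # an M >= 7 earthquake in each of the next 10 years
```

The range check in `complement` is there because passing a percentage such as 15 instead of 0.15 is an easy mistake to make.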

With these two rules, let's move on to predicting geologic events.

### Calculating the probability of an occurrence over a defined interval

As Earth scientists, we are often interested in the probability of an event over some defined interval. This could be a time interval, a spatial interval, or simply a set of experimental observations. For instance, how likely is it that my house will flood during the course of my 30-year mortgage? How likely is it that a new, gloriously cheesy geoscience disaster movie will be released in the next 10 years? To solve these types of questions, we will work through an example.

**What are the chances of having a magnitude $\geq$7 earthquake in California in the next 30 years?**

We can use the probability rules introduced above to break these questions down in terms of the probability of the outcome of interest.

The series of steps below outlines how to solve this type of problem.

**Step 1.** Determine the type of problem.

**Step 2.** Determine the probability of occurrence of the outcome of interest.

**Step 3.** Determine the probability of a non-occurrence outcome using the complement rule.

**Step 4.** Determine the probability of non-occurrence over the interval using the exponent rule.

**Step 5.** Determine the probability of at least one occurrence over the interval using the complement rule.
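The steps above can be sketched in a few lines of Python for the 30-year California question. This is a minimal illustration, not a hazard model: it assumes the annual probability of 0.15 from the earlier example, treats each year as independent, and uses a hypothetical helper name, `prob_at_least_one`.

```python
def prob_at_least_one(p_single, n):
    """Probability of at least one occurrence in n independent observations."""
    p_none_single = 1.0 - p_single        # Step 3: complement rule
    p_none_interval = p_none_single ** n  # Step 4: exponent rule
    return 1.0 - p_none_interval          # Step 5: complement rule again


p_year = 0.15  # Step 2: annual probability of an M >= 7 earthquake, from the text
p_30 = prob_at_least_one(p_year, 30)
print(f"Probability of at least one M >= 7 earthquake in 30 years: {p_30:.3f}")
```

Under these assumptions the result is about 0.99, so at least one magnitude $\geq$7 earthquake in 30 years is very likely even though the chance in any single year is only 0.15.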

*How to identify probability of occurrence over an interval problems: These questions ask about the probability of an outcome of interest over a definite number of observations, such as the probability of observing at least one flood in an area over 10 years.*

## Calculating the **interval needed to achieve a certain probability**

It can also be useful to determine the interval it would take before a certain probability is reached. For instance, you might be interested in knowing how many mining claims you would need to establish to have an overall >50% probability of finding economic mineral deposits. Or perhaps you are a storm chaser and want to know how many days you should plan for your vacation to have a >50% probability of observing a tornado. Let's work through an example.

**Over how many years is there an overall >50% chance of experiencing a magnitude $\geq$7 earthquake in California?**

This type of problem is similar to finding the probability of an event over a definite interval, and we will again make use of the complement rule and the exponent rule. The main difference is that in this type of problem, we are given a probability (in this case, 50% or 0.5) and are solving for the length of the interval, which we denote using the variable $N$. To solve **interval needed to achieve a certain probability** problems, we use the exponent rule slightly differently:

$P(\text{non-occurrence outcome})^{N} = 1 - P(\geq\text{1 outcome of interest over interval N})$

From here, we can solve for $N$ by taking the logarithm of both sides.

The general equation we can use to solve for $N$ in this type of problem is

$N = \log_{P(\text{non-occurrence outcome})}(1 - P(\geq\text{1 outcome of interest over interval N}))$

To solve this type of problem, use the following steps:

**Step 1.** Determine the type of problem.

**Step 2.** Determine the probability of occurrence of the outcome of interest.

**Step 3.** Determine the probability of a non-occurrence outcome using the complement rule.

**Step 4.** Plug the probability of non-occurrence and the target probability into the logarithm equation we found above.

**Step 5.** Evaluate the logarithm to find N.
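Most calculators and programming languages do not offer a logarithm with an arbitrary base directly, so in practice the equation is evaluated with the change-of-base identity $\log_{b}(x) = \ln(x)/\ln(b)$. The Python sketch below applies it to the California question, assuming the annual probability of 0.15 from the earlier example; the function name `interval_for_probability` is hypothetical.

```python
import math


def interval_for_probability(p_single, target):
    """Real-valued N at which P(at least one occurrence in N observations) equals target."""
    p_none = 1.0 - p_single  # Step 3: complement rule
    # Steps 4 and 5: log base p_none of (1 - target), via change of base
    return math.log(1.0 - target) / math.log(p_none)


n_exact = interval_for_probability(0.15, 0.5)

# Years are counted in whole numbers, and the question asks for a probability
# strictly greater than 50%, so take the next whole year above n_exact.
n_years = math.floor(n_exact) + 1
print(round(n_exact, 2), n_years)
```

With these inputs the exact solution falls between 4 and 5 years, so 5 is the first whole number of years for which the chance of at least one magnitude $\geq$7 earthquake exceeds 50%.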

*How to identify interval needed to achieve a certain probability problems: These questions will ask for the interval over which there is a specified probability of observing at least one outcome of interest. Importantly, the length or size of the interval is what is being solved for; it is not given by the problem.*

## Earth sciences application: Flood probability and prediction

In this problem, we want to predict the likelihood of high magnitude flood events, which is important for preparing impacted communities. Flood magnitude is often measured in cubic feet per second (cfs), reflecting the river discharge or the volume of water that passes a point in a given period of time. The plot above shows the maximum (or peak) annual discharge for Tymochtee Creek, Ohio, from 1961 to 2023. You can see that only a few years had flood events that exceeded 7,000 cfs discharge, whereas many years had flood events that exceeded 2,000 cfs discharge. In general, larger floods are rare and smaller floods are relatively common.

Let's imagine that you are a planner assessing flood hazards for the community of Crawford, Ohio, located near Tymochtee Creek. You are tasked with determining some probabilities related to flood magnitudes to guide future development of the community. Assume that discharge over 7,000 cfs corresponds with damaging flooding in this area. The 10 largest peak annual discharges from the 63-year record are shown below.

**Your supervisor wants to know:**

**(A) What is the probability of having a flood event that is greater than 7,000 cfs in a given year?**

**Step 1.** Determine the type of problem.

**Step 2.** Determine the probability of occurrence.
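As a worked sketch of part (A): the question concerns a single year, so it is a probability of occurrence problem. The count of years above 7,000 cfs is not given in the text here, so the calculation below assumes that 3 of the 63 years exceeded that threshold. That count is an assumption, chosen because it is consistent with the 61% figure quoted in part (B).

$P(\text{flood >7,000 cfs in one year}) = \frac{3}{63} \approx 0.048$

Under that assumption, there is roughly a 5% chance of a damaging flood in any given year.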

**(B) What is the probability of observing at least one damaging flood (>7,000 cfs) over 10 years?**

We can break this problem into several steps.

**Step 1.** Determine the type of problem.

**Step 2.** Determine the probability of occurrence.

**Step 3.** Determine the probability of non-occurrence using the complement rule.

**Step 4.** Determine the probability of non-occurrence over an interval using the exponent rule.

There is a 61% chance that there will be no damaging flood on Tymochtee Creek in 10 years.

**Step 5.** Determine the probability of occurrence of at least one event using the complement rule.
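The five steps for part (B) can be sketched in Python as follows. The number of damaging-flood years is not stated in the text here, so the value of `damaging_years` is an assumption; it was chosen because it reproduces the 61% non-occurrence probability quoted in Step 4.

```python
years_in_record = 63  # length of the Tymochtee Creek record, from the text
damaging_years = 3    # ASSUMPTION: years above 7,000 cfs; not stated in the text

p_flood = damaging_years / years_in_record  # Step 2: probability of occurrence
p_no_flood = 1.0 - p_flood                  # Step 3: complement rule
p_no_flood_10 = p_no_flood ** 10            # Step 4: exponent rule over 10 years
p_at_least_one_10 = 1.0 - p_no_flood_10     # Step 5: complement rule

print(f"No damaging flood in 10 years:           {p_no_flood_10:.2f}")
print(f"At least one damaging flood in 10 years: {p_at_least_one_10:.2f}")
```

With the assumed count, the script gives about 0.61 for no damaging flood and about 0.39 for at least one damaging flood over 10 years.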

**(C) Over how many years does the probability of observing at least one damaging flood exceed $0.5$?**

**Step 1.** Determine the type of problem.

**Step 2.** Determine the probability of the event of interest.

**Step 3.** Determine the probability of non-occurrence using the complement rule.

**Step 4.** Plug the probability of non-occurrence and the target probability into the logarithm equation.

**Step 5.** Evaluate the logarithm to find N.
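One way to check an answer to part (C) is to compute $N$ from the logarithm and then confirm it by stepping through the years one at a time. The Python sketch below does both. It assumes that 3 of the 63 years in the record exceeded 7,000 cfs, a count that is not stated in the text here but is consistent with the 61% figure in part (B).

```python
import math

p_flood = 3 / 63            # ASSUMPTION: 3 damaging-flood years in the 63-year record
p_no_flood = 1.0 - p_flood  # Step 3: complement rule
target = 0.5                # probability we want to exceed

# Steps 4 and 5: N = log base p_no_flood of (1 - target), via change of base
n_exact = math.log(1.0 - target) / math.log(p_no_flood)

# Cross-check: add one year at a time until the cumulative probability
# of at least one damaging flood is greater than the target.
years = 0
p_none_so_far = 1.0
while 1.0 - p_none_so_far <= target:
    years += 1
    p_none_so_far *= p_no_flood

print(round(n_exact, 1), years)
```

Under this assumption the logarithm gives a little over 14 years, and the loop confirms that 15 is the first whole number of years for which the probability exceeds 0.5.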

## Where do you use probability in Earth science?

- Flooding, hurricane, and drought prediction
- Earthquake risk analysis
- Mineral exploration
- Geochronology
- Sea level rise scenarios

*Pages written by Emma MacKie (University of Florida) and Alex Tye (Utah Tech University).*