Learning to Learn from Data

Kim Kastens

published Feb 15, 2011

Scientists learn from data. Learning to learn from data is obviously an essential aspect of the education of a future scientist.

These days, however, many other kinds of people also learn from data--including business people, investors, education leaders, and people who care about pollution, disease, or the quality of their local schools. My daily newspaper is rich in data-based graphs and maps--and so is the newsletter from my local library. These days, learning to learn from data is a necessary part of everyone's education.

However, learning to learn from data is not a typical part of everyone's education. This post explores what might be required to construct a thorough learning progression for learning from Earth Science data, beginning where a good elementary school leaves off and carrying on through to what an upper level college course or adult job might demand.

A learning progression is a "hypothesized description of the successively more sophisticated ways [that] student thinking about an important domain of knowledge or practice develops as children learn about and investigate that domain over an appropriate span of time" (Corcoran et al., 2009). Neither I nor anyone else is ready to flesh out a full-fledged, research-supported learning progression about learning from data, but these thoughts from my December 2010 AGU talk pin down aspects of what are sometimes called the "lower anchor" and "upper anchor" of such a learning progression: the learning performances that exemplify the beginning and end of the progression.

In a good elementary or middle school science program, students have opportunities to get out in nature and collect data themselves about their local environment. Classic data collection opportunities for kids include making weather observations with hand-held instruments, or as shown below (left), making measurements in their local stream or estuary. By the time they get to college, however, students are expected to interpret professionally collected data, from the treasure trove of data freely provided via the Internet by data-rich agencies and universities.

[Figure: student-collected to professionally-collected data]

This distinction is important because when people collect data themselves, they have a chance to pick up a deeper understanding of the process by which this particular aspect of Nature was turned into numbers, and in particular what some limits might be on the validity of the data. In the process of making "first inscriptions," they can develop an embodied, holistic sense of the setting or environment from which the data were extracted, and then draw on this understanding when it comes time to interpret what earth processes caused their data set to be the way it is. When working from data collected by others, a sense of the data-acquisition process has to come indirectly from bloodless metadata and a theoretical understanding of the instrumentation used.

When kids collect data themselves, they collect a few dozen or maybe a few hundred data points. They can create an appropriate data display by hand, with pencil and paper. College-level manipulation of professionally-collected data can involve millions of data points, impossible to contemplate one by one. Data visualization software of one sort or another comes into play, and each data visualization tool comes with a learning curve. In my opinion, the hardest part is not learning to manipulate that software to make the appropriate display, but rather learning to see the display in terms of trends and processes rather than as dots, wiggles or blotches of color.

Kids' data interpretation activities tend to focus on one data set and one data type at a time. Data interpretation tasks faced by college students and adults frequently involve two or more data sets, which may be of varied types. Questions about the interactions between the aspects of reality represented by the two or more data sets call for a new set of skills, both logical and statistical, to sort through the range of possibilities: are A and B related? might A be causing or influencing B? might B be causing or influencing A? might some other C be causing or influencing both?
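One way to make the "might some other C be influencing both?" question concrete is a small sketch in Python. Everything here is an assumption made for illustration: the three series are synthetic, with A and B each built from C plus independent noise, and the helper names `pearson` and `partial_corr` are hypothetical. The sketch compares the plain correlation of A and B with their partial correlation once C is accounted for.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic data (an assumption, not real measurements):
# C drives both A and B; A and B have no direct link to each other.
rng = random.Random(0)
c = [rng.gauss(0, 1) for _ in range(500)]
a = [ci + rng.gauss(0, 0.5) for ci in c]
b = [ci + rng.gauss(0, 0.5) for ci in c]

r_ab = pearson(a, b)                   # strong, because both follow C
r_ab_given_c = partial_corr(a, b, c)   # close to zero once C is controlled for
print(f"r(A,B) = {r_ab:.2f}; r(A,B | C) = {r_ab_given_c:.2f}")
```

A strong correlation that largely vanishes when C is controlled for is a hint that C may lie behind both. It is only a hint: correlation statistics alone cannot say whether A influences B or B influences A, which is where the logical side of the skill set comes in.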

When kids are first taught about graphs, they are typically taught that graphs are useful for looking up stuff. This skill can be taught and learned in a cookbook fashion (below, left). Q: "What was the salinity of the Hudson River at the Beczak Station at noon on April 16?" A: Go across the horizontal axis until you find noon on April 16. Go up until you hit the data line. Go across until you hit the vertical axis. Read off the value: 7000 ppm. A harder-to-teach skill is to interpret the pattern or trend of a graph taken in its entirety (below, right).
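The contrast between the two skills can be sketched in a few lines of Python. The readings below are invented for illustration and are not Beczak Station data, and the function names `look_up` and `describe_pattern` are hypothetical. The first function mimics the cookbook look-up; the second takes a small step toward reading the graph as a whole.

```python
from bisect import bisect_left

# Hypothetical readings: (hours since midnight, salinity in ppm).
# These numbers are made up for illustration only.
readings = [(0, 5200), (6, 6100), (12, 7000), (18, 6400), (24, 5600)]


def look_up(readings, hour):
    """Cookbook look-up: find the time on the horizontal axis, go up to the
    data line (interpolating linearly between readings), read off the value."""
    hours = [h for h, _ in readings]
    if not hours[0] <= hour <= hours[-1]:
        raise ValueError("hour is outside the plotted range")
    i = bisect_left(hours, hour)
    if hours[i] == hour:
        return float(readings[i][1])
    (h0, v0), (h1, v1) = readings[i - 1], readings[i]
    return v0 + (v1 - v0) * (hour - h0) / (h1 - h0)


def describe_pattern(readings):
    """Whole-graph reading: when the extremes occur and how large the swing is."""
    low = min(readings, key=lambda r: r[1])
    high = max(readings, key=lambda r: r[1])
    return {"low_at": low[0], "high_at": high[0], "range": high[1] - low[1]}


print(look_up(readings, 12))
print(describe_pattern(readings))
```

The look-up reduces to a fixed recipe, which is why it can be taught in cookbook fashion. The summary of extremes is the easy part of the second skill; deciding what process (a tide, a storm, a dry spell) could have produced that shape still takes a person who knows the setting.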

In the kinds of data interpretation activities set up for kids, typically they can use the kinds of common-sense lines of reasoning that work for them in everyday life. In college or adulthood, the requisite lines of reasoning become more complicated, involving multi-step chains of reasoning, and drawing on temporal, spatial, quantitative, and systems thinking.

In summary then, we find that learning to learn from data at a sophisticated level is a complicated cognitive challenge, encompassing many sub-challenges.

[Figure: summary of learning progression elements]

In reflecting back on my own education, I don't know how this transformation was accomplished. I don't remember being explicitly taught any of this in an earth science course. It seems like I picked it up by osmosis, or maybe by trial and error--but that seems unlikely. How did you learn to learn from data?

See also: Kastens, K. A., and Turrin, M., 2010, Earth Science Puzzles: Making Meaning from Data: Washington, D.C., National Science Teachers Association, 186 p. Available from NSTA Press bookstore.

References:

Clement, J., 2002, Graphing, in Lehrer, R., and Schauble, L., eds., Investigating Real Data in the Classroom: New York, Teachers College Press, p. 63-73.

Corcoran, T., Mosher, F. A., and Rogat, A., 2009, Learning Progressions in Science: An Evidence-based Approach to Reform: Center on Continuous Improvement of Instruction, Teachers College.

Kastens, K., and Turrin, M., 2010, Geoscience data puzzles: Developing students' ability to make meaning from data, Abstract ED11C-04, in 2010 Fall Meeting, AGU, San Francisco, Calif., 13-17 December.

Wainwright, S., 2002, Shadows, in Lehrer, R., and Schauble, L., eds., Investigating Real Data in the Classroom: New York, Teachers College Press.

Learning to Learn from Data --Discussion

Martin Farley
Mar, 2011

I guess I'm not going to answer your question exactly, because I doubt my experience is relevant to the majority of students in the region where I now live and to many students nationally. I am the son of a scientist and went to school in an affluent suburb of Philadelphia which prided itself on its well-funded (and good) public schools. I now live in a rural area of North Carolina, which by many measures is the poorest in the state. The local public schools have never been known as outstanding and are still weak in science. It is common that the required high school earth science is taught by teachers who have never had even a single college-level geology class. My experience with science education majors and in-service teachers taking graduate classes in science ed is that their math and graphical skills are weak.

Here's what I'll say about my K-12 and later experience. I didn't collect lots of data myself especially before high school, but I did make lots of graphs. Then I went on making lots of graphs in college.

Learning from graphs is certainly very important (as you say, anyone can use this in life) and I have a series of exercises I use in my classes, especially gen ed classes that have "Let's graph some data!" as a theme. These range from simple to complex. My experience there is that a large fraction of my college students (dominantly from the regional public schools) have not done much of this before and most will deny ever having seen log-scale graph paper. This means we have big challenges to go in K-12 education.

Can I sum this up? Make graphs, keep making graphs, learn more about making graphs (I recommend W.S. Cleveland's Elements of Graphing Data), make more graphs.


Bill Prothero
Mar, 2011

This post was edited by Caroline Kralovec-Kirchherr on May, 2018
I learned how to reason from data by going to college, majoring in Physics, then graduate school in physics. The best lab class I had was an electricity and magnetism lab where I had to be especially careful to determine measurement errors and understand how my measuring equipment worked.

As I taught Earth Science at UCSB, I always wished I could get students involved with real Earth data and, finally, computer technology allowed me to do this. I also observe that for many kinds of data investigations, the collection of data seems to dominate, and the analysis and effort of relating it to a larger science picture is not given the attention it deserves.

For my introductory oceanography class at UCSB, I selected plate tectonics to develop my data exploration activities, and later expanded to the monsoon, climate, and global fisheries. I developed a collaboration with Greg Kelly, a science education prof in UCSB's graduate school of education. This collaboration was very useful in identifying problems students had in creating a science argument.

If you want to know more about the materials and ideas I developed for my oceanography class at UCSB, go to:

es.earthednet.org/
and
es.earthednet.org/node/32


Sandra Swenson
Mar, 2011

My personal experience in learning from data, I am pretty sure, was like most other students' experience. That is, we learned about data and graph interpretation in math class with examples using some variable vs. time (line graph) or chunks of information in a given quantity (bar graph). I do not remember looking at too many graphs or charts in science class. I had a similar experience with learning probability and statistics. That is, none of what I learned was relevant to science or research in science education. It was all about rolling dice and something about the probability of the color of socks a girl was going to pick next out of her drawer.

I am trying to develop curriculum for non-science majors using data maps and I have found that data maps are just as confounding for students to interpret as common graphs (which are not so common, unless you use them all of the time). Martin Farley (above comment) states clearly that unless students have practice - a lot of practice - they are not going to become proficient in interpreting data. Bill Prothero's website looks excellent and I look forward to using some of his ideas in my course.

According to Bloom, the interpretation of a communication (e.g. a data map) first begins with translation of the communication. Translation is the bridge between requisite knowledge and abstract ideas. My concern is that students' requisite knowledge may not be robust enough (or perhaps they have just plain forgotten) to communicate effectively about abstract interpretations. I think that every time a scientific visualization is presented for the first time in a class, instructors should not make the assumption that students understand the data sets or how these data sets are represented. In other words, don't assume that what was taught in math will translate to a usable skill in science.


E. Christa Farmer
Mar, 2011

This post was edited by Sarah Nakamoto on Nov, 2020
Thank you Kim, for exploring this topic! Helping students make this leap is so important in enabling students to become educated citizens, something that scientist/educators talk so much about... but not so much from this particular crucial angle.

I'm not so sure I'm particularly *good* at interpreting data, but the pivotal experiences that I can point to that have helped me get to where I am (other than a very good grounding in the basics during grade school and college) are a good REU program, being a teaching assistant, and making graph after graph after graph in graduate school.

I was actually an "Earth Systems-Biosphere" major with a strong interest in biogeochemistry and ecology as an undergraduate. We were required to do some kind of internship, and I was lucky enough to get help from my advisor so that I could participate in a Research Experience for Undergraduates program at the University of Colorado at Boulder in the summer between my junior and senior years. What was so valuable (and frustrating, at the time) about this experience was that I was basically turned loose with an idea and some resources and a very hands-off approach by my advisor. I not only collected my own data (measuring the emission of nitrous oxide from soils after they were wet with fresh and nitrate-rich "rainwater"), I sat down and taught myself Excel and tried to figure out what it meant. I think this was the first time I actually appreciated calculus and rates of change, although I wouldn't have been able to say that at the time!

When I went back to graduate school, I was talked into being a Teaching Assistant for a terrific undergraduate class, the V2100 "The Climate System" course at Columbia University (http://xtide.ldeo.columbia.edu/mpa/Clim-Wat/default.htm). It's my understanding that just before I helped with that course (in 2000 or so), several professors from Columbia had won an NSF grant and used it to improve the course. I don't think I had ever taken a course like it as an undergraduate: there were some lab sessions where the students performed experiments, and some lab sessions where students analyzed big existing datasets. I don't know for sure, and maybe you do since you taught a course in that series, but I think the course design explicitly tried to address the kinds of skills development that you discuss in your post. I do know that there was a careful progression of assignments that started with interpreting graphs, then making simple graphs, then interpreting rates of change or other causal mechanisms from those graphs. I learned so much from that experience of mindfully helping the students work their way through many of the same steps that I had taken myself through during that summer working on biogeochemistry: plotting points, fitting lines, estimating equations to fit them, and linking those equations to processes. I hope the students learned as much as I did!

Of course I probably learned the most about this process while researching and writing my doctoral dissertation! It's hard for me to break down the incremental steps of that process that contributed to my understanding of how to learn from data, although I can point to several data-rich classes (like Marc Spiegelman's "Myths and Methods in Modeling," Doug Martinson's statistics class, and Dave Walker's Field Methods class), and the act of (again) collecting and trying to interpret my own data. This leads me to wonder whether we can actually entrain this process into any kind of pre-programmed learning experience, or whether students need multiple strands in their education: all those canned math classes where it was sometimes hard to see the point, all the times I struggled to collect and analyze my own data, even all the teaching experiences where I have struggled to distill that process for students--I think all have contributed equally to whatever skills I have in this area. Can I reproduce all of these in a single introductory earth science course for non-majors? It's something to strive for, anyway.

