Is the Fourth Paradigm Really New?

Kim Kastens
published Oct 20, 2012

I have a long-standing interest in the use of data in education, so I've been reading with interest several articles and a book concerned with the so-called "Fourth Paradigm" of science, in which insights are wrested from vast troves of existing data. The Fourth Paradigm is envisioned as a new method of pushing forward the frontiers of knowledge, enabled by new technologies for gathering, manipulating, analyzing and displaying data. The term seems to have originated with Jim Gray, a Technical Fellow and visionary at Microsoft's eScience group, who was lost at sea in 2007. The first three paradigms, in this view, would be empirical observation and experimentation, analytical or theoretical approaches, and computational science or simulation. Earth and Environmental Sciences are well represented in the book, with essays on data-rich ecological science, ocean science, and space science.

I am finding these readings very stimulating and worthwhile. But I question whether this way of making meaning from the complexity of nature is really so new. I recently heard Walter Pitman talk about the early days of testing the hypothesis of seafloor spreading using magnetic anomalies. In 1965, Pitman and colleagues collected a magnetic anomaly profile on cruise Eltanin 19 that captured a near-perfect symmetrical set of normal and reversed magnetic anomalies across the East Pacific Rise (Pitman & Heirtzler, 1966). This so-called "magic profile" persuaded many skeptics of the plausibility of the seafloor-spreading hypothesis. Pitman told our group that within a year of the Eltanin 19 cruise, he and his collaborators had nailed the same pattern around the world's oceans. That expansion wasn't done by acquiring new data, but by mining the existing archive with newly insightful eyes: a Fourth Paradigm inquiry 45 years before the term was coined.

A few years after Pitman completed this work, Tanya Atwater undertook a masterful re-interpretation of the tectonic history of western North America. She used the seafloor spreading anomaly pattern of the northeast Pacific to reconstruct the sequence of events by which a corner of the Pacific Plate smashed into the edge of North America. The consequent change in relative plate motion along western North America shut down subduction-related vulcanism and compression and initiated strike-slip motion along the San Andreas Fault. She synthesized and combined insights from many scientists' geological mapping on land and geophysical surveying at sea: another "Fourth Paradigm" inquiry.

Fourth Paradigm science is especially powerful for young investigators. Both Pitman and Atwater were graduate students when they did the pioneering work described above. As students they had access to archives of data maintained by their institutions, Lamont Geological Observatory and Scripps Institution of Oceanography respectively. The geological maps that Atwater tapped into were freely published and available to all. There is no way they could have done their work in a hypothesis-driven data acquisition mode. Can you imagine the reaction of a program manager to a query like: "I have this really great idea that it should be possible to re-interpret the tectonic history of western North America by looking at the history of plate interactions recorded in the adjacent seafloor, and all I'll need is ten ship-years of geophysical survey time plus a hundred geologist-lifetimes of field mapping...." No way.

Atwater's and Pitman's synthesis work was possible only because the data archives already existed. The archives already existed because the data were not collected primarily for science. Technologies for marine magnetics and aeromagnetic surveying were invented for anti-submarine warfare, and the early data were collected for this purpose. The geological mapping of western North America was sponsored by the U.S. government for economic reasons, to assure "mineral resources to the future development and prosperity of the Nation."

The authors of the Fourth Paradigm book consider that the capability to do important basic science by exploring data archives is a consequence of recent advances in computer and sensor technology. I would suggest, though, that this paradigm comes into play whenever the rate of accumulation of data greatly outstrips the rate of accumulation of interpretation, i.e. the rate at which the scientific community can assimilate data into an interpretive framework. This imbalance can happen when the cost of accumulating data drops. But it can also happen when there are non-science drivers spurring the accumulation of data. Relative to some other disciplines, Earth Scientists are privileged to function in a data-rich environment, tapping into data troves that were collected for mineral and energy exploration, weather forecasting, agriculture, natural hazards mitigation, nuclear test ban verification, water and land management, and military purposes.

Stating that data-first science is not exactly new does not take away from its importance. On the contrary, my assertion that some of the most important insights of the plate tectonic revolution were Fourth Paradigm science actually reinforces the message that this mode of doing science can be immensely productive.

References & Sources:

Hey, T., Tansley, S., & Tolle, K. (Eds.). (2010). The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research. Available for download.

Review of the book in Science magazine: Collins, J. P. (2010). Sailing on an ocean of 0's and 1's. Science, 327, 1455-1456.

Pitman III, W. C., & Heirtzler, J. R. (1966). Magnetic Anomalies over the Pacific-Antarctic Ridge. Science, 154, 1164-1171.

Atwater, T. (1970). Implications of plate tectonics for the Cenozoic tectonic evolution of Western North America. Geological Society of America Bulletin, 81(12), 3513-3536.

My first thinking about the fact that geoscientists function in a data-rich environment, and that this shapes our scientific approach, was done in response to a request to talk at the National Geographic Society on 17 June 2012 on the topic of "Skills, Practices & Ways of Thinking (SPWOT's) in Geosciences," which in turn built on thinking done during the Synthesis of Research on Thinking & Learning in the Geosciences.

Walter Pitman spoke about the early days of plate tectonics on 11 September 2012 as a guest in an interdisciplinary seminar on "Making Meaning from Data" co-taught by Tim Shipley and myself as part of our work in the NSF program "Fostering Interdisciplinary Research in Education" grant number DRL 11-38616. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Is the Fourth Paradigm Really New? -- Discussion

Hi Kim, a very interesting essay! I think that the examples you've shown demonstrate the importance of geoscience as an observational, interpretive and historical science (as Bob Frodeman asserts, 1995). Our science is not necessarily always "hypothesis driven"--sometimes it's just a good idea to get "out there" and take a look around. Exploration, inquiry, and discovery are essential drivers of our science--whether walking across a new field area just looking around to see what's interesting or important, or putting a sample on the SEM to see what might be lurking beyond our immediate ability to see, there are strange and wonderful things to be discovered. Part of the equation is for the observer to have a "fertile mind" and be able to make connections across many lines of evidence. And, it also helps to have wide experience to be able to correlate and compare different parts of the Earth system. A big fear that I have is that a lot of the geological evidence lies in field notebooks and draft maps, which are hard to publish these days. Submission guidelines largely require that much of the foundational information (key outcrops, best sampling sites, etc.) be edited out of the articles that get published, so this information may well be lost. And, another big danger is the loss of research collections once researchers retire. When I started collecting my Archean rocks in Montana, Rb-Sr was the radiometric dating method of choice. But over the years, as science and technology have advanced, these same samples have been looked at again and again for U-Pb zircon (originally dissolving aggregates of whole grains, but now using SHRIMP or LA-ICPMS), Nd/Sm, Lu-Hf, Ar-Ar (on biotite, muscovite, hornblende), U-Pb on other accessory minerals such as titanite, monazite, and now U-Th/He dating on apatites--all potentially from the same sample!
So, as great as the immense digital databases may be (seismic, climate, etc.), we have a lot of data to "mine" from basic field observations, maps, and rock samples that you actually hold in your hand.



The geosciences, especially petroleum geoscience, have used "Data" and "Analysis" intensively since the 1970s. The early works of J.W. Harbaugh and D.F. Merriam (Computer Applications in Stratigraphic Analysis, 1968), J.C. Davis (Statistics and Data Analysis in Geology, 1973) and W. Schwarzacher (1975) laid the foundations. These works even made all the methods publicly accessible, in the spirit of 4th Paradigm Science.

Data-Intensive Scientific Discovery (DISD), or 4th Paradigm Science, is not just about using data, analysis, integration and discovery of new principles. It is about the way the 4 pillars are recognized, organized and scaled to become a "crowd sourced" scientific process. A classic example is the International Nucleotide Sequence Database Collaboration in Proteomics and Genomics.

The 4 pillars of DISD (4th Paradigm Science) are:
1) Database and data life cycle management, practiced universally by all data creators and contributors;
2) Scientific workflows, like Taverna, MyExperiment or Microsoft Azure, used across the community by scientists for consistent, correct and competent methodology, which ensures repeatability;
3) Analysis methods, like R and SciPy, where a continuously improving set of analytics and visualization tools is provided for the entire community; and
4) Presentation and exchange, like arXiv or PubMed, where most if not all scientific results are publicly accessible.

The geosciences, especially (well-funded) petroleum geoscience, are far from following these 4 pillars. The early advantage gained by the pioneers of the 1970s has been eroded and lost to the science. Even a data set as widely used as the "Geological Time Scale" is a 'closed' compilation activity, with its source data, methodology, workflow and results remaining "not open".

It is not about when we started on the right path of science; it is about how we are staying on the right path now.
It is not about what is accomplished with data; it is about how it is accomplished that determines 4th Paradigm Science.

