Initial Publication Date: August 16, 2018

"The Gentle Art of Data Science:" Increasing comfort with computation

Andrew Fischer, Institute for Marine and Antarctic Studies, University of Tasmania

The extraction and interpretation of information from vast quantities of data is now a fundamental part of advanced inquiry across disciplines ranging from particle physics and the digital humanities to environmental monitoring, health, and finance. Finding new ways to mine, manipulate, and interpret data in all these disciplines will become more important as students are inundated with an overwhelming excess of data and knowledge. Developing concepts and skills in computation and computational thinking will be important for students to gain insights and new knowledge from these data sets and to inspire innovation and change around the world.

One of the University of Tasmania's strategic research directions is the theme of Data, Knowledge and Decisions. This research theme emphasizes that students should become proficient at "collecting and analyzing huge data sets to build new models to improve community, environmental and economic outcomes, and provide a better foundation for policy and business practice." Computational skills play a crucial role in achieving this, and it is therefore critical for students to continue developing their computational skills from the primary years through high school and university, across all disciplines.

Despite this strategic research theme, there is little coordination in how these concepts are taught or in software instruction. Students are offered instruction in a variety of software tools (MATLAB, R, and Python) depending on discipline-specific problems and the individual expertise of the instructor. This causes considerable anxiety and confusion: students are often frustrated at having to learn the peculiarities of new software and, as a result, miss discipline-specific content. In response, the University is developing a class, "The Gentle Art of Data Science," to promote concepts of computational thinking (CT) and lay the foundation for programming and scripting skills.

This class builds on the elementary concepts of computational thinking, including:

  1. Decomposition - breaking down a complex problem or system into smaller, more manageable parts
  2. Pattern recognition - looking for similarities among and within problems
  3. Abstraction - focusing on the important information only, ignoring irrelevant detail
  4. Algorithms - developing a step-by-step solution to the problem, or the rules to follow to solve the problem
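
As a rough illustration of how these four concepts can appear together in even a very small script, consider the sketch below. It is not taken from the class materials: the station names, the temperature readings, and the function names are all hypothetical, and Python is used only for illustration (the class itself uses MATLAB).

```python
from statistics import mean

# Hypothetical data: (station name, temperature in degrees Celsius).
readings = [
    ("A", 12.0), ("A", 14.0),
    ("B", 10.0), ("B", 11.0), ("B", 12.0),
]


def group_by_station(rows):
    """Decomposition: handle the grouping step on its own."""
    groups = {}
    for station, temp in rows:
        groups.setdefault(station, []).append(temp)
    return groups


def summarise(values):
    """Abstraction: keep only the detail we care about (the mean)."""
    return mean(values)


def station_means(rows):
    """Algorithm: group the readings, then summarise each group."""
    # Pattern recognition: every station needs the same calculation,
    # so one expression handles them all.
    return {
        station: summarise(temps)
        for station, temps in group_by_station(rows).items()
    }


print(station_means(readings))  # {'A': 13.0, 'B': 11.0}
```

The same structure would carry over to a data set from any of the disciplines involved; only the grouping key and the summary function would change.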

Content draws on a variety of disciplines at the university, including education, maritime engineering, business, archeology, history, and marine science. MATLAB will be used as the primary tool and language for programming and scripting. The class does not teach programming, but will introduce concepts through MATLAB Live Script exercises. Students will develop their computational skills by working through a series of provided scripts and recipes, with further instruction and assessment building on these skills. Project-based assessment will be tailored to each student's discipline-specific abilities, and students will be matched with a discipline expert. The aim of this class is for students to gradually become comfortable with programming and computational concepts through applied examples. This should leave students better equipped to apply computation and computational thinking in other classes. Ideally, it will also lay a foundation of programming skills that eases their transition into other classes.

Downloadable version of this essay

MATLAB Essay (Microsoft Word 2007 (.docx) 15kB Aug16 18)