Prologue
Overview
Earth and planetary scientists, like all scientists, routinely engage in fundamental activities such as data fitting, regression analysis, and uncertainty estimation. These core tasks are essential for interpreting observations and drawing scientific conclusions, yet many learners and early-career researchers lack a clear, accessible introduction to data science methods. While these activities are common across scientific disciplines, the tools and best practices for performing them efficiently and reproducibly are not always obvious.
Traditional academic teaching often takes a bottom-up approach, leading students through extensive theory, mathematical derivations, and intimidating terminology that can be opaque to newcomers. Anecdotally, this challenge is especially pronounced in statistics courses, which often discourage newcomers who simply want to apply methods correctly but feel overwhelmed by the prospect of mastering the entire field. Many learners are motivated by practical needs and have neither the time nor the desire to become statisticians; they just want to use the tools effectively and responsibly.
Hence, this course takes a different path: we focus first on demonstrating the immediate usefulness of a method or tool by applying it directly to a problem you are likely to encounter. While this means we sometimes delay detailed explanations, the top-down philosophy is intended to show the value of the material right away and help you build intuition for data analysis—without being overwhelmed by theoretical foundations at the outset. We prioritize highlighting common pitfalls and offering practical guidance, rather than exhaustive derivations and lengthy theoretical discussions.
The course is therefore built around a clear teaching mission: to introduce key data science tools and concepts in the context of Earth and planetary sciences through hands-on practice. Working directly with data in interactive notebooks allows learners to experiment, visualize, and immediately see the effects of their analyses. This deepens understanding, builds intuition, and fosters critical thinking, turning abstract concepts into practical skills that can be applied confidently to real-world Earth and planetary science problems.
By following the course and using the accompanying Python tools, learners will:
Develop a solid foundation in general data science techniques such as regression, classification, and data visualization
Apply these methods directly to Earth and planetary datasets and problems
Build confidence in using Python and scientific computing libraries for their own projects
Gain skills that are broadly transferable across many scientific and engineering disciplines
Instructional materials
All instructional materials, including lectures, exercises, and examples, are organized within the notebooks directory. This folder contains a sequence of Jupyter notebooks designed to guide learners step-by-step through the course content. Each notebook is self-contained and interactive, allowing users to run code, visualize results, and explore concepts hands-on.
Python
This course assumes familiarity with Python programming only insofar as you want to repurpose the code snippets for your own projects. You can also skip the coding aspects and read the narrative to learn the data science concepts.
To keep the learning curve manageable and the focus on core data science concepts, the course relies primarily on widely used, standard Python packages such as NumPy, Pandas, Matplotlib, and SciPy rather than custom implementations. This approach ensures that learners gain practical skills with tools that are common in both academic and industry environments, making it easier to apply what they learn to their own Earth and planetary science projects.
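To give a flavor of what relying on standard packages looks like in practice, here is a minimal sketch of a straight-line fit with parameter uncertainties using only NumPy. It is not taken from the course notebooks: the data are synthetic, and the variable names and the chosen slope, intercept, and noise level are assumptions made purely for illustration.

```python
import numpy as np

# Synthetic data (hypothetical, for illustration only): a straight line plus Gaussian noise.
rng = np.random.default_rng(seed=0)
x = np.linspace(0.0, 10.0, 50)
true_slope, true_intercept = 2.0, 1.0
y = true_slope * x + true_intercept + rng.normal(scale=0.5, size=x.size)

# Least-squares straight-line fit; cov=True also returns the parameter covariance matrix.
coeffs, cov = np.polyfit(x, y, deg=1, cov=True)
slope, intercept = coeffs

# One-sigma uncertainties are the square roots of the covariance diagonal.
slope_err, intercept_err = np.sqrt(np.diag(cov))

print(f"slope     = {slope:.2f} +/- {slope_err:.2f}")
print(f"intercept = {intercept:.2f} +/- {intercept_err:.2f}")
```

The fit and its uncertainty estimate each take a single library call, which is the main reason for preferring standard packages over hand-written routines here.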