Spatially explicit surface water quality analysis in river networks: linking public water quality data to watersheds and network flowlines


This data expeditions module used three full course sessions to introduce undergraduate hydrology students with minimal programming background to:

  • Public water data (water quantity and chemistry)
  • Spatial analysis of water data
  • Two core spatial datasets produced by the USGS that enable spatial analysis
  • The programming language R
  • R-based tools for water data
  • Spatial analysis and maps in R

Faculty sponsor: Dr. Kateri Salk, Nicholas School
Graduate Student: Nicholas Bruns, 3rd-year PhD student, river ecology
Course: EOS.323, Landscape Hydrology for undergraduates (instructor: Salk)


Water science now has an extensive, standardized ecosystem for accessing and analyzing water data. The USGS has led this initiative through four actions:

  1. Publishing its own high-quality public data
  2. Joining data from other federal, state, local, tribal, and private-sector entities and publishing “harmonized” public datasets cast into USGS standards
  3. Developing R interfaces for public data
  4. Developing other R tools for working with its published data

Our module aimed to introduce students to this ecosystem. We also emphasized spatial analysis because many core problems in water science require a spatial perspective. For example, managers who want to improve water quality in a lake may aim to reduce the nutrients in the river entering that lake. These nutrients entered the waterway somewhere in the upstream watershed, for instance from an outdated septic system. Reducing the nutrients entering the lake therefore requires replacing that outdated septic system, which in turn requires finding where it is.
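The spatial reasoning in this example can be sketched in a few lines of code. The following is a minimal illustration, not the code used in the module: it is written in plain Python rather than R so that it runs without any data downloads, and the flowline IDs, the "drains into" table, and the nitrate values are all made up. It walks upstream from the reach entering the lake, keeps only the sampling sites that actually drain to it, and ranks them by concentration.

```python
from collections import defaultdict, deque

# Hypothetical toy network: each flowline ID maps to the ID of the flowline
# it drains into (None marks the most downstream reach of a basin).
DOWNSTREAM = {
    "A": "C",
    "B": "C",
    "C": "E",
    "D": "E",
    "E": None,   # reach entering the lake
    "X": "Y",    # a neighboring basin that does not drain to the lake
    "Y": None,
}

# Made-up nitrate samples (mg/L), keyed by the flowline each site sits on.
NITRATE_MG_L = {"A": 0.4, "B": 3.1, "D": 0.6, "X": 5.0}


def upstream_flowlines(downstream, outlet):
    """Return the outlet plus every flowline that eventually drains to it."""
    # Invert the "drains into" table so we can walk against the flow.
    tributaries = defaultdict(list)
    for reach, target in downstream.items():
        if target is not None:
            tributaries[target].append(reach)

    # Breadth-first walk upstream from the outlet.
    seen = {outlet}
    queue = deque([outlet])
    while queue:
        reach = queue.popleft()
        for trib in tributaries[reach]:
            if trib not in seen:
                seen.add(trib)
                queue.append(trib)
    return seen


def rank_upstream_sites(downstream, samples, outlet):
    """List (flowline, concentration) for upstream sites, highest first."""
    basin = upstream_flowlines(downstream, outlet)
    hits = [(reach, value) for reach, value in samples.items() if reach in basin]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for reach, value in rank_upstream_sites(DOWNSTREAM, NITRATE_MG_L, "E"):
        print(f"{reach}: {value} mg/L")
```

Note that site X has the highest concentration in the toy data but is excluded, because its flowline does not drain to the lake. That is the point of tying water quality samples to the network rather than looking at them as a plain table.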

Sessions 1 and 2 worked through RMarkdown files. Rather than using virtual machines, all students installed and developed an R and data environment on their own machines. Course sessions began with an overview lecture, proceeded with walking through and running code chunks together, and finished with activities to assess comprehension. We assessed comprehension by giving students questions that required them to modify and run existing code. We found that this “template-based”, or “cut-and-paste”, approach was successful, especially in our second session, as it allowed students with minimal previous coding experience to gain exposure to the potential of programming, public data, and programming-based spatial analysis.

Our intended approach, however, was designed for, and indeed required, in-person circulation by the instructors to troubleshoot installation differences. Therefore, when our third planned session occurred in the surprise remote conditions of spring 2020, we shifted from using R as planned to using a pre-built tool for interacting with water chemistry data. Specifically, we used a wonderful visualization dashboard (that actually began as an IID, Data+ project):

Our final sessions emphasized the “so what”: what would we want to do with data?