1 Introduction

Project Goal: To compare the preliminary SWOT L2_HR_Raster product with a widely adopted surface water map (Global Surface Water Occurrence – GSWO)

Background: The Surface Water and Ocean Topgraphy (SWOT) satellite launched in December of 2022, and will vastly enrich scientific knowledge of Earth’s water cycle. The Ka-band Radar Interferometer (KaRIn) on board sends radar pulses to Earth, and collects those pulses with two antennae. As the radar pulses return they reach each antenna with a different phase, because they traveled different distances (path lengths). Using the principles of physics and geometry underlying electromagnetic waves, NASA’s algorithms can measure the elevation and presence of surface water. SWOT will measure global inland surface water with high vertical (10cm) and horizontal (15cm) accuracy twice (on average) every 21 days.

Here’s a fantastic animation NASA made showing SWOT in action! You can see the orbital path, measurement footprint and radar pulses. (https://swot.jpl.nasa.gov/resources/142/swot-global-coverage/)

The vast majority of SWOT data is still not publicly available; however, NASA released a small preliminary beta version for scientists to familiarize themselves with the data set’s limitations and opportunities. The beta release is also a chance for users to build workflows and learn about the complex file structures. In this analysis, I’ll explore a preliminary scene from SWOT and compare it to a widely adopted surface water map (GSWO).

Area of Study: I selected a SWOT image from Oregon, which contains the Columbia River, the area around Portland/Vancouver and peaks Cascades peaks like Mt. Hood and Mt. Adams. Our SWOT image was taken in April 25th of 2023.

2 Overview of Steps/Workflow:

  1. Read the SWOT data in its ncdf format.

  2. Read the Global Surface Water Occurance dataset (GSWO). Clip, project, and resample GSWO to match SWOT’s extent, projection and resolution.

  3. Compare the SWOT and GSWO data sets.

  4. Modify the SWOT data set to better match the GSWO data.

3 Set-up libraries & read data

3.1 Set up libraries & directories

library(ncdf4)
library(RColorBrewer)
library(ggplot2)
library(dplyr)
library(terra)
library(raster)
library(leaflet)
library(sp)

setwd('/Users/jmaze/Documents/projects/GEOG590-SWOTR')

data_raw_path <- './data_raw/'

3.2 Read the SWOT data

# The SWOT path names contain rich metadata such as observation time, cycle #, UTM zone, etc. 
# After downloading, I changed the file name for simplicity. 
swot_path <- paste0(data_raw_path, 'SWOT1.nc')
swot_nc <- nc_open(swot_path)

# Extracts the horizontal and vertical extents
x <- ncvar_get(swot_nc, 'x')
y <- ncvar_get(swot_nc, 'y')

# Extracts the water fraction variable
wtr_area_frac_array <- ncvar_get(swot_nc, 'water_area')
# Extracts the water surface elevation variable
wse_array <- ncvar_get(swot_nc, 'wse')

# This function gets ncdf file's meta data, which is handy in later steps. 
swot_nc_atts <- ncatt_get(swot_nc, 0)
min_x <- swot_nc_atts$geospatial_lon_min
min_y <- swot_nc_atts$geospatial_lat_min
max_x <- swot_nc_atts$geospatial_lon_max
max_y <-swot_nc_atts$geospatial_lat_max

3.3 Read the global surface water occurrence dataset

gsw_occurance_path <- paste0(data_raw_path, "occurance.tif")
gsw_occurance <- rast(gsw_occurance_path)

4 Clip, Mask, Resample and Reproject the GSWO data to match the SWOT data

4.1 Select the GSWO data within SWOT’s bounding box

min_x <- swot_nc_atts$geospatial_lon_min
min_y <- swot_nc_atts$geospatial_lat_min
max_x <- swot_nc_atts$geospatial_lon_max
max_y <-swot_nc_atts$geospatial_lat_max

bbox <- c(xmin = min_x, xmax = max_x, ymin = min_y, ymax = max_y)

# Create a spatial extent object from the bounding box

bbox_extent <- extent(bbox)


# Clip the raster using the bounding box
gsw_occurance_bbox <- crop(gsw_occurance, bbox_extent)

# Clean up memory
#rm(gsw_occurance)

4.2 Mask GSWO using SWOT’s tile

# Let's clip the GSWO data to the SWOT image's footprint
# The SWOT data is projected to UTM10N; however, these bounds are in WSG:84 coordinates.
# Clipping the GSWO data before projecting, will reduce the computational load. 
swot_lf_lon <- swot_nc_atts$left_first_longitude
swot_lf_lat <- swot_nc_atts$left_first_latitude
swot_ll_lon <- swot_nc_atts$left_last_longitude
swot_ll_lat <- swot_nc_atts$left_last_latitude
swot_rf_lon <- swot_nc_atts$right_first_longitude
swot_rf_lat <- swot_nc_atts$right_first_latitude
swot_rl_lon <- swot_nc_atts$right_last_longitude
swot_rl_lat <- swot_nc_atts$right_last_latitude

coords <- matrix(c(swot_lf_lon, swot_lf_lat,  # left_first
                   swot_ll_lon, swot_ll_lat,  # left_last
                   swot_rl_lon, swot_rl_lat,  # right_last
                   swot_rf_lon, swot_rf_lat,  # right_first
                   swot_lf_lon, swot_lf_lat), # closing the polygon by repeating the first point
                 byrow = TRUE, ncol = 2)

poly <- vect(coords, type = "polygons")
crs(poly) <- "EPSG:4326"

gsw_occurance_tiled <- mask(gsw_occurance_bbox, poly, inverse = FALSE)

4.3 Reproject the GSWO data to match SWOT’s projection. Resample the GSWO bringing it from 30m resolution to SWOT’s 100m resolution.

swot_spatrast <- rast(swot_path)
## Warning: [rast] CRS do not match
crs_swot <- crs(swot_spatrast)

gsw_occurance_reproj <- project(gsw_occurance_tiled, crs_swot)
gsw_occurance_resampled <- resample(gsw_occurance_reproj, swot_spatrast, method = "bilinear")

#rm(gsw_occurance_tiled, gsw_occurance_reproj)

4.4 Does SWOT agree with the GSWO mask?

brown_to_blue <- colorRampPalette(c("sienna3", "lightblue4", "blue"))(100)

image(gsw_occurance_resampled, col = brown_to_blue, main = 'GSW occurance map')

image(wtr_area_frac_array, col = 'lightblue4', main = 'SWOT water fraction map')

The raw SWOT data is pretty noisy, and needs lots of cleaning to match the GSWO data! There’s lots of speckling, a large part of this problem is our choice of color scale. Also, there’s an bizarre line running diagonal through the SWOT image. This is because SWOT cannot measure at NADIR (directly under the Satellite path).

5 Modify the SWOT to fit the GSWO mask

5.1 What are the distribution of values for the wtr_frac_pixels?

In order to better visualize the SWOT data, we need to match our color scale with the pixel values. Let’s inspect the values for water_frac_pixels.

wtr_vector <- na.omit(as.vector(wtr_area_frac_array))
wtr_frac_df <- data.frame(value = wtr_vector)

# Quick histogram for the SWOT
pixel_histogram <- ggplot(data = wtr_frac_df,
                          mapping = aes(x = value)) +
  geom_histogram(bins = 500) +
  xlim(-10, 100000) +
  geom_vline(xintercept = 10000, color = 'red', linewidth = 1, linetype = "dashed") +
  theme_bw() +
  labs(title = "Distribution of SWOT water area pixels",
       y = "Count (log scale)",
       x = "Water pixel value (square meters)") +
  scale_y_log10()

(pixel_histogram)
## Warning: Removed 17217 rows containing non-finite values (`stat_bin()`).
## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 31 rows containing missing values (`geom_bar()`).

The maximum pixel values should not exceed 10,000 square meters (100m x 100m = 10,000 square meters), and they should also never be less than zero (i.e. negative water fraction). The red line on this plot marks the 10,000 square meters cutoff. According to SWOT’s documentation, erroneous pixels values are included for users to process at their discretion. NASA’s scientists are prioritizing data flexibility over data fidelity.

5.2 A crude fix for erroneous pixels

Let’s try eliminating the pixel values above 10,000 and below 100, by re-designating them as 10,000 and 0 respectively. This handles unrealistic values and should mitigate our speckling problem. By choosing a threshold of 100 (> 1% water area), the image should look less noisy.

# Make a copy of the swot_spatrast
swot_spatrast_copy <- swot_spatrast
# Apply the conditional replacements
swot_spatrast_copy[["water_area"]][swot_spatrast_copy[["water_area"]] < 100] <- 0
swot_spatrast_copy[["water_area"]][swot_spatrast_copy[["water_area"]] > 10000] <- 10000

#Convert this specific layer to a RasterLayer for Leaflet:
swot_rasterlayer_water_area <- raster(swot_spatrast_copy[["water_area"]])


image(swot_rasterlayer_water_area, col = brown_to_blue, main = 'SWOT water fraction map V2')

image(gsw_occurance_resampled, col = brown_to_blue, main = 'GSW occurance map')

This looks more realistic! You can see the pixels with the higher fraction of water area colored blue. Pixels with intermediate water fraction (>1%) are brown in the SWOT image. In GSWO, the land surface is brown and does not denote surface water.

5.3 Check out the region of interest using leaflet and open street map.

leaflet() %>%
  addTiles() %>%  # This adds the default OpenStreetMap basemap
  addRasterImage(swot_rasterlayer_water_area, colors = brown_to_blue, opacity = 0.7) %>%
  addLayersControl(overlayGroups = c("Raster Layer"), options = layersControlOptions(collapsed = FALSE))

6 Conclusion

Overall, there’s still considerable speckling and noise in the SWOT image, but we produced something realistic in a few short steps. We should contemplate why the SWOT image appears to have so much more water compared to OpenStreetMap (OSM) and GSW. Context is key! Think about the timeframes involved; GSWO and OSM are drawn from long-term averages. Meanwhile, SWOT images a single day in April, a wet month for the Pacific Northwest. I checked Portland’s weather on Apr. 25th 2023. Though it didn’t rain, you could expect considerable water on the landscape. With satellite data, we should recognize there’s never a “perfect” data set, and substantial processing decisions impact the final product. In my Master’s, I’m encounter processing challenges like this daily!