---
title: "Complete survey workflow: Example Bay"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Complete survey workflow: Example Bay}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4,
  eval = FALSE
)
```

## Overview

This vignette walks through a complete oystermapR workflow using a simulated survey of **Example Bay**, a fictional sheltered coastal inlet with realistic physical characteristics. The full pipeline is covered:

1. Loading acoustic Doppler current profiler (ADCP) data
2. Loading bathymetric soundings
3. Loading CTD water-quality data (temperature, salinity, chlorophyll_a)
4. Merging all sensor datasets onto a common grid
5. Running quality control
6. Predicting suitability for *Ostrea edulis* (European Flat Oyster)
7. Scoring wave exposure and HAB risk
8. Exporting a GeoTIFF for QGIS

The example data files (`example_bay_adcp.csv`, `example_bay_soundings.xyz`, and `example_bay_ctd.csv`) are included in the package and represent a simulated mid-summer survey.

```{r library}
library(oystermapR)
```

---

## Survey context

Example Bay is a sheltered inlet approximately 6 km × 4 km in extent, with depths ranging from 2 m at the shoreline to around 22 m in the central channel. A six-hour ADCP transect was run on a flood tide during June. Bathymetric soundings were collected concurrently using a single-beam echosounder.

The target species is *Ostrea edulis*, the European Flat Oyster, whose optimal conditions include depths of 1–20 m, current speeds of 0.05–0.4 m/s, temperatures of 5–25°C, and salinities of 25–35 PSU.

---

## Step 1: Load ADCP data

`read_nortek_adcp()` reads the Nortek Signature 500 merged CSV export. It auto-detects velocity bins, spatially averages ensembles onto a grid, and derives bed shear stress from the near-bed velocity profile.
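
One common way to average point measurements onto a lat/lon grid is to round each coordinate to a fixed number of decimal places and then aggregate within each rounded cell. The base R sketch below illustrates that idea on toy data; the `pings` data frame, its values, and the `digits` variable are assumptions made for illustration, and this is not the package's internal code.

```r
# Toy ensembles: positions and near-bed speeds (made-up values)
pings <- data.frame(
  lat   = c(50.1012, 50.1049, 50.1127, 50.1149),
  lon   = c(-4.2031, -4.2008, -4.2044, -4.1990),
  speed = c(0.10, 0.20, 0.30, 0.50)
)

digits <- 2  # hypothetical stand-in for a decimal-places setting

# Snap each ping to its grid cell by rounding the coordinates
pings$lat <- round(pings$lat, digits)
pings$lon <- round(pings$lon, digits)

# Average all pings that fall in the same rounded cell
cells <- aggregate(speed ~ lat + lon, data = pings, FUN = mean)
print(cells)
```

Rounding to two decimal places gives 0.01° steps, about 1.1 km in latitude; the east-west width of a cell shrinks with the cosine of the latitude.
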
The `spatial_res` argument controls the number of decimal places used for lat/lon grid binning; `2` gives approximately 1 km cells, suitable for a bay-scale survey.

```{r load-adcp}
adcp_file <- system.file("extdata", "example_bay_adcp.csv",
                         package = "oystermapR")

adcp <- read_nortek_adcp(
  file        = adcp_file,
  spatial_res = 2,
  verbose     = TRUE
)

head(adcp[, c("lat", "lon", "current_velocity", "shear_stress")])
```

The output contains one row per grid cell. `current_velocity` is the mean near-bed flow magnitude; `shear_stress` is the estimated bed shear stress in N/m², used by `predict_oyster()` for suspension-feeder scoring.

---

## Step 2: Load bathymetric soundings

`read_soundings_xyz()` reads a space-delimited XYZ point cloud, grids and averages the depths, and derives slope and rugosity via finite differences.

```{r load-bathy}
xyz_file <- system.file("extdata", "example_bay_soundings.xyz",
                        package = "oystermapR")

bathy <- read_soundings_xyz(
  file          = xyz_file,
  spatial_res   = 2,
  min_soundings = 5,
  verbose       = TRUE
)

head(bathy[, c("lat", "lon", "depth", "slope", "roughness")])
```

`slope` is the maximum downslope gradient in degrees at each grid cell; values above ~15° typically exclude oyster settlement in the scoring model. `roughness` is a dimensionless rugosity index (1.0 = flat; higher = more complex substrate).

---

## Step 3: Load CTD water-quality data

The ADCP and echosounder provide hydrodynamic and bathymetric variables, but `predict_oyster()` can also use water-quality data when available. The Example Bay survey included a 5 × 4 grid of CTD casts recording temperature, salinity, and chlorophyll_a, stored as a plain CSV.

`read_generic_csv()` handles any tabular sensor export with flexible column matching. Column names are recognised automatically when they follow standard conventions (`lat`/`lon`, `temperature`, `salinity`, `chlorophyll_a`).
For instruments that export non-standard headers, supply an explicit `col_map`:

```r
# Example with non-standard headers from a YSI EXO sonde export:
ctd <- read_generic_csv(
  "ysi_export.csv",
  col_map = c(lat = "GPS_Lat", lon = "GPS_Lon",
              temperature = "Temp_C", salinity = "Sal_PSU")
)
```

For Example Bay the column names match the oystermapR conventions directly:

```{r load-ctd}
ctd_file <- system.file("extdata", "example_bay_ctd.csv",
                        package = "oystermapR")

ctd <- read_generic_csv(
  file        = ctd_file,
  spatial_res = 2,
  verbose     = TRUE
)

head(ctd[, c("lat", "lon", "temperature", "salinity", "chlorophyll_a")])
```

Spatially averaging at `spatial_res = 2` collapses the four replicate casts per station to a single value per ~1 km grid cell.

---

## Step 4: Merge all sensor datasets

`merge_sensor_data()` performs a full outer join of any number of sensor dataframes on rounded lat/lon grid keys. Cells present in only one source are retained with `NA` for the missing variables, so no data is discarded. All three layers (ADCP, bathymetry, and CTD) are combined in a single call:

```{r merge}
survey <- merge_sensor_data(adcp = adcp, bathy = bathy, ctd = ctd)

cat("Merged survey:", nrow(survey), "grid cells\n")
cat("Columns:", paste(names(survey), collapse = ", "), "\n")
```

A `substrate_hardness` column would typically come from a sidescan mosaic processed via `read_sonar_tif()`. Here we instead fill it with simulated values representative of a mixed shell-gravel substrate, typical of productive inshore oyster ground.

```{r add-substrate}
set.seed(101)
n <- nrow(survey)

# 0 = soft mud, 1 = hard rock; 0.3--0.7 = shell/gravel mix
survey$substrate_hardness <- round(runif(n, 0.30, 0.70), 2)
```

---

## Step 5: Quality control

`qc_survey_data()` applies three complementary checks to every numeric column:

- **Range check**: values outside physically plausible bounds for that variable (species-agnostic; e.g. salinity above 42 PSU) are flagged `"range_fail"`.
- **Statistical outlier**: values beyond `iqr_k` × IQR from the median are flagged `"outlier"` (default k = 3, Tukey's outer fence).
- **Cross-variable sanity**: physically implausible combinations (e.g. dissolved oxygen above 20 mg/L with temperature below 5°C) are flagged.

Setting `apply_flags = TRUE` replaces flagged values with `NA` so they are silently skipped downstream rather than causing erroneous scores.

```{r qc}
survey_clean <- qc_survey_data(
  df          = survey,
  apply_flags = TRUE,
  verbose     = TRUE
)

# Count any flags raised across all columns
flag_cols <- grep("^qc_flag_", names(survey_clean), value = TRUE)
n_flagged <- sum(sapply(survey_clean[flag_cols],
                        function(x) sum(!is.na(x) & x != "pass")))
cat("Total flagged values replaced with NA:", n_flagged, "\n")
```

The QC step is non-destructive by default (`apply_flags = FALSE`): it adds `qc_flag_` columns so you can inspect which cells were problematic before deciding whether to exclude them.

---

## Step 6: Predict suitability

`predict_oyster()` applies AHP-weighted scoring rules to each available variable, combines them into a suitability score in [0, 1], and classifies locations as High / Moderate / Low / Very Low / Excluded.

```{r predict, warning = FALSE}
result <- predict_oyster(
  data    = survey_clean,
  species = "ostrea_edulis",
  verbose = TRUE
)

# Summary of suitability classes
table(result$suitability_class)
```

```{r suitability-summary}
# Mean score and range
cat(sprintf(
  "Suitability: mean = %.2f, range = %.2f -- %.2f\n",
  mean(result$suitability, na.rm = TRUE),
  min(result$suitability, na.rm = TRUE),
  max(result$suitability, na.rm = TRUE)
))
```

The result dataframe retains all input columns plus `suitability`, `suitability_class`, and per-variable component scores (`score_depth`, `score_current_velocity`, etc.).

---

## Step 7: Risk and disturbance scoring

oystermapR includes optional risk modules that can be appended to the result.
Here we add wave exposure (derived from the current speed data and fetch geometry) and a simple HAB risk score.

```{r risk, warning = FALSE}
# Wave exposure: uses current_velocity and depth as proxies for fetch exposure
result <- score_wave_exposure(result, verbose = FALSE)

# HAB risk: without live ICES data, scores from chlorophyll_a alone
result <- score_hab_risk(result, verbose = FALSE)

cat("Wave exposure range:", round(range(result$wave_exposure, na.rm = TRUE), 3), "\n")
cat("HAB risk range:     ", round(range(result$hab_risk, na.rm = TRUE), 3), "\n")
```

---

## Step 8: Export GeoTIFF for QGIS

`export_geotiff()` interpolates the suitability scores onto a regular raster and writes a GeoTIFF. `export_qml_style()` writes a matching QGIS colour-ramp style file (`.qml`) so the layer renders immediately with the standard oystermapR colour scheme (red = excluded, green = high suitability).

```{r export, eval = FALSE}
# Write GeoTIFF and companion QGIS style file
export_geotiff(
  df         = result,
  path       = "example_bay_suitability.tif",
  resolution = 0.001,
  contours   = TRUE
)

export_qml_style("example_bay_suitability.tif")
```

Load `example_bay_suitability.tif` into QGIS via **Layer → Add Layer → Add Raster Layer**, then right-click the layer and choose **Load Layer Style** to apply the `.qml` file.

---

## Inspecting component scores

The result dataframe includes a `score_` column for every variable that was scored. Comparing these helps diagnose which environmental factor is the main limiting constraint at a site.

```{r component-scores}
score_cols <- grep("^score_", names(result), value = TRUE)

# Mean component score per variable (higher = more suitable)
col_means <- sort(colMeans(result[score_cols], na.rm = TRUE))
print(round(col_means, 3))
```

Variables scoring consistently below 0.5 are the main limiting factors for *Ostrea edulis* at this site. Scores near 1.0 indicate that the variable is not constraining growth.

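
The same comparison can be made cell by cell rather than as a site-wide mean, which shows where each constraint applies. The sketch below is a minimal base R illustration on a toy data frame: the `toy` object and its score values are made up, and only the `score_` column prefix is taken from the description above.

```r
# Toy stand-in for `result`: three grid cells with made-up component scores
toy <- data.frame(
  lat                    = c(50.10, 50.11, 50.12),
  lon                    = c(-4.20, -4.20, -4.20),
  score_depth            = c(0.9, 0.4, 0.8),
  score_current_velocity = c(0.6, 0.7, NA),
  score_salinity         = c(0.8, 0.9, 0.3)
)

score_cols <- grep("^score_", names(toy), value = TRUE)
scores <- as.matrix(toy[score_cols])

# For each cell, locate the lowest component score; which.min() skips NAs.
# Assumes every cell has at least one non-missing score.
limiting_idx <- apply(scores, 1, which.min)

toy$limiting_factor <- sub("^score_", "", score_cols[limiting_idx])
toy$limiting_score  <- apply(scores, 1, min, na.rm = TRUE)

print(toy[, c("lat", "lon", "limiting_factor", "limiting_score")])
```

Swapping `toy` for the real `result` dataframe should work unchanged as long as its component columns follow the `score_` naming shown earlier.
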

---

## Next steps

- **Validation**: if you have historical presence/absence records for the site, pass them to `validate_against_records()` to calculate AUC, TSS, and F1 against model predictions.
- **Bayesian tolerance updating**: `update_species_tolerances()` can refine the *O. edulis* scoring parameters from field observations at this site.
- **Multi-species comparison**: run the same workflow for `"magallana_gigas"` and call `compare_species()` to see which species is better suited to each grid cell.
- **Live data**: `fetch_live_environmental_data()` can replace the simulated temperature and salinity with real-time CMEMS model output or ICES observational data; see `?oystermapR_live_config` for credential setup.

---

## Session info

```{r session-info}
sessionInfo()
```