Meteorological Lake Comparison

Table of Contents

  • Setup (once per machine)
  • Changes for a new lake: scripts
  • Changes for a new lake: QMD + YAML
  • Step-by-step: new lake run
  • Example: Lake Tutera (template)
  • Smaller analysis: two datasets only (ERA5 + VCS_On)
  • Smaller analysis: one variable only

Reproducibility / How to reuse

Use this guide to rerun the Rotorua workflow or adapt it to a new lake. The project flow is:

  1. Prepare raw data into daily, standardized CSVs (scripts/02_prepare_raw_data.R).
  2. Load the processed files through the helper scripts (scripts/03_analysis_helpers.R and scripts/04_analysis_plotting.R).
  3. Source the helper scripts in the QMDs, compute metrics, and render plots.
  4. Render the site with Quarto.

Alternatively, run scripts/05_analysis_rotorua_plots.R to print the plots without rendering the QMDs.

This page separates required changes to scripts (data prep and helper code) from changes to QMDs/YAML (report configuration).

Setup (once per machine)

  1. Install R (>= 4.4) and Quarto.
  2. From the project root, run scripts/00_project_setup.R to restore renv and required GitHub packages.
  3. If renv::restore() fails because the configured repository is unreachable (for example, on a restricted network), set a reachable CRAN mirror with options(repos = ...) and try again.
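
The mirror fallback in step 3 might look like the sketch below. The URL is the general-purpose CRAN cloud mirror and is only an example; substitute whichever mirror is reachable from your network. The interactive() guard is an assumption added here so the restore is not triggered by accident in batch runs.

```r
# Point R at a reachable CRAN mirror (example URL; replace as needed).
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Retry the restore, but only in an interactive session with renv available.
if (interactive() && requireNamespace("renv", quietly = TRUE)) {
  renv::restore()
}
```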

Changes for a new lake: scripts

Make these edits in scripts/ to reuse the pipeline for another lake. They affect data creation and helper defaults.

scripts/02_prepare_raw_data.R

  • Update the raw file paths under data/raw/ to your new lake folders and filenames.
  • Update station IDs in file paths and any column name mappings that differ.
  • Keep the UTC conversion and daily aggregation rules, using means for state variables (Temp_C, Wind_Spd_ms, RadSWD_Wm2) and sums for flux variables (Precip_mm).
  • Keep or update the unit conversions as needed: wind from knots to m/s, sub-daily precipitation rates to daily totals, and radiation from MJ/m^2 to W/m^2.
  • Update any bad-data filtering to match your lake (e.g., remove known corrupt periods).
  • Update the output filenames at the end so they match your new lake ID (for example: data/processed/<lake_id>_era5_daily.csv, data/processed/<lake_id>_vcs_on_daily.csv, plus station/buoy files as applicable).
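
The aggregation rule and two of the unit conversions listed above can be sketched in base R as follows. This is an illustration, not the code in scripts/02_prepare_raw_data.R: the function names and the datetime_utc column are hypothetical, precipitation is assumed to arrive as sub-daily amounts in mm, and the radiation conversion assumes the MJ/m^2 values are daily totals.

```r
# Hypothetical helpers; not taken from scripts/02_prepare_raw_data.R.
knots_to_ms   <- function(kt) kt * 0.514444      # 1 knot = 0.514444 m/s
mj_day_to_wm2 <- function(mj) mj * 1e6 / 86400   # daily total MJ/m^2 -> mean W/m^2

# Aggregate sub-daily records to daily values.
# `df` is assumed to have a POSIXct column `datetime_utc` (in UTC) plus any
# of the standard variable columns.
aggregate_daily <- function(df) {
  df$Date <- as.Date(df$datetime_utc, tz = "UTC")
  mean_vars <- intersect(c("Temp_C", "Wind_Spd_ms", "RadSWD_Wm2"), names(df))
  sum_vars  <- intersect("Precip_mm", names(df))

  pieces <- list()
  if (length(mean_vars) > 0) {
    # Daily means for temperature, wind speed, and shortwave radiation.
    pieces$means <- aggregate(df[mean_vars], by = list(Date = df$Date),
                              FUN = mean, na.rm = TRUE)
  }
  if (length(sum_vars) > 0) {
    # Daily totals for precipitation.
    pieces$sums <- aggregate(df[sum_vars], by = list(Date = df$Date),
                             FUN = sum, na.rm = TRUE)
  }
  # Join the mean and sum tables on Date.
  Reduce(function(a, b) merge(a, b, by = "Date"), pieces)
}
```

Note that with na.rm = TRUE a day with no valid precipitation records sums to 0 rather than NA, so filter known bad periods before aggregating.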

scripts/03_analysis_helpers.R

  • This file currently reads Rotorua processed data when sourced.
  • If you are not keeping Rotorua files, update the file paths to your new lake or remove/guard those read lines so sourcing does not error.
  • The metrics functions (metrics_vs_ref) are reusable and do not need changes unless you change variable names.
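
One way to guard the read lines is a small wrapper that returns NULL when a file is missing, so sourcing the script does not stop with an error. The helper name read_if_exists is hypothetical and is not part of the project; the example path follows the <lake_id>_era5_daily.csv naming pattern.

```r
# Hypothetical helper: read a processed CSV only if it exists.
read_if_exists <- function(path) {
  if (!file.exists(path)) {
    message("Skipping missing file: ", path)
    return(NULL)
  }
  df <- read.csv(path, stringsAsFactors = FALSE)
  # Parse the Date column when present so downstream code gets Date objects.
  if ("Date" %in% names(df)) df$Date <- as.Date(df$Date)
  df
}

# Example call; yields NULL (with a message) if the file is not there yet.
era5 <- read_if_exists("data/processed/tutera_era5_daily.csv")
```

Code that uses the result then needs to handle NULL, for example by checking !is.null(era5) before computing metrics.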

scripts/04_analysis_plotting.R

  • This file also reads Rotorua processed data when sourced.
  • If you are not keeping Rotorua files, update those read paths or guard them.
  • Update ref_df, targets_list, and target_colors defaults if your reference/targets have different names.
  • Plot functions are reusable. They assume column names Date, Temp_C, Precip_mm, Wind_Spd_ms, RadSWD_Wm2.
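
Because the plot functions rely on fixed column names, it can help to check a new dataset before plotting. The helper below is a hypothetical sketch, not part of scripts/04_analysis_plotting.R; it simply reports which expected columns are absent.

```r
# Hypothetical check: list the expected columns that a data frame lacks.
missing_columns <- function(df,
                            required = c("Date", "Temp_C", "Precip_mm",
                                         "Wind_Spd_ms", "RadSWD_Wm2")) {
  setdiff(required, names(df))
}

# Example: a dataset that only carries temperature.
example_df <- data.frame(Date = as.Date("2024-01-01"), Temp_C = 12.5)
missing_columns(example_df)
```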

scripts/05_analysis_rotorua_plots.R

  • Rotorua-specific example. Update it only if you want a standalone script for a new lake. The QMDs do not depend on this file.

Changes for a new lake: QMD + YAML

These edits control the rendered report. They are required in addition to the script changes above.

params.yml (optional single source of truth)

  • This file holds a params: block that the QMDs use if you remove their local params blocks.
  • If you keep the local params inside each QMD (the current setup), update them in each QMD instead.

QMD front matter

Update these in each of: index.qmd, Overview.qmd, Results.qmd, metrics_stats.qmd, Behaviour.qmd, rotorua.qmd:

  • lake_name, lat, lon, buffer_km, lake_id
  • reference and targets
  • vars and thresholds (wet_threshold_mm, precip_event_threshold, windy_top_pct, windy_threshold_ms, window_days)
  • In Overview.qmd, update any station/point coordinates and labels used for the map.

QMD file_map blocks

Each of the QMDs listed above contains a file_map <- list(...) block. Update those paths to your new processed files.
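
As a sketch, a file_map for a new lake could be built from the lake ID so that every path follows the same naming pattern. The assumption here, not confirmed by the existing QMDs, is that the list is named by dataset and holds one path per dataset; the "tutera" ID and the two dataset names come from the template example on this page.

```r
# Assumed structure: one named entry per dataset, each holding a file path.
lake_id <- "tutera"

file_map <- list(
  ERA5   = file.path("data", "processed", paste0(lake_id, "_era5_daily.csv")),
  VCS_On = file.path("data", "processed", paste0(lake_id, "_vcs_on_daily.csv"))
)

# Report any entries whose file has not been created yet.
missing_files <- names(file_map)[!file.exists(unlist(file_map))]
if (length(missing_files) > 0) {
  message("Processed files not found for: ", paste(missing_files, collapse = ", "))
}
```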

QMD plot calls

Some plots are hard-coded to specific variables (Temp, Wind, Precip). If you remove variables or change the reference name:

  • Wrap plot calls with checks like if ("Temp_C" %in% vars) ... to avoid errors.
  • Pass ref_name = params$reference into plot functions if your reference is not Airport_1770.
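
The two points above can be combined as in the sketch below. The function plot_temp_series is a hypothetical placeholder defined here only so the example runs; the real plot functions live in scripts/04_analysis_plotting.R and may take different arguments. The params list stands in for the values supplied by the QMD params block.

```r
# Stand-ins for the values that come from the QMD params block.
params <- list(reference = "VCS_On", vars = c("Wind_Spd_ms"))
vars <- params$vars

# Hypothetical placeholder for a project plot function.
plot_temp_series <- function(ref_name) {
  paste("Temp_C vs", ref_name)
}

# Only build the temperature plot when Temp_C is among the selected variables,
# and pass the configured reference name instead of a hard-coded one.
temp_plot <- if ("Temp_C" %in% vars) {
  plot_temp_series(ref_name = params$reference)
} else {
  NULL
}
```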

_quarto.yml

  • Add or remove QMD files in the project.render list if you change page names.
  • The site outputs to docs/ by default.

Step-by-step: new lake run

  1. Place raw files under data/raw/<your_lake>/....
  2. Edit scripts/02_prepare_raw_data.R for your new lake and run it to create data/processed/<lake_id>_*.csv.
  3. Update the QMD params and file_map blocks to point at your new processed files.
  4. Render the site: quarto render.

Example: Lake Tutera (template)

Use this as a starting point for naming and params:

lake_name: "Lake Tutera"
lat: -39.134088
lon: 176.552553
buffer_km: 10
lake_id: "tutera"
reference: "Airport_XXXX"
targets: ["ERA5", "VCS_On"]
vars: ["Temp_C", "Precip_mm", "Wind_Spd_ms", "RadSWD_Wm2"]
wet_threshold_mm: 1
precip_event_threshold: 1
windy_top_pct: 0.10
windy_threshold_ms: 10
window_days: 30

Example processed filenames:

  • data/processed/tutera_era5_daily.csv
  • data/processed/tutera_vcs_on_daily.csv
  • data/processed/tutera_airport_XXXX_daily.csv

Smaller analysis: two datasets only (ERA5 + VCS_On)

  1. Set reference to the dataset you want as the benchmark (e.g., VCS_On).
  2. Set targets: ["ERA5"].
  3. In each QMD file_map, keep only the entries for the reference and the target.
  4. Update any plot calls or titles that still say Airport_1770 to use params$reference.

This will produce the same metrics, plots, and diagnostics, but only for the chosen pair.
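
If file_map is a list named by dataset (an assumption about its structure), trimming it to the chosen pair can be done by name rather than by deleting lines by hand. The paths and the Airport_XXXX entry below reuse the placeholder names from the template example.

```r
# Chosen benchmark and comparison dataset.
reference <- "VCS_On"
targets   <- c("ERA5")

# Assumed full mapping, named by dataset.
file_map <- list(
  ERA5         = "data/processed/tutera_era5_daily.csv",
  VCS_On       = "data/processed/tutera_vcs_on_daily.csv",
  Airport_XXXX = "data/processed/tutera_airport_XXXX_daily.csv"
)

# Keep only the reference and target entries, reference first.
file_map <- file_map[c(reference, targets)]
```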

Smaller analysis: one variable only

  1. Set vars to a single variable (e.g., ["Wind_Spd_ms"]).
  2. Ensure that variable exists in every processed dataset.
  3. In Results.qmd, Behaviour.qmd, and metrics_stats.qmd, remove or guard plot chunks that reference other variables.

This keeps the workflow identical but limits output to one variable.
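
Step 2 can be checked with a short helper like the one below. It is a hypothetical sketch that assumes the processed datasets have already been read into a named list of data frames; the small data frames here are made-up placeholders for illustration only.

```r
# Hypothetical check: which datasets lack a given variable column?
datasets_missing_var <- function(datasets, var) {
  has_var <- vapply(datasets, function(df) var %in% names(df), logical(1))
  names(datasets)[!has_var]
}

# Placeholder data frames standing in for the processed datasets.
datasets <- list(
  ERA5   = data.frame(Date = as.Date("2024-01-01"), Wind_Spd_ms = 3.2),
  VCS_On = data.frame(Date = as.Date("2024-01-01"), Temp_C = 14.1)
)

# Names of datasets that do not contain the chosen variable.
datasets_missing_var(datasets, "Wind_Spd_ms")
```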