---
title: "Fetching ERVISS Data"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Fetching ERVISS Data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

The `ervissexplore` package provides easy access to all data published in the [EU-ECDC Respiratory Viruses Weekly Data](https://github.com/EU-ECDC/Respiratory_viruses_weekly_data) repository. Data is returned as `data.table` objects, ready for your own analysis.

```{r setup}
library(ervissexplore)
```

## Available data sources

The package supports **7 data sources** from ERVISS:

| Type | Function | CSV file | Description |
|------|----------|----------|-------------|
| `"positivity"` | `get_sentineltests_positivity()` | `sentinelTestsDetectionsPositivity.csv` | Sentinel test positivity rates by pathogen |
| `"variants"` | `get_erviss_variants()` | `variants.csv` | SARS-CoV-2 variant proportions |
| `"ili_ari_rates"` | `get_ili_ari_rates()` | `ILIARIRates.csv` | ILI/ARI consultation rates by age group |
| `"sari_rates"` | `get_sari_rates()` | `SARIRates.csv` | SARI rates by age group |
| `"sari_positivity"` | `get_sari_positivity()` | `SARITestsDetectionsPositivity.csv` | SARI virological data (tests, detections, positivity) |
| `"nonsentinel_severity"` | `get_nonsentinel_severity()` | `nonSentinelSeverity.csv` | Non-sentinel severity (deaths, hospitalisations, ICU) |
| `"nonsentinel_tests"` | `get_nonsentinel_tests()` | `nonSentinelTestsDetections.csv` | Non-sentinel tests and detections |

Each source has a dedicated `get_*()` function, or you can use the generic `get_erviss_data(type = ...)` function.

## Sentinel test positivity

Positivity rates for respiratory pathogens from sentinel surveillance.
```{r positivity, eval=FALSE}
data <- get_sentineltests_positivity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  pathogen = "SARS-CoV-2",
  countries = c("France", "Germany", "Italy"),
  indicator = "positivity"
)

head(data)
```

You can filter on multiple pathogens at once:

```{r positivity-multi, eval=FALSE}
data <- get_sentineltests_positivity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-06-30"),
  pathogen = c("SARS-CoV-2", "Influenza", "RSV"),
  indicator = "detections"
)
```

## SARS-CoV-2 variants

Variant proportions and detection counts.

```{r variants, eval=FALSE}
data <- get_erviss_variants(
  date_min = as.Date("2025-06-01"),
  date_max = as.Date("2025-12-31"),
  variant = c("XFG", "LP.8.1"),
  countries = c("France", "Belgium"),
  indicator = "detections"
)

# Filter variants with a minimum proportion
data <- get_erviss_variants(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  min_value = 5,
  indicator = "proportion"
)
```

## ILI/ARI consultation rates

ILI (Influenza-Like Illness) and ARI (Acute Respiratory Infection) consultation rates from primary care, stratified by age group.

```{r ili-ari, eval=FALSE}
# Get ILI consultation rates
data <- get_ili_ari_rates(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  indicator = "ILIconsultationrate",
  countries = "France"
)

# Get both ILI and ARI rates for specific age groups
data <- get_ili_ari_rates(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  age = c("0-4", "65+")
)
```

## SARI rates

SARI (Severe Acute Respiratory Infection) hospitalisation rates, stratified by age group.

```{r sari-rates, eval=FALSE}
data <- get_sari_rates(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  age = c("0-4", "15-64", "65+"),
  countries = c("France", "Belgium")
)
```

## SARI virological data

Tests, detections, and positivity rates from SARI virological surveillance.
```{r sari-positivity, eval=FALSE}
# Get positivity for Influenza
data <- get_sari_positivity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  pathogen = "Influenza",
  indicator = "positivity",
  countries = "Belgium"
)

# Get detections for all pathogens
data <- get_sari_positivity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  indicator = "detections"
)
```

## Non-sentinel severity

Hospital admissions, ICU admissions, ICU inpatients, hospital inpatients, and deaths from non-sentinel sources.

```{r nonsentinel-severity, eval=FALSE}
# Get hospital admissions for SARS-CoV-2
data <- get_nonsentinel_severity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  pathogen = "SARS-CoV-2",
  indicator = "hospitaladmissions",
  countries = "France"
)

# Get multiple severity indicators
data <- get_nonsentinel_severity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  pathogen = "SARS-CoV-2",
  indicator = c("hospitaladmissions", "ICUadmissions", "deaths")
)
```

## Non-sentinel tests and detections

Tests and detections from non-sentinel virological surveillance.
```{r nonsentinel-tests, eval=FALSE}
data <- get_nonsentinel_tests(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  pathogen = "Influenza",
  indicator = "detections",
  countries = c("France", "Germany")
)
```

## Using the generic function

Instead of remembering each specific function name, you can use `get_erviss_data()` with the `type` parameter:

```{r generic, eval=FALSE}
# These two calls are equivalent:
data <- get_sentineltests_positivity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  pathogen = "SARS-CoV-2"
)

data <- get_erviss_data(
  type = "positivity",
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31"),
  pathogen = "SARS-CoV-2"
)
```

This makes it easy to switch between data sources while keeping the same workflow:

```{r generic-switch, eval=FALSE}
types <- c("positivity", "sari_positivity", "nonsentinel_tests")

results <- lapply(types, function(t) {
  get_erviss_data(
    type = t,
    date_min = as.Date("2024-01-01"),
    date_max = as.Date("2024-12-31"),
    pathogen = "Influenza",
    countries = "Belgium"
  )
})
names(results) <- types
```

## Historical snapshots

All functions support fetching historical snapshots for **reproducible analyses**. The ERVISS repository stores weekly snapshots of the data, so you can retrieve the exact data that was available at a given date.
```{r snapshot, eval=FALSE}
# Fetch a specific snapshot
data <- get_sentineltests_positivity(
  date_min = as.Date("2023-01-01"),
  date_max = as.Date("2023-12-31"),
  use_snapshot = TRUE,
  snapshot_date = as.Date("2024-02-23")
)
```

This works with all data sources:

```{r snapshot-generic, eval=FALSE}
data <- get_erviss_data(
  type = "nonsentinel_severity",
  date_min = as.Date("2023-01-01"),
  date_max = as.Date("2023-12-31"),
  pathogen = "SARS-CoV-2",
  indicator = "hospitaladmissions",
  use_snapshot = TRUE,
  snapshot_date = as.Date("2024-02-23")
)
```

To see available snapshot dates, visit the [EU-ECDC snapshots directory](https://github.com/EU-ECDC/Respiratory_viruses_weekly_data/tree/main/data/snapshots).

## URL helpers

If you prefer to download the data files yourself, you can retrieve the URLs directly:

```{r urls}
# Latest data URLs
get_erviss_url("positivity")
get_erviss_url("ili_ari_rates")
get_erviss_url("nonsentinel_severity")

# Snapshot URL
get_erviss_url(
  "variants",
  use_snapshot = TRUE,
  snapshot_date = as.Date("2023-11-24")
)
```

Each source also has a dedicated URL function (e.g., `get_sentineltests_positivity_url()`, `get_ili_ari_rates_url()`, etc.).

## Using a local CSV file

If you have already downloaded the data locally, you can pass the file path directly:

```{r local-csv, eval=FALSE}
data <- get_sentineltests_positivity(
  csv_file = "path/to/sentinelTestsDetectionsPositivity.csv",
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-12-31")
)
```

## Analyzing the data

All functions return `data.table` objects.
You can use `data.table` syntax or convert to a `data.frame` / `tibble` for your preferred workflow:

```{r analysis, eval=FALSE}
data <- get_sentineltests_positivity(
  date_min = as.Date("2024-01-01"),
  date_max = as.Date("2024-06-30"),
  pathogen = c("SARS-CoV-2", "Influenza")
)

# data.table syntax
data[,
  .(
    mean_positivity = mean(value, na.rm = TRUE),
    max_positivity = max(value, na.rm = TRUE),
    n_weeks = .N
  ),
  by = .(countryname, pathogen)
]

# Or convert to tibble for dplyr
# tibble::as_tibble(data)
```
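To try the grouped summary without downloading anything, the sketch below runs the same pattern on a small made-up table. It assumes only the three column names used in the summary above (`countryname`, `pathogen`, `value`); the toy values are invented for illustration and the real ERVISS files contain more columns. It also shows one detail worth keeping in mind: `.N` counts every row in a group, including rows where `value` is missing, so a separate count of non-missing values can be useful.

```{r analysis-toy, eval=FALSE}
library(data.table)

# Toy stand-in for a result of get_sentineltests_positivity().
# Column names follow the summary above; the values are made up.
toy <- data.table(
  countryname = c("France", "France", "Germany", "Germany"),
  pathogen    = c("Influenza", "Influenza", "Influenza", "Influenza"),
  value       = c(10, 20, NA, 30)
)

summary_dt <- toy[,
  .(
    mean_positivity = mean(value, na.rm = TRUE),
    n_rows          = .N,                  # all rows, missing values included
    n_observed      = sum(!is.na(value))   # rows with a reported value
  ),
  by = .(countryname, pathogen)
]

# Convert to a plain data.frame for base R workflows
summary_df <- as.data.frame(summary_dt)
summary_df
```

With this toy input, Germany has two rows but only one reported value, so `n_rows` and `n_observed` differ for that group.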