The nemsqar package provides an automated and
reproducible framework for calculating EMS quality measures defined by
the National EMS Quality Alliance (NEMSQA). These measures are widely
used by EMS agencies, trauma systems, quality improvement teams, and
researchers to evaluate performance and support evidence‑based
improvement activities.
This vignette is written for users who are knowledgeable in EMS,
injury epidemiology, or quality improvement, but who may be new to R or
new to calculating NEMSQA measures using R. The focus is to guide you
through each step of the workflow: loading data, preparing it for use
with nemsqar, and running a selected NEMSQA measure.
By the end of this vignette, you will understand the entire process, from loading EMS data in R to producing a standardized performance measure aligned with national reporting expectations.
This vignette focuses on one NEMSQA measure implemented in
nemsqar: Asthma‑01.
All measures in nemsqar follow the same basic structure.
Each one requires a defined set of NEMSIS‑aligned tables and returns
results in a standardized format. The following sections introduce the
data required to calculate this measure and demonstrate how to run it in
R.
Before calculating any measure, it is important to understand the
example datasets included with nemsqar. These datasets are
small, synthetic representations of NEMSIS tables. They provide a safe
environment for learning the workflow before applying it to production
EMS data. Because NEMSQA measure logic relies on the NEMSIS data
structure, each measure requires several tables (for example, patient,
response, situation, medications, vitals).
The nemsqar package includes synthetic datasets that
mirror the NEMSIS tables required for NEMSQA measure calculation. You
can load them with the data() function. For Asthma‑01, the
following tables are required:
data("nemsqar_patient_scene_table")
data("nemsqar_response_table")
data("nemsqar_situation_table")
data("nemsqar_medications_table")
Each dataset loads into your R environment as a standard data frame.
In practice, you will typically load your own EMS datasets. These may come from CSV files, databases, or data extracts. Below are common patterns for loading local files:
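As a minimal sketch of one such pattern, the example below writes a small CSV to a temporary file and reads it back with base R's read.csv(). The file contents and column names are hypothetical stand-ins for a real extract; in practice you would point read.csv() (or a similar reader such as readr::read_csv()) at your own file path.

```r
# Hypothetical extract written to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "pcr_number,incident_date,patient_age",
    "A-001,2023-01-10,34",
    "A-002,2023-01-12,7"
  ),
  csv_path
)

# Read the file; stringsAsFactors = FALSE keeps text columns as character
patient_scene_raw <- read.csv(csv_path, stringsAsFactors = FALSE)

# Dates arrive as character strings and need explicit conversion
patient_scene_raw$incident_date <- as.Date(patient_scene_raw$incident_date)

str(patient_scene_raw)
```

Database tables and other extracts follow the same idea: read the data into a data frame first, then check and convert column types before passing the table to a measure function.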
Because many users are new to R, it is important to verify that each dataset loaded correctly and contains the variables required by the measure functions. A few simple commands help confirm this.
# Quick overview of column names and data types
dplyr::glimpse(nemsqar_patient_scene_table)
#> Rows: 10,000
#> Columns: 6
#> $ `Incident Patient Care Report Number - PCR (eRecord.01)` <chr> "NyXFBlJfnm-8…
#> $ `Incident Date` <date> 2023-12-20, …
#> $ `Patient Age (ePatient.15)` <dbl> 98, 75, 24, 1…
#> $ `Patient Age Units (ePatient.16)` <chr> "Minutes", "D…
#> $ `Patient Date Of Birth (ePatient.17)` <date> 2023-12-19, …
#> $ `Patient Gender (ePatient.13)` <chr> "Male to Fema…
# An abbreviated look at the actual data tables
head(nemsqar_patient_scene_table, n = 10)
#> # A tibble: 10 × 6
#> Incident Patient Care Report Number …¹ `Incident Date` Patient Age (ePatien…²
#> <chr> <date> <dbl>
#> 1 NyXFBlJfnm-8333586176 2023-12-20 98
#> 2 XTLCINMLTP-8616021114 2023-08-30 75
#> 3 HfYjlIEQSk-9529756610 2023-03-21 24
#> 4 MOwVDhriyC-5915613206 2023-09-13 115
#> 5 ZCGOtLEPKw-7820135532 2023-02-21 54
#> 6 fEMvUCQCRQ-9052388486 2023-08-23 88
#> 7 VTLPiFWWGd-6806896482 2023-12-09 83
#> 8 YvZbHRTUuK-8780915452 2023-04-06 24
#> 9 DkKIjJSFtA-7499641828 2023-05-07 95
#> 10 CIQMuVGJgS-9144926148 2023-11-13 82
#> # ℹ abbreviated names:
#> # ¹`Incident Patient Care Report Number - PCR (eRecord.01)`,
#> # ²`Patient Age (ePatient.15)`
#> # ℹ 3 more variables: `Patient Age Units (ePatient.16)` <chr>,
#> # `Patient Date Of Birth (ePatient.17)` <date>,
#> # `Patient Gender (ePatient.13)` <chr>
These functions allow you to check column names, data types, and basic record structure. This step is essential because NEMSQA logic depends on specific fields. Incorrect data types (for example, character instead of numeric) will cause measure functions to fail.
The example datasets included in nemsqar already use
appropriate data types. When working with real EMS data, you must verify
these types manually. This ensures that the data satisfy validation
requirements and prevents errors during measure calculation.
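A minimal sketch of such a manual check is shown below, using base R only. The table, its column names, and the expected types are hypothetical; substitute the columns and types your own measure requires.

```r
# Hypothetical table with one column imported with the wrong type
vitals <- data.frame(
  pcr_number = c("A-001", "A-002"),
  patient_age = c("34", "7"), # character, but should be numeric
  incident_date = as.Date(c("2023-01-10", "2023-01-12")),
  stringsAsFactors = FALSE
)

# Report the first class of every column
col_types <- vapply(vitals, function(x) class(x)[1], character(1))
print(col_types)

# Flag columns whose type differs from what is expected
expected <- c(
  pcr_number = "character",
  patient_age = "numeric",
  incident_date = "Date"
)
mismatched <- names(expected)[col_types[names(expected)] != expected]
print(mismatched) # "patient_age"
```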
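In practice, EMS data commonly include issues such as:
- Dates stored as character strings or in inconsistent formats
- Numeric values stored as text (for example, "5" instead of 5)
- Missing values stored as empty strings instead of NA
Below are examples of how to identify and correct these issues before
running any nemsqar measure.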
# Example: incident dates stored as character values
example_data <- data.frame(
  Incident_Date = c("2023-01-10", "01/12/2023", "20230114"),
  stringsAsFactors = FALSE
)
# Convert using lubridate (recommended)
# parse_date_time() returns POSIXct; wrap the result in as.Date() if a Date column is needed
example_data$Incident_Date <- lubridate::parse_date_time(
  example_data$Incident_Date,
  orders = c("ymd", "mdy", "Ymd")
)
Date fields are used for patient age computation and for time‑based denominators. If these values are stored as character strings or in inconsistent formats, the measure logic will not execute correctly.
Numeric fields such as age, blood pressure, respiratory rate, or dosage must be numeric to satisfy validation checks and ensure appropriate comparisons.
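The conversion can be sketched with base R as follows. The values are hypothetical; the point is that as.numeric() silently turns unparseable entries into NA (with a warning), so it is worth listing what was lost.

```r
# Hypothetical age values imported as text, including one non-numeric entry
age_raw <- c("34", "7", "unknown", "82")

# as.numeric() converts unparseable values to NA (and emits a warning)
age_num <- suppressWarnings(as.numeric(age_raw))

# Identify values that were lost in conversion so they can be reviewed
failed <- age_raw[is.na(age_num) & !is.na(age_raw)]

print(age_num)
print(failed) # "unknown"
```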
Empty strings cause false exclusions during population filtering,
especially when nemsqar logic expects missing values to be
formally represented as NA.
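One way to recode empty strings is sketched below with base R. The table and column names are hypothetical, and whitespace-only entries are treated as missing as well, which is an assumption you may want to adjust for your own data.

```r
# Hypothetical medication column with empty and whitespace-only entries
meds <- data.frame(
  pcr_number = c("A-001", "A-002", "A-003"),
  medication = c("albuterol", "", "  "),
  stringsAsFactors = FALSE
)

# Replace empty or whitespace-only strings with NA
blank <- !is.na(meds$medication) & trimws(meds$medication) == ""
meds$medication[blank] <- NA_character_

# Count formally missing values after the recode
sum(is.na(meds$medication)) # 2
```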
Ensuring correct data types prior to running any nemsqar function improves reproducibility, reduces debugging time, and allows the measure logic to operate as intended.
EMS registry data often contain column names with spaces, punctuation, or special characters. These can make programming in R more difficult. To avoid these issues, it is helpful to standardize column names before running any measures. If your datasets already have clean names, you may skip this step.
Below is a simple reusable function to clean column names by replacing spaces and special characters with underscores.
# Define a reusable column-cleaning function
clean_cols <- function(data) {
  data |>
    dplyr::rename_with(
      .cols = tidyselect::everything(),
      ~ . |>
        gsub(pattern = "\\.|\\(|-|\\s", replacement = "_") |>
        gsub(pattern = "_+", replacement = "_") |>
        gsub(pattern = "\\)", replacement = "")
    )
}
# Apply cleaning to each table
nemsqar_patient_scene_data <- nemsqar_patient_scene_table |> clean_cols()
nemsqar_response_data <- nemsqar_response_table |> clean_cols()
nemsqar_situation_data <- nemsqar_situation_table |> clean_cols()
nemsqar_medications_data <- nemsqar_medications_table |> clean_cols()
# Inspect the cleaned patient/scene table
dplyr::glimpse(nemsqar_patient_scene_data)
#> Rows: 10,000
#> Columns: 6
#> $ Incident_Patient_Care_Report_Number_PCR_eRecord_01 <chr> "NyXFBlJfnm-8333586…
#> $ Incident_Date <date> 2023-12-20, 2023-0…
#> $ Patient_Age_ePatient_15 <dbl> 98, 75, 24, 115, 54…
#> $ Patient_Age_Units_ePatient_16 <chr> "Minutes", "Days", …
#> $ Patient_Date_Of_Birth_ePatient_17 <date> 2023-12-19, 2023-0…
#> $ Patient_Gender_ePatient_13 <chr> "Male to Female, Tr…
Special characters and whitespace are now either removed or replaced
with _, so the column names can be used in R without backtick
quoting.
Each NEMSQA measure requires a specific set of input tables. Although
nemsqar can accept a single combined dataset through the
df argument, this approach is not recommended. The
preferred workflow is to supply separate tables using the
*_table arguments (for example,
patient_scene_table, response_table). This
aligns with the NEMSIS structure, where elements such as ePatient,
eScene, eResponse, and eSituation are stored in distinct tables.
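In practice, your data should follow this multi‑table structure, with one table for each group of NEMSIS elements.
Each measure expects a consistent set of these tables. For example, Asthma‑01 requires the patient/scene, response, situation, and medications tables.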
The next sections demonstrate how to supply these inputs to
nemsqar and how to calculate a measure.
Once the required tables are loaded, you can calculate your first
measure. Each measure in nemsqar is implemented through a
dedicated function that accepts NEMSIS‑aligned tables and returns
standardized results.
nemsqar workhorse functions
Each measure is built using two core functions:
- A wrapper function, measure_##() (for example, asthma_01())
- A population function (for example, asthma_01_population())
The wrapper function performs two main tasks. First, it calls the population function to identify the population of interest. Then it applies the measure logic to estimate performance. Each NEMSQA measure follows this same pattern.
The asthma_01() function requires several NEMSIS‑aligned
tables and column mappings. All arguments shown below are required. Each
column argument identifies the specific NEMSIS field used by the measure
logic. Note that most argument names signal the corresponding NEMSIS
data element. For example, eresponse_05_col corresponds to
eResponse.05 in the NEMSIS data dictionary.
To help you map your own data, the list below shows how several key arguments align with their corresponding NEMSIS elements:
- erecord_01_col -> eRecord.01 (PCR number)
- incident_date_col -> eTimes.03 (Unit Notified by Dispatch Date/Time)
- patient_DOB_col -> ePatient.17 (patient date of birth)
- epatient_15_col -> ePatient.15 (patient age)
- epatient_16_col -> ePatient.16 (age units)
- eresponse_05_col -> eResponse.05 (type of service requested)
- esituation_11_col -> eSituation.11 (primary impression)
- esituation_12_col -> eSituation.12 (secondary impression)
- emedications_03_col -> eMedications.03 (medication administered)
These mappings ensure that each argument references the correct NEMSIS data element when running the measure.
# Run Asthma‑01 without grouping
asthma_01_all <- asthma_01(
  patient_scene_table = nemsqar_patient_scene_data,
  response_table = nemsqar_response_data,
  situation_table = nemsqar_situation_data,
  medications_table = nemsqar_medications_data,
  erecord_01_col = Incident_Patient_Care_Report_Number_PCR_eRecord_01,
  incident_date_col = Incident_Date,
  patient_DOB_col = Patient_Date_Of_Birth_ePatient_17,
  epatient_15_col = Patient_Age_ePatient_15,
  epatient_16_col = Patient_Age_Units_ePatient_16,
  eresponse_05_col = Response_Type_Of_Service_Requested_With_Code_eResponse_05,
  esituation_11_col = Situation_Provider_Primary_Impression_Code_And_Description_eSituation_11,
  esituation_12_col = Situation_Provider_Secondary_Impression_Description_And_Code_List_eSituation_12,
  emedications_03_col = Patient_Medication_Given_or_Administered_Description_And_RXCUI_Codes_List_eMedications_03,
  confidence_interval = TRUE,
  method = "clopper-pearson",
  conf.level = 0.95
)
# print the results
asthma_01_all
#> # A tibble: 3 × 8
#> measure pop numerator denominator prop prop_label lower_ci upper_ci
#> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <dbl>
#> 1 Asthma-01 Adults 0 4 0 0% 0 0.602
#> 2 Asthma-01 Peds 3 25 0.12 12% 0.0255 0.312
#> 3 Asthma-01 All 3 29 0.103 10.34% 0.0219 0.274
The output reports the numerator, denominator, proportion, and
confidence interval bounds for each measure population (adults,
pediatrics, and all patients). This structure is
consistent across all NEMSQA measures implemented in
nemsqar.
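To make the interval columns concrete, the sketch below reproduces the Peds row by hand. It assumes that the "clopper-pearson" method corresponds to the exact binomial interval returned by base R's stats::binom.test(); that correspondence is an assumption about the package internals, although the resulting bounds agree with the printed output.

```r
# Peds row from the printed results: 3 numerator cases out of 25 in the denominator
numerator <- 3
denominator <- 25

# Point estimate of the proportion
prop <- numerator / denominator

# Exact (Clopper-Pearson) 95% interval from base R
ci <- stats::binom.test(numerator, denominator, conf.level = 0.95)$conf.int

round(prop, 3)  # 0.12
round(ci[1], 4) # lower bound, about 0.0255
round(ci[2], 3) # upper bound, about 0.312
```

With small denominators such as these, the exact interval is wide, which is worth keeping in mind when comparing subgroups.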
asthma_01 wrapper function using grouping
nemsqar allows you to calculate a measure for the entire
dataset or for specific subgroups. Grouping can be useful when you want
to understand performance within meaningful categories, such as age
groups, service types, or impressions. Grouping is implemented using the
.by argument, which follows the same syntax used in
dplyr::summarize().
The example below shows how to run Asthma‑01 grouped
by age units. All required tables and column mappings remain the same;
the only additional argument is .by.
# Run `asthma_01` for a whole dataset, group by age units.
# All core inputs remain the same. Only the .by argument is added.
asthma_01_age <- asthma_01(
  patient_scene_table = nemsqar_patient_scene_data,
  response_table = nemsqar_response_data,
  situation_table = nemsqar_situation_data,
  medications_table = nemsqar_medications_data,
  erecord_01_col = Incident_Patient_Care_Report_Number_PCR_eRecord_01,
  incident_date_col = Incident_Date,
  patient_DOB_col = Patient_Date_Of_Birth_ePatient_17,
  epatient_15_col = Patient_Age_ePatient_15,
  epatient_16_col = Patient_Age_Units_ePatient_16,
  eresponse_05_col = Response_Type_Of_Service_Requested_With_Code_eResponse_05,
  esituation_11_col = Situation_Provider_Primary_Impression_Code_And_Description_eSituation_11,
  esituation_12_col = Situation_Provider_Secondary_Impression_Description_And_Code_List_eSituation_12,
  emedications_03_col = Patient_Medication_Given_or_Administered_Description_And_RXCUI_Codes_List_eMedications_03,
  confidence_interval = TRUE,
  method = "clopper-pearson",
  conf.level = 0.95,
  # notice here that we use the `.by` argument from `dplyr::summarize` to group
  # our analysis
  .by = Patient_Age_Units_ePatient_16
)
# print the results
asthma_01_age
#> # A tibble: 10 × 9
#> Patient_Age_Units_ePa…¹ measure pop numerator denominator prop prop_label
#> <chr> <chr> <chr> <int> <int> <dbl> <chr>
#> 1 Years Asthma… Adul… 0 4 0 0%
#> 2 Months Asthma… Peds 1 12 0.0833 8.33%
#> 3 Minutes Asthma… Peds 2 9 0.222 22.22%
#> 4 Hours Asthma… Peds 0 2 0 0%
#> 5 Days Asthma… Peds 0 2 0 0%
#> 6 Months Asthma… All 1 12 0.0833 8.33%
#> 7 Minutes Asthma… All 2 9 0.222 22.22%
#> 8 Hours Asthma… All 0 2 0 0%
#> 9 Years Asthma… All 0 4 0 0%
#> 10 Days Asthma… All 0 2 0 0%
#> # ℹ abbreviated name: ¹Patient_Age_Units_ePatient_16
#> # ℹ 2 more variables: lower_ci <dbl>, upper_ci <dbl>
Grouping is optional. It can reveal differences in performance
across patient subpopulations and can be applied to any NEMSQA measure
using the same .by syntax.
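For readers unfamiliar with .by, the sketch below shows the same grouping idea in plain dplyr::summarize() on a small hypothetical table (it requires dplyr 1.1.0 or later). The data and column names are invented for illustration and are not produced by nemsqar.

```r
library(dplyr)

# Hypothetical record-level flags: 1 if the record met the numerator criteria
records <- data.frame(
  age_units = c("Years", "Years", "Months", "Months", "Months"),
  in_numerator = c(0, 1, 1, 0, 1)
)

# .by groups the summary without a separate group_by() call
by_units <- records |>
  summarize(
    numerator = sum(in_numerator),
    denominator = n(),
    prop = numerator / denominator,
    .by = age_units
  )

by_units
```

Each distinct value of the grouping column produces its own row of results, which is the pattern visible in the grouped Asthma‑01 output.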
*_population() functions
Each NEMSQA measure includes a companion *_population()
function. These functions identify the population of interest by
applying the full set of inclusion and exclusion criteria defined by
NEMSQA. They perform all filtering, validation, and intermediate
computations needed to determine which records belong in the measure
denominator.
Each population function returns a list containing
several tibbles that help you examine the population. For Asthma‑01, these are filter_process, adults, peds, initial_population, computing_population, and missingness.
These objects are most useful when you need to verify which records were included in or excluded from the denominator and why. Analysts often use them when denominator counts look unexpected, when investigating data quality issues, or when comparing populations across systems or years. They provide a transparent view of how NEMSQA logic was applied to your data.
The example below demonstrates how to use
asthma_01_population() to inspect the population identified
for Asthma‑01.
Using asthma_01_population() to examine the target population
The asthma_01_population() function identifies the
population of interest by applying all NEMSQA inclusion and exclusion
criteria. The function uses the same required tables and column mappings
as asthma_01(), but it does not calculate performance
estimates and does not use confidence interval or grouping arguments.
# Run `asthma_01_population` for a whole dataset
# The code is virtually the same as `asthma_01()`, but we do not use the
# confidence interval arguments, nor the tidy dot `...` arguments for grouping
# or other operations via `dplyr::summarize`
populations_asthma_01 <- asthma_01_population(
  patient_scene_table = nemsqar_patient_scene_data,
  response_table = nemsqar_response_data,
  situation_table = nemsqar_situation_data,
  medications_table = nemsqar_medications_data,
  erecord_01_col = Incident_Patient_Care_Report_Number_PCR_eRecord_01,
  incident_date_col = Incident_Date,
  patient_DOB_col = Patient_Date_Of_Birth_ePatient_17,
  epatient_15_col = Patient_Age_ePatient_15,
  epatient_16_col = Patient_Age_Units_ePatient_16,
  eresponse_05_col = Response_Type_Of_Service_Requested_With_Code_eResponse_05,
  esituation_11_col = Situation_Provider_Primary_Impression_Code_And_Description_eSituation_11,
  esituation_12_col = Situation_Provider_Secondary_Impression_Description_And_Code_List_eSituation_12,
  emedications_03_col = Patient_Medication_Given_or_Administered_Description_And_RXCUI_Codes_List_eMedications_03
)
# print structure of the results using `base::summary()`
populations_asthma_01 |> summary()
#> Length Class Mode
#> filter_process 2 tbl_df list
#> adults 16 tbl_df list
#> peds 16 tbl_df list
#> initial_population 16 tbl_df list
#> computing_population 16 tbl_df list
#> missingness 6 tbl_df list
This output provides a structured view of how records were filtered through the NEMSQA criteria. It allows you to inspect the initial population, denominator‑eligible records, age‑specific subgroups, and missingness summaries for required fields.
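Note that the Length column printed by summary() is the number of columns in each tibble, not the number of records. A minimal sketch for counting rows per element is shown below; the list is a hypothetical stand-in built from small data frames, not the object returned by the package.

```r
# Hypothetical stand-in for the list returned by a *_population() function
populations <- list(
  filter_process = data.frame(
    filter = c("911 calls", "Asthma cases"),
    count = c(10L, 3L)
  ),
  adults = data.frame(id = "A-001"),
  peds = data.frame(id = c("A-002", "A-003"))
)

# Names of the elements available for inspection
names(populations)

# Number of rows (records or steps) in each element
row_counts <- vapply(populations, nrow, integer(1))
row_counts

# Pull out a single element with $
populations$peds
```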
The *_population() functions return several tibbles that
summarize how records were filtered into the final population of
interest. One of the most useful is the filter_process
tibble. It shows the number of records remaining after each inclusion or
exclusion step defined by NEMSQA.
# Display counts for each filtering step
populations_asthma_01$filter_process
#> # A tibble: 7 × 2
#> filter count
#> <chr> <int>
#> 1 911 calls 2400
#> 2 Asthma cases 109
#> 3 Beta agonist cases 1482
#> 4 Adults denominator 4
#> 5 Peds denominator 25
#> 6 Initial population 29
#> 7 Total dataset 10000
filter_process is typically the first place to look when values seem
off.
Given that this vignette uses synthetic data, the counts may not
reflect realistic populations. However, the workflow remains the same
when working with real EMS data. The values in filter_process represent
distinct record counts at each stage (using
dplyr::distinct() internally). Reviewing these counts,
along with the missingness tibble returned by the population function,
can help diagnose data quality issues and better understand the
composition of the population being evaluated.
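The difference between row counts and distinct record counts can be sketched as follows. The table is hypothetical, and the example only illustrates what dplyr::distinct() does; it is not the package's internal code.

```r
library(dplyr)

# Hypothetical medications table: record A-001 has two medication rows
medications <- data.frame(
  pcr_number = c("A-001", "A-001", "A-002", "A-003"),
  medication = c("albuterol", "ipratropium", "albuterol", "epinephrine")
)

# Raw row count versus distinct record count
n_rows <- nrow(medications)
n_records <- medications |>
  distinct(pcr_number) |>
  nrow()

c(rows = n_rows, records = n_records) # rows = 4, records = 3
```

When a count in filter_process is smaller than the number of rows in the corresponding input table, repeated rows per record are one likely explanation.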
Users who are new to R often encounter several predictable issues when preparing data for NEMSQA measure calculation. The sections below highlight the most common problems and how to avoid them. Addressing these issues before running measures improves reproducibility and reduces debugging time.
Many NEMSQA logic components require numeric fields. If these values
are imported as character strings, the measure functions will fail.
Always verify column types before running a measure and convert them as
needed to meet nemsqar validation requirements.
Each column you supply must contain the NEMSIS field that the
corresponding function argument represents. You may name your columns
however you prefer, but the values that originate from eResponse.05 must be
supplied to the eresponse_05_col argument. The function
relies on the data itself, not the literal column name, so an incorrect
mapping will cause errors.
Each measure requires a specific set of input tables. If a required table is not provided, the function will return an error. Ensure that all necessary tables are loaded and cleaned before running the measure.
NEMSQA measures assume that each patient or encounter appears once in
the relevant input tables. Duplicate rows can shift denominator counts,
alter inclusion, or create unintended exclusions. Although
nemsqar includes safeguards to detect some duplication, it
is best practice to review your data for repeated records and to check
for unintended Cartesian joins created during data extraction or table
merging.
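A quick duplicate check along these lines can be sketched with base R. The table and identifier column are hypothetical, and the example assumes a table that should hold exactly one row per record.

```r
# Hypothetical patient/scene table where record A-002 appears twice
patient_scene <- data.frame(
  pcr_number = c("A-001", "A-002", "A-002", "A-003"),
  patient_age = c(34, 7, 7, 82)
)

# Flag every row whose record identifier occurs more than once
is_dup <- duplicated(patient_scene$pcr_number) |
  duplicated(patient_scene$pcr_number, fromLast = TRUE)
patient_scene[is_dup, ]

# Drop exact duplicate rows, then confirm one row per record remains
deduped <- unique(patient_scene)
stopifnot(!anyDuplicated(deduped$pcr_number))
```

If the final check fails, the remaining duplicates differ in at least one column and should be reviewed by hand rather than dropped automatically.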
This vignette introduced the core workflow for calculating NEMSQA
measures using nemsqar. After reviewing these examples,
users may wish to expand their analyses by exploring additional
measures, integrating their own EMS datasets, or incorporating these
workflows into automated reporting pipelines. The package reference
documentation provides detailed descriptions of each function, and
additional vignettes will demonstrate multi‑measure workflows,
validation strategies, and integration with reproducible reporting tools
such as Quarto and R Markdown.