Help for package osdc

Type:

Package

Title:

Open Source Diabetes Classifier for Danish Registers

Version:

0.11.3

Description:

The algorithm first identifies a population of individuals from Danish register data with any type of diabetes as individuals with two or more inclusion events. Then, it splits this population into individuals with either type 1 diabetes or type 2 diabetes by identifying individuals with type 1 diabetes and classifying the remainder of the diabetes population as having type 2 diabetes.

License:

MIT + file LICENSE

URL:

https://github.com/steno-aarhus/osdc, https://steno-aarhus.github.io/osdc/

BugReports:

https://github.com/steno-aarhus/osdc/issues

Depends:

R (≥ 4.2.0)

Imports:

checkmate, cli, codeCollection, dplyr (≥ 1.2.0), dbplyr (≥ 2.5.1), duckplyr (≥ 1.1.3), fabricatr, lifecycle, lubridate (≥ 1.9.5), purrr (≥ 1.2.1), rlang (≥ 1.1.7), stats, tidyselect (≥ 1.2.1), utils

Suggests:

glue, knitr, quarto, rmarkdown, spelling, stringr, testthat (≥ 3.0.0), tidyr, tibble, arrow (≥ 22.0.0.1), DBI (≥ 1.3.0)

VignetteBuilder:

quarto

Config/testthat/edition:

Encoding:

UTF-8

Language:

en-US

Config/roxygen2/version:

8.0.0

NeedsCompilation:

Packaged:

2026-06-04 11:29:43 UTC; luke

Author:

Signe Kirk Brødbæk

[aut], Anders Aasted Isaksen

[aut], Luke William Johnston

[aut, cre], Steno Diabetes Center Aarhus [cph], Aarhus University [cph]

Maintainer:

Luke William Johnston <lwjohnst@gmail.com>

Repository:

CRAN

Date/Publication:

2026-06-04 15:30:02 UTC

osdc: Open Source Diabetes Classifier for Danish Registers

Description

Author(s)

Maintainer: Luke William Johnston lwjohnst@gmail.com (ORCID)

Authors:

Luke William Johnston lwjohnst@gmail.com (ORCID)
Signe Kirk Brødbæk signekb@clin.au.dk (ORCID)
Anders Aasted Isaksen andaas@rm.dk (ORCID)

Other contributors:

Steno Diabetes Center Aarhus [copyright holder]
Aarhus University [copyright holder]

A list of the algorithmic logic underlying osdc.

Description

This nested list contains the logic details of the algorithm.

Usage

algorithm()

Format

A list of nested lists, each with these named elements:

register: Optional. The register used for this logic.
title: The title to use when displaying the logic in tables.
logic: The logic itself.
comments: Some additional comments on the logic.

Value

A nested list with the algorithmic logic. Contains fields register, title, logic, and comments.

Examples

algorithm()$is_hba1c_over_threshold
algorithm()$is_gld_code$logic
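
Because every element shares the same named fields, the nested list can be flattened into a table for easier reading. The sketch below uses a small stand-in list with placeholder contents (hypothetical, not the logic shipped with osdc) so that it runs without the package. With osdc installed, logic_list <- algorithm() would be the intended input, assuming each field holds a single string or NULL.

```r
# Stand-in with the documented shape; the contents are placeholders,
# not the real algorithm logic.
logic_list <- list(
  rule_a = list(
    register = "lmdb", title = "Rule A",
    logic = "x > 1", comments = "Placeholder."
  ),
  rule_b = list(
    register = NULL, title = "Rule B",
    logic = "y == 2", comments = "Placeholder."
  )
)

# Return a field as a single string, or NA when it is missing
# (the register field is documented as optional).
field_or_na <- function(item, field) {
  value <- item[[field]]
  if (is.null(value)) NA_character_ else as.character(value)[1]
}

# One row per piece of logic.
logic_table <- data.frame(
  name = names(logic_list),
  register = vapply(logic_list, field_or_na, character(1), field = "register"),
  title = vapply(logic_list, field_or_na, character(1), field = "title"),
  logic = vapply(logic_list, field_or_na, character(1), field = "logic"),
  row.names = NULL
)
logic_table
```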

Classify diabetes status using Danish registers.

Description

This function requires that each source of register data is represented as a single DuckDB object in R (e.g. a connection to Parquet files). Each DuckDB object must contain a single table covering all years of that data source, or at least the years you have and are interested in.

Usage

classify_diabetes(
  lpr,
  hsr,
  lab_forsker,
  bef,
  lmdb,
  stable_inclusion_start_date = "1998-01-01"
)

Arguments

lpr

The unified LPR register, see join_registers()

hsr

The unified health services registers (SYSI and SSSY), see join_registers()

lab_forsker

The register for laboratory results for research

bef

The BEF table from the civil register

lmdb

The LMDB table from the prescription register

stable_inclusion_start_date

Cutoff date after which inclusion events are considered true incident diabetes cases. Defaults to "1998-01-01", which is one year after the data on pregnancy events from the Patient Register are considered valid for dropping gestational diabetes-related purchases of glucose-lowering drugs. This default assumes that the user is using LPR and LMDB data from at least Jan 1 1997 onward. If the user only has access to LPR and LMDB data from a later date, this parameter should be set to one year after the beginning of the user's data coverage.
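
The adjustment described in the last sentence is plain date arithmetic. A minimal sketch, assuming a hypothetical coverage start of 1 January 2005 (data_start and stable_start are illustrative names, not package arguments):

```r
# Hypothetical first date covered by the user's LPR and LMDB data.
data_start <- as.Date("2005-01-01")

# One year after the start of coverage; seq() steps by calendar year.
stable_start <- seq(data_start, by = "1 year", length.out = 2)[2]

# The documented default is a "YYYY-MM-DD" string, so format it the same way.
stable_inclusion_start_date <- format(stable_start, "%Y-%m-%d")
stable_inclusion_start_date
```

The resulting string can then be passed as the stable_inclusion_start_date argument.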

Value

An object of the same type as the input data, i.e. a duckplyr::duckdb_tibble().
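
The following sketch shows how the arguments might be assembled from the other functions documented here, using simulated data. It is an illustration under assumptions rather than a tested workflow: it assumes that simulate_registers() accepts the register names "lab_forsker" and "lmdb", and that converting each tibble with duckplyr::as_duckdb_tibble() satisfies the DuckDB requirement described above. LPR3A data prepared with prepare_lpr3a() could be added to the joined list in the same way.

```r
library(osdc)

# Simulate the source registers (fake data, for illustration only).
register_data <- simulate_registers(
  c(
    "lpr_adm", "lpr_diag", "lpr3f_kontakter", "lpr3f_diagnoser",
    "sysi", "sssy", "lab_forsker", "bef", "lmdb"
  ),
  n = 10000
)

# Assumption: DuckDB-backed tibbles meet the "DuckDB object" requirement.
register_data <- lapply(register_data, duckplyr::as_duckdb_tibble)

# Unify the hospital registers (LPR2 and LPR3F) into one table.
lpr <- join_registers(list(
  prepare_lpr2(register_data$lpr_adm, register_data$lpr_diag),
  prepare_lpr3f(
    register_data$lpr3f_kontakter,
    register_data$lpr3f_diagnoser
  )
))

# Unify the health services registers (SYSI and SSSY).
hsr <- join_registers(list(register_data$sysi, register_data$sssy))

classified <- classify_diabetes(
  lpr = lpr,
  hsr = hsr,
  lab_forsker = register_data$lab_forsker,
  bef = register_data$bef,
  lmdb = register_data$lmdb
)
classified
```

Since the simulated data are random, the result may contain few or no classified individuals.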

Create a synthetic dataset of edge case inputs

Description

This function generates a list of tibbles representing the Danish health registers and the data necessary to run the algorithm. The dataset contains 23 individual cases (pnrs), each designed to test a specific logical branch of the diabetes classification algorithm, including inclusion, exclusion, censoring, and type classification rules.

The generated data is used in testthat tests to ensure the algorithm behaves as expected under a wide range of conditions, but it is also intended to be explored by users to better understand how the algorithm logic works.

Usage

edge_cases()

Value

A named list of 11 tibble::tibble() objects, each representing a different health register: bef, lmdb, lpr_adm, lpr_diag, lpr3a_kontakt, lpr3a_diagnose, lpr3f_kontakter, lpr3f_diagnoser, sysi, sssy, and lab_forsker.

Examples

edge_cases()
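
To explore how a single case is represented across registers, one option is to follow one pnr through every table that has such a column. This is a sketch only: it assumes that bef contains a pnr column, and it skips tables without one rather than guessing how they link to a person.

```r
library(osdc)

cases <- edge_cases()

# Keep only the tables that identify individuals directly by pnr.
with_pnr <- Filter(function(tbl) "pnr" %in% names(tbl), cases)

# Assumption: bef has a pnr column; take its first individual.
first_pnr <- with_pnr$bef$pnr[1]

# Rows belonging to that individual in each remaining table.
lapply(with_pnr, function(tbl) tbl[tbl$pnr == first_pnr, ])
```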

Join prepared registers

Description

Join prepared registers

Usage

join_registers(register_list)

Arguments

register_list

A list of the prepared registers, from e.g. prepare_lpr2().

Value

A single object with all rows from each register in register_list.

Examples

register_data <- simulate_registers(c(
  "lpr_adm",
  "lpr_diag",
  "lpr3f_kontakter",
  "lpr3f_diagnoser",
  "sssy",
  "sysi"
))
join_registers(list(
  prepare_lpr2(register_data$lpr_adm, register_data$lpr_diag),
  prepare_lpr3f(
    register_data$lpr3f_kontakter,
    register_data$lpr3f_diagnoser
  )
))
join_registers(list(register_data$sysi, register_data$sssy))

List of non-cases to test the diabetes classification algorithm

Description

This function generates a list of tibbles representing the Danish health registers and the data necessary to run the algorithm. The dataset contains individuals who should not be included in the final classified cohort.

Usage

non_cases()

Details

Value

Examples

non_cases()

Description of the different non-cases included in non_cases()

Description

Apart from the exclusion reason described in the metadata here, each non-case would otherwise be classified as having diabetes.

Usage

non_cases_metadata()

Value

A named list of character strings, where each name corresponds to a non-case PNR in the dataset generated by non_cases().

Examples

non_cases_metadata()
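
Since the returned list is named by PNR, it can be reshaped into a two-column lookup table. A minimal sketch, where the column names pnr and reason are illustrative choices rather than anything defined by the package:

```r
library(osdc)

meta <- non_cases_metadata()

# One row per non-case PNR; paste() guards against a description
# stored as more than one string.
data.frame(
  pnr = names(meta),
  reason = vapply(meta, paste, character(1), collapse = " "),
  row.names = NULL
)
```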

Prepare and join the two LPR2 registers to extract diabetes and pregnancy diagnoses.

Description

Prepare and join the two LPR2 registers to extract diabetes and pregnancy diagnoses.

Usage

prepare_lpr2(lpr_adm, lpr_diag)

Arguments

lpr_adm

The LPR2 register containing hospital admissions.

lpr_diag

The LPR2 register containing diabetes diagnoses.

Value

The same type as the input data, as a duckplyr::duckdb_tibble(), with the following columns:

pnr: The personal identification variable.
date: The date of each recorded diagnosis (renamed from d_inddto or dato_start).
is_primary_diagnosis: Whether the diagnosis was a primary diagnosis.
is_diabetes_code: Whether the diagnosis was any type of diabetes.
is_t1d_code: Whether the diagnosis was T1D-specific.
is_t2d_code: Whether the diagnosis was T2D-specific.
is_pregnancy_code: Whether the person has an event related to pregnancy like giving birth or having a miscarriage at the given date.
is_endocrinology_dept: Whether the diagnosis was made by an endocrinology medical department.
is_medical_dept: Whether the diagnosis was made by a non-endocrinology medical department.
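
The indicator columns listed above can be used directly in filters. The sketch below follows the pattern of the join_registers() example and assumes the is_* columns are logical; the same approach would apply to the output of prepare_lpr3a() and prepare_lpr3f(), which return the same columns.

```r
library(osdc)

# Simulated (fake) LPR2 data.
register_data <- simulate_registers(c("lpr_adm", "lpr_diag"))
lpr2 <- prepare_lpr2(register_data$lpr_adm, register_data$lpr_diag)

# Keep primary diabetes diagnoses and the columns needed to tell
# T1D-specific from T2D-specific codes.
lpr2 |>
  dplyr::filter(is_diabetes_code, is_primary_diagnosis) |>
  dplyr::select(pnr, date, is_t1d_code, is_t2d_code)
```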

Prepare and join the two LPR3A registers to extract diabetes and pregnancy diagnoses.

Description

Prepare and join the two LPR3A registers to extract diabetes and pregnancy diagnoses.

Usage

prepare_lpr3a(lpr3a_kontakt, lpr3a_diagnose)

Arguments

lpr3a_kontakt

The LPR3A register containing hospital contacts/admissions.

lpr3a_diagnose

The LPR3A register containing diabetes diagnoses.

Value

The same type as the input data, as a duckplyr::duckdb_tibble(), with the following columns:

pnr: The personal identification variable.
date: The date of each recorded diagnosis (renamed from d_inddto or dato_start).
is_primary_diagnosis: Whether the diagnosis was a primary diagnosis.
is_diabetes_code: Whether the diagnosis was any type of diabetes.
is_t1d_code: Whether the diagnosis was T1D-specific.
is_t2d_code: Whether the diagnosis was T2D-specific.
is_pregnancy_code: Whether the person has an event related to pregnancy like giving birth or having a miscarriage at the given date.
is_endocrinology_dept: Whether the diagnosis was made by an endocrinology medical department.
is_medical_dept: Whether the diagnosis was made by a non-endocrinology medical department.

Prepare and join the two LPR3F registers to extract diabetes and pregnancy diagnoses.

Description

Prepare and join the two LPR3F registers to extract diabetes and pregnancy diagnoses.

Usage

prepare_lpr3f(lpr3f_kontakter, lpr3f_diagnoser)

Arguments

lpr3f_kontakter

The LPR3F register containing hospital contacts/admissions.

lpr3f_diagnoser

The LPR3F register containing diabetes diagnoses.

Value

The same type as the input data, as a duckplyr::duckdb_tibble(), with the following columns:

pnr: The personal identification variable.
date: The date of each recorded diagnosis (renamed from d_inddto or dato_start).
is_primary_diagnosis: Whether the diagnosis was a primary diagnosis.
is_diabetes_code: Whether the diagnosis was any type of diabetes.
is_t1d_code: Whether the diagnosis was T1D-specific.
is_t2d_code: Whether the diagnosis was T2D-specific.
is_pregnancy_code: Whether the person has an event related to pregnancy like giving birth or having a miscarriage at the given date.
is_endocrinology_dept: Whether the diagnosis was made by an endocrinology medical department.
is_medical_dept: Whether the diagnosis was made by a non-endocrinology medical department.

Register variables (with descriptions) required for the osdc algorithm.

Description

Usage

registers()

Value

A list of the registers and variables required by osdc. Each list item contains the official Danish name of the register, the start year, the end year, and the variables with their descriptions. Each variable is described by 4 items:

name: The official name of the variable found in the register.
danish_description: The official Danish description of the variable.
english_description: The translated English description of the variable.
data_type: The data type of the variable, e.g. "character".

Source

Many of the details within the registers() metadata come from the full official list of registers from Statistics Denmark (DST): https://www.dst.dk/extranet/forskningvariabellister/Oversigt%20over%20registre.html

Examples

registers()
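
A compact way to inspect the metadata is to list the register names and then look at the top level of one entry. This sketch assumes that "bef" is one of the names in the returned list; if it is not, str() simply prints NULL.

```r
library(osdc)

regs <- registers()

# Which registers does osdc need?
names(regs)

# Top-level structure of one register's metadata.
str(regs[["bef"]], max.level = 1)
```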

Simulate a fake data frame of one or more Danish registers

Description

Simulate a fake data frame of one or more Danish registers

Usage

simulate_registers(registers, n = 1000)

Arguments

registers

The name(s) of the register(s) you want to simulate, as a character vector.

n

The number of rows to simulate for the resulting register.

Value

A named list of simulated register data, where each element is a tibble::tibble().

Examples

simulate_registers(c("bef", "sysi"))
simulate_registers("bef")
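
The n argument and the named return value can be combined as below. This is a small sketch that only inspects the result; it makes no claim about what the simulated values look like.

```r
library(osdc)

# Simulate two registers with fewer rows than the default of 1000.
sim <- simulate_registers(c("bef", "sysi"), n = 100)

# Elements are accessed by register name.
names(sim)

# Number of rows simulated for each register.
vapply(sim, nrow, integer(1))
```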
