Introduction to DigestiveDataSets

library(DigestiveDataSets)
library(ggplot2)
library(dplyr)
#> 
#> Adjuntando el paquete: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

Introduction

The DigestiveDataSets package offers a comprehensive and curated collection of datasets focused on the digestive system, including the stomach, intestines, liver, pancreas, and related disorders. This package encompasses a variety of data types such as clinical trials, observational studies, experimental datasets, cohort data, and case series, providing broad coverage of gastrointestinal diseases.

Included datasets span a range of conditions such as gastritis, ulcers, pancreatitis, liver cirrhosis, colorectal diseases, colon cancer, Helicobacter pylori infection, irritable bowel syndrome, intestinal infections, and post-surgical outcomes. These datasets serve multiple purposes in education, clinical practice, and biomedical research, and are particularly valuable in fields such as gastroenterology, public health, and epidemiology.

Dataset Suffixes

Each dataset in the DigestiveDataSets package uses a suffix to denote the type of R object:

Below are selected example datasets included in the DigestiveDataSets package:

Data Visualization with DigestiveDataSets Data

Digestive Cancer Survival Times



# Minimal data processing WITHOUT tidyr
digestive_cancer_survival_df %>%
  summarise(
    Stomach = sum(stomach, na.rm = TRUE),
    Colon = sum(colon, na.rm = TRUE)
  ) %>% 
  # Convert to plot-ready format WITHOUT tidyr
  {data.frame(
    Cancer = names(.),
    Cases = unlist(., use.names = FALSE)
  )} %>% 
  # Plot
  ggplot(aes(x = Cancer, y = Cases)) +
  geom_col(fill = c("#e63946", "#1d3557")) +
  labs(title = "Stomach vs Colon Cancer Cases") +
  theme_minimal()

Campylobacter Infections Time Series.


# Convert ts to numeric vector and numeric time
campy_df <- data.frame(
  infections = as.numeric(campylobacter_infections_ts),
  time = as.numeric(time(campylobacter_infections_ts))
)

# Plot without warning by using numeric 'time'
ggplot(campy_df, aes(x = time, y = infections)) +
  geom_line(color = "steelblue") +
  geom_point(color = "steelblue") +
  labs(
    title = "Campylobacter Infections Over Time",
    x = "Time (year.fraction)",
    y = "Number of Infections"
  ) +
  theme_minimal()

Cholera Daily Deaths in England, 1849.


ggplot(cholera_deaths_1849_tbl_df, aes(x = date, y = deaths, color = cause_of_death)) +
  geom_line() +
  labs(
    title = "Cholera Deaths Over Time in 1849",
    x = "Date",
    y = "Number of Deaths",
    color = "Cause of Death"
  ) +
  theme_minimal()

Conclusion

The DigestiveDataSets package offers a comprehensive, well-curated collection of datasets focused on the digestive system and related diseases. By including clinical trials, observational studies, cohort data, and experimental datasets covering a broad spectrum of gastrointestinal conditions, this package supports a wide range of applications in research, education, and clinical practice.

For detailed information and full documentation of each dataset, please refer to the reference manual and help files included within the package.