DDESONN vs Keras — 1000-Seed Summary — Heart Failure

Overview

This vignette summarizes DDESONN results across 1000 randomized seeds (two separate 500-seed runs) and compares them against a Keras benchmark summary stored in an Excel workbook bundled with the package.

The purpose of this benchmark is not to showcase a single favorable run. Instead, it evaluates distributional behavior across many random initializations, with emphasis on mean accuracy, seed-to-seed dispersion (standard deviation), and worst-case (minimum) performance.

In this context, stronger stability across seeds is important because it indicates that the training procedure is less sensitive to random initialization and is therefore more dependable at scale.

Where the demo data lives

The four RDS artifacts included with the package are stored under:

inst/extdata/heart_failure_runs/
├─ run1/
│  ├─ SingleRun_Train_Acc_Val_Metrics_500_seeds_20251025.rds
│  └─ SingleRun_Test_Metrics_500_seeds_20251025.rds
└─ run2/
   ├─ SingleRun_Train_Acc_Val_Metrics_500_seeds_20251026.rds
   └─ SingleRun_Test_Metrics_500_seeds_20251026.rds

Each folder represents one 500-seed run performed locally; together they form the 1000-seed composite.

Motivation and comparison philosophy

This benchmark addresses a focused research question:

Can a fully R-native, from-first-principles neural network implementation achieve competitive statistical stability relative to an established deep-learning framework under repeated randomized initialization?

The Keras comparison is included as a reference benchmark, not as an implementation template. DDESONN was built independently from scratch and was not derived from Keras source code.

Load DDESONN runs and build the summary

suppressPackageStartupMessages({
  library(dplyr)
  library(tibble)
  library(knitr)
})

if (!requireNamespace("DDESONN", quietly = TRUE)) {
  message("DDESONN not installed in this build session; skipping evaluation.")
  knitr::opts_chunk$set(eval = FALSE)
}
.render_tbl <- function(x, title = NULL, digits = 4) {
  if (requireNamespace("DDESONN", quietly = TRUE) &&
      exists("ddesonn_viewTables", envir = asNamespace("DDESONN"), inherits = FALSE)) {
    get("ddesonn_viewTables", envir = asNamespace("DDESONN"))(x, title = title)
  } else {
    if (!is.null(title)) cat("\n\n###", title, "\n\n")
    knitr::kable(x, digits = digits, format = "html")
  }
}
heart_failure_root <- system.file("extdata", "heart_failure_runs", package = "DDESONN")

if (!nzchar(heart_failure_root)) {
  # Fallback when building from source before installation
  heart_failure_root <- file.path("..", "inst", "extdata", "heart_failure_runs")
}

stopifnot(dir.exists(heart_failure_root))

train_run1_path <- file.path(
  heart_failure_root, "run1",
  "SingleRun_Train_Acc_Val_Metrics_500_seeds_20251025.rds"
)
test_run1_path <- file.path(
  heart_failure_root, "run1",
  "SingleRun_Test_Metrics_500_seeds_20251025.rds"
)
train_run2_path <- file.path(
  heart_failure_root, "run2",
  "SingleRun_Train_Acc_Val_Metrics_500_seeds_20251026.rds"
)
test_run2_path <- file.path(
  heart_failure_root, "run2",
  "SingleRun_Test_Metrics_500_seeds_20251026.rds"
)

stopifnot(
  file.exists(train_run1_path),
  file.exists(test_run1_path),
  file.exists(train_run2_path),
  file.exists(test_run2_path)
)

train_run1 <- readRDS(train_run1_path)
test_run1  <- readRDS(test_run1_path)
train_run2 <- readRDS(train_run2_path)
test_run2  <- readRDS(test_run2_path)

train_all <- dplyr::bind_rows(train_run1, train_run2)
test_all  <- dplyr::bind_rows(test_run1, test_run2)

train_seed <- train_all %>%
  group_by(seed) %>%
  slice_max(order_by = best_val_acc, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  transmute(
    seed,
    train_acc = best_train_acc,
    val_acc   = best_val_acc
  )

test_seed <- test_all %>%
  group_by(seed) %>%
  slice_max(order_by = accuracy, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  transmute(
    seed,
    test_acc = accuracy
  )

merged <- inner_join(train_seed, test_seed, by = "seed") %>%
  arrange(seed)
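The per-seed reduction above can be illustrated on a toy tibble (hypothetical data, not the bundled results), showing how `slice_max()` keeps exactly one best-validation row per seed:

```r
library(dplyr)
library(tibble)

# Hypothetical per-seed log with two candidate rows per seed
toy <- tibble(
  seed           = c(1, 1, 2, 2),
  best_val_acc   = c(0.90, 0.95, 0.97, 0.93),
  best_train_acc = c(0.88, 0.92, 0.96, 0.91)
)

# Keep the single best-validation row for each seed,
# mirroring the slice_max() step applied to the real runs
best_per_seed <- toy %>%
  group_by(seed) %>%
  slice_max(order_by = best_val_acc, n = 1, with_ties = FALSE) %>%
  ungroup()

best_per_seed$best_val_acc  # -> 0.95 0.97
```

`with_ties = FALSE` guarantees one row per seed even when two epochs tie on validation accuracy, so the downstream join stays one-to-one.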

summarize_column <- function(x) {
  pct <- function(p) stats::quantile(x, probs = p, names = FALSE, type = 7)
  data.frame(
    count = length(x),
    mean  = mean(x),
    std   = sd(x),
    min   = min(x),
    `25%` = pct(0.25),
    `50%` = pct(0.50),
    `75%` = pct(0.75),
    max   = max(x),
    check.names = FALSE
  )
}
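As a quick sanity check, `summarize_column()` applied to `1:5` should reproduce the textbook type-7 quartiles (the helper is restated here so the chunk runs standalone):

```r
# Same helper as above, restated so this chunk is self-contained
summarize_column <- function(x) {
  pct <- function(p) stats::quantile(x, probs = p, names = FALSE, type = 7)
  data.frame(
    count = length(x), mean = mean(x), std = sd(x),
    min = min(x), `25%` = pct(0.25), `50%` = pct(0.50),
    `75%` = pct(0.75), max = max(x),
    check.names = FALSE
  )
}

s <- summarize_column(1:5)
# Type-7 quantiles of 1..5 are exactly 2, 3, 4
s[, c("25%", "50%", "75%")]
```

Type 7 (R's default) interpolates between order statistics, so on a small evenly spaced vector the quartiles land exactly on data points, which makes the helper easy to verify by hand.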

summary_train <- summarize_column(merged$train_acc)
summary_val   <- summarize_column(merged$val_acc)
summary_test  <- summarize_column(merged$test_acc)

summary_all <- data.frame(
  stat = c("count","mean","std","min","25%","50%","75%","max"),
  train_acc = unlist(summary_train[1, ]),
  val_acc   = unlist(summary_val[1, ]),
  test_acc  = unlist(summary_test[1, ]),
  check.names = FALSE
)

round4 <- function(x) if (is.numeric(x)) round(x, 4) else x
pretty_summary <- as.data.frame(lapply(summary_all, round4))

.render_tbl(
  pretty_summary,
  title = "DDESONN — 1000-seed summary (train/val/test)"
)
stat    train_acc   val_acc     test_acc
count   1000.0000   1000.0000   1000.0000
mean       0.9928      0.9992      0.9992
std        0.0014      0.0013      0.0013
min        0.9854      0.9893      0.9920
25%        0.9920      0.9987      0.9987
50%        0.9929      1.0000      1.0000
75%        0.9937      1.0000      1.0000
max        0.9963      1.0000      1.0000

Keras parity (Excel, Sheet 2)

Keras parity results are stored in an Excel workbook included with the package under:

inst/scripts/vsKeras/1000SEEDSRESULTSvsKeras/1000seedsKeras.xlsx

The file is accessed programmatically using system.file() so the path remains CRAN-safe and cross-platform.

if (!requireNamespace("readxl", quietly = TRUE)) {
  message("Skipping keras-summary chunk: 'readxl' not installed.")
} else {
  keras_path <- system.file(
    "scripts", "vsKeras", "1000SEEDSRESULTSvsKeras", "1000seedsKeras.xlsx",
    package = "DDESONN"
  )

  if (nzchar(keras_path) && file.exists(keras_path)) {
    keras_stats <- readxl::read_excel(keras_path, sheet = 2)
    .render_tbl(
      keras_stats,
      title = "Keras — 1000-seed summary (Sheet 2)"
    )
  } else {
    cat("Keras Excel not found in installed package.\n")
  }
}
stat    seed        train_loss    train_acc     val_loss      val_acc       val_auc       val_auprc     test_loss     test_acc      test_auc      test_auprc
count   1000.0000   1000.0000000  1000.0000000  1000.0000000  1000.0000000  1000.0000000  1000.0000000  1000.0000000  1000.0000000  1000.0000000  1000.0000000
mean     500.0999      0.1285539     0.9853164     0.0923288     0.9943097     0.9981954     0.9976820     0.0801902     0.9968086     0.9992427     0.9989122
std      288.9524      0.2685126     0.0031003     0.1312915     0.0048215     0.0046459     0.0061429     0.1931197     0.0035695     0.0031493     0.0052804
min        1.0000      0.0810060     0.9705710     0.0511130     0.9653330     0.9612140     0.9031360     0.0498250     0.9786670     0.9691980     0.8900960
25%      250.0000      0.0951690     0.9837140     0.0653810     0.9920000     0.9989820     0.9983310     0.0591380     0.9946670     0.9999330     0.9998530
50%      500.0000      0.1022350     0.9857140     0.0709260     0.9960000     0.9997410     0.9994730     0.0629980     0.9973330     1.0000000     1.0000000
75%      750.0000      0.1121150     0.9874290     0.0812150     0.9986670     0.9999420     0.9998730     0.0697250     1.0000000     1.0000000     1.0000000
max     1000.0000      6.0387850     0.9925710     2.1582980     1.0000000     1.0000000     1.0000000     5.5763540     1.0000000     1.0000000     1.0000000

🔬 Benchmark results across 1000 seeds

Across 1000 random neural network initializations, DDESONN demonstrated stronger stability than the Keras benchmark model on this heart-failure task.

benchmark_results <- data.frame(
  Metric = c(
    "Mean Test Accuracy",
    "Standard Deviation",
    "Minimum Test Accuracy",
    "Maximum Test Accuracy"
  ),
  DDESONN = c("≈ 99.92%", "≈ 0.0013", "≈ 99.20%", "100%"),
  Keras   = c("≈ 99.68%", "≈ 0.0036", "≈ 97.87%", "100%"),
  check.names = FALSE
)

.render_tbl(
  benchmark_results,
  title = "Benchmark results across 1000 seeds"
)
Metric                  DDESONN    Keras
Mean Test Accuracy      ≈ 99.92%   ≈ 99.68%
Standard Deviation      ≈ 0.0013   ≈ 0.0036
Minimum Test Accuracy   ≈ 99.20%   ≈ 97.87%
Maximum Test Accuracy   100%       100%
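To put the dispersion gap in perspective, the ratio of the two rounded standard deviations from the table above works out to roughly 2.8:

```r
# Rounded standard deviations of test accuracy from the table above
sd_ddesonn <- 0.0013
sd_keras   <- 0.0036

# Keras shows roughly 2.8x the seed-to-seed spread of DDESONN
ratio <- sd_keras / sd_ddesonn
round(ratio, 1)  # -> 2.8
```

This is illustrative arithmetic on the rounded values only; recomputing from the raw per-seed results would shift the ratio slightly.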

These results suggest that DDESONN achieved a slightly higher mean test accuracy, roughly one-third the seed-to-seed standard deviation, and a noticeably higher minimum test accuracy than the Keras benchmark.

This is important because lower variance implies the model is less sensitive to randomized initialization and more dependable across repeated training runs.

Why this matters for large-scale projects

Enterprise machine learning pipelines

In large corporate environments, teams may train hundreds or thousands of models across changing datasets, validation windows, and deployment cycles. A lower-variance model reduces the need for repeated retraining simply to obtain a “good seed,” which lowers compute cost and improves operational predictability.

Trading and financial systems

In trading, portfolio analytics, execution modeling, or risk forecasting, model instability can create inconsistent outputs across retrains. A model that is more stable across seeds can improve confidence in the consistency of retrained outputs across those workflows.

This does not guarantee trading profitability, but it does support stronger engineering reliability and more reproducible model behavior.

Healthcare and regulated environments

In healthcare and other regulated domains, reproducibility matters because stakeholders need confidence that retraining the same workflow will not produce materially unstable outcomes. Lower dispersion across seeds can help support validation, governance, and auditability.

Aerospace and autonomous systems

In mission-critical environments such as autonomous control or space-related analytics, reproducibility and reliability are essential. More stable training behavior can be valuable when models need to be trusted under constrained or high-stakes deployment settings.

Reproducibility notes

These results aggregate two independent 500-seed runs performed locally.

A master seed was not set for those original runs, so the exact seed draw cannot be regenerated from a single master value; the per-seed results are instead preserved in the bundled RDS artifacts.

Keras raw and summary outputs are compiled in:

inst/scripts/vsKeras/1000SEEDSRESULTSvsKeras/1000seedsKeras.xlsx

Distributed execution (scaling note)

The results shown here were computed locally.

For large-scale experiments involving hundreds or thousands of seeds, DDESONN can be executed in distributed environments to reduce wall-clock time significantly. Distributed orchestration and development-stage scaling scripts are maintained in the GitHub repository and are intentionally excluded from the CRAN package so this vignette remains focused on validated results and benchmark methodology.