| Version: | 0.1.0 |
| Title: | Inattention Detection Pipeline for Psychophysical Tasks |
| Description: | Three-stage pipeline for detecting inattention episodes in long psychophysical tasks (200+ trials). Uses accuracy residuals and response pattern signals to locate, sharpen, and formally test candidate inattention regions at trial-level precision. |
| License: | GPL (≥ 3) |
| URL: | https://github.com/pawlenartowicz/inough |
| BugReports: | https://github.com/pawlenartowicz/inough/issues |
| Encoding: | UTF-8 |
| Language: | en-US |
| Depends: | R (≥ 3.5.0) |
| LazyData: | true |
| RoxygenNote: | 7.3.3 |
| Imports: | lme4, ggplot2, patchwork, rlang, jsonlite |
| Suggests: | testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | no |
| Packaged: | 2026-05-08 21:07:34 UTC; plenartowicz |
| Author: | Pawel Lenartowicz |
| Maintainer: | Pawel Lenartowicz <pawellenartowicz@europe.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-13 19:10:02 UTC |
Apply spurious-start accuracy heuristic
Description
For each chunk, checks whether the first n trials have suspiciously
high accuracy (>= k correct). If so, marks the chunk and shifts
the test start past those trials. Chunks left with fewer than
min_left trials are dropped.
Usage
apply_spurious_heuristic(chunks, trial, heuristics)
Arguments
chunks |
Data frame with |
trial |
Trial data frame (needs |
heuristics |
|
Value
Updated chunks data frame with added columns: test_start,
spurious_start.
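The heuristic for a single chunk can be sketched in a few lines of R. Names here are illustrative, and the explicit threshold k = 5 is an assumption (the package derives the threshold internally when spurious_k is NULL):

```r
# Illustrative sketch of the spurious-start check for one chunk.
# `acc` is the chunk's 0/1 accuracy vector in trial order; k = 5 is
# an assumed threshold (inough derives it when spurious_k = NULL).
check_spurious_start <- function(acc, n = 6, k = 5, min_left = 7) {
  head_correct <- sum(acc[seq_len(min(n, length(acc)))])
  spurious <- head_correct >= k              # suspiciously accurate start?
  test_start <- if (spurious) n + 1 else 1   # shift test past those trials
  drop <- (length(acc) - test_start + 1) < min_left
  list(spurious_start = spurious, test_start = test_start, drop = drop)
}

check_spurious_start(c(1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0))
# spurious start detected; testing begins at trial 7 of the chunk
```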
Participant-level bail-out
Description
Flags entire sessions where trial-level analysis is unreliable: response sequence too stereotyped (low LZ) or accuracy indistinguishable from chance.
Usage
bailout(participant, lz_threshold)
Arguments
participant |
Data frame with |
lz_threshold |
LZ threshold (from |
Value
Data frame with id, reason, lz_value,
accuracy for bailed-out participants (zero rows if none).
Extract accuracy residuals via probit GLMM
Description
Fits a probit GLMM (lme4::glmer) or probit GLM (stats::glm)
on accuracy and returns Pearson residuals.
Usage
extract_accuracy(df, formula, has_random = TRUE)
Arguments
df |
Data frame with |
formula |
Full model formula (built by |
has_random |
Logical; whether the formula includes random effects. |
Value
A list with $residuals (Pearson) and $model.
Extract response bias signals
Description
Computes lag-1 response repetition indicator (per-trial) and normalized Lempel-Ziv complexity of the response sequence (per-participant).
Usage
extract_bias(df)
Arguments
df |
Data frame with columns |
Details
LZ is normalized by a permutation null distribution: 400 random 50/50 binary sequences at the modal participant length, giving expected value ~1.0 for random sequences regardless of sample size.
Value
A list with:
- trial: Data frame with column resp_lag1 (+1 = same as previous, -1 = different, 0 = first trial).
- participant: Data frame with columns id and lz (permutation-normalized LZ76 complexity).
Extract flagged trials
Description
Returns a data frame of all trials flagged by the detection pipeline,
suitable for downstream filtering via dplyr::anti_join or similar.
Usage
flags(x)
Arguments
x |
An |
Value
Data frame with columns id, trial_idx,
flag_type ("bailout" or "chunk"), chunk_id
(integer, NA for bailout), p_adj (numeric, NA for
bailout).
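For example, the flagged trials can be dropped from the raw data with an anti-join. The sketch below uses base R on toy frames shaped like flags() output (dplyr::anti_join works the same way; the column names on the raw side are assumptions):

```r
# Toy raw data and a flags()-shaped frame; filtering drops the
# flagged trials. Raw-side column names are assumptions.
raw <- data.frame(id = "P01", trial_idx = 1:5, correct = c(1, 1, 0, 1, 1))
flagged <- data.frame(id = "P01", trial_idx = c(3, 4))  # as from flags(det)

keep <- !(paste(raw$id, raw$trial_idx) %in%
            paste(flagged$id, flagged$trial_idx))
clean <- raw[keep, ]
nrow(clean)  # 3 unflagged trials remain
```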
Pipeline tuning parameters
Description
Pipeline tuning parameters
Usage
inough_control(
lz_threshold = 0.2,
window_size = 3,
sd_threshold = 2,
window_weight = "uniform",
min_chunk = 6,
comparison = "clean"
)
Arguments
lz_threshold |
LZ complexity below this triggers bail-out (default 0.2). |
window_size |
Half-width of the rolling window; the total window is 2*window_size + 1 trials (default 3, giving 7-trial windows). |
sd_threshold |
Number of SDs above chance that |
window_weight |
Weighting scheme for the rolling window:
|
min_chunk |
Minimum chunk length in trials to retain after merging (default 6). |
comparison |
t-test comparison set: |
Value
An inough_control object with a pre-computed
screening_threshold based on sd_threshold and the window.
Detect inattention episodes
Description
Three-stage pipeline: (1) dual-track screening with chunk filtering, (2) heuristic chunk refinement, (3) formal t-test with FDR correction. Participants with extremely stereotyped responses or chance-level accuracy are bailed out first.
Usage
inough_detect(
signals,
fdr_alpha = 0.2,
control = inough_control(),
heuristics = inough_heuristics()
)
Arguments
signals |
An |
fdr_alpha |
FDR significance level for BH correction (default 0.2). |
control |
An |
heuristics |
An |
Value
An inough_detected object.
Post-detection heuristics
Description
Configures optional refinements applied after chunk screening: boundary extension mode and spurious-accuracy trimming.
Usage
inough_heuristics(
boundary_mode = "heuristic",
spurious = TRUE,
spurious_n = 6L,
spurious_k = NULL,
min_left = 7L
)
Arguments
boundary_mode |
How to extend chunk boundaries after detection:
|
spurious |
Logical; enable spurious-start accuracy trimming
(default |
spurious_n |
Number of trials at chunk start to inspect (default 6). |
spurious_k |
Explicit threshold: flag if |
min_left |
Minimum trials remaining after spurious trimming (default 7). Chunks shorter than this after trimming are dropped. |
Value
An inough_heuristics object.
Extract inattention signals
Description
Fits a probit GLMM on accuracy and computes response bias indicators.
This is the first step in the inough pipeline: call inough_signals,
then pass the result to inough_detect.
Usage
inough_signals(
df,
correct,
response,
id = "ID",
learning_effect = TRUE,
participant_effect = TRUE,
trial_transform = "sqrt"
)
Arguments
df |
A data frame with rows ordered by trial within each participant. |
correct |
Formula. LHS names the accuracy column (0/1 integer). RHS
names design predictors that explain correctness (e.g.,
|
response |
Formula identifying the response column. Use a two-sided
formula where the RHS names the column (e.g., |
id |
String naming the participant identifier column (default
|
learning_effect |
Logical. If |
participant_effect |
Logical. If |
trial_transform |
Transformation applied to trial index before rescaling
to [-1, 1]. One of |
Value
An inough_signals object.
Normalized Lempel-Ziv Complexity (LZ76)
Description
Computes the normalized LZ76 complexity of a binary sequence. Higher values indicate more random/complex sequences; lower values indicate more predictable/repetitive patterns.
Usage
lz_complexity(x)
Arguments
x |
Integer vector of 0s and 1s. |
Value
Numeric scalar in [0, 1]. Normalized complexity where 1 = maximally complex (random) and values near 0 = highly predictable.
Examples
lz_complexity(c(0, 0, 0, 0, 0)) # low
lz_complexity(c(0, 1, 0, 1, 0, 1)) # low-medium
lz_complexity(sample(0:1, 100, TRUE)) # near 1
Sort and merge overlapping regions
Description
Sort and merge overlapping regions
Usage
merge_regions(regions)
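A minimal sketch of the merge, assuming regions is a data frame with at least one row and start/end columns; whether adjacent (touching, not just overlapping) regions are joined is an assumption of this sketch:

```r
# Sort regions by start, then fold each region into the previous one
# when it overlaps or touches it (the adjacency rule is an assumption).
merge_regions_sketch <- function(regions) {
  regions <- regions[order(regions$start), , drop = FALSE]
  out <- regions[1, , drop = FALSE]          # assumes >= 1 region
  for (i in seq_len(nrow(regions))[-1]) {
    j <- nrow(out)
    if (regions$start[i] <= out$end[j] + 1) {        # overlaps or touches
      out$end[j] <- max(out$end[j], regions$end[i])
    } else {
      out <- rbind(out, regions[i, , drop = FALSE])  # start a new region
    }
  }
  out
}

merge_regions_sketch(data.frame(start = c(10, 1, 4), end = c(12, 5, 8)))
# two rows: 1-8 and 10-12
```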
Plot per-participant diagnostic panels
Description
Produces a stacked 4-panel visualization: accuracy strip, accuracy residuals, lag-1 response, and dual-track z-scores. Flagged chunks are highlighted as red shaded regions.
Usage
## S3 method for class 'inough_detected'
plot(x, id, ...)
Arguments
x |
An |
id |
Character scalar — participant ID to plot. |
... |
Ignored. |
Value
A patchwork object (invisibly).
Generate interactive HTML report
Description
Creates a self-contained HTML file with a participant browser, diagnostic plots, chunk details, and summary statistics.
Usage
report(x, ...)
## S3 method for class 'inough_detected'
report(x, file = NULL, custom_plot = NULL, ...)
Arguments
x |
An |
... |
Arguments passed to methods. |
file |
Output file path. If |
custom_plot |
Optional per-trial variable to show as an extra panel in the participant view. A list with three fields:
|
Value
Invisibly returns the file path.
Contiguous TRUE regions to start/end data.frame
Description
Contiguous TRUE regions to start/end data.frame
Usage
rle_regions(above, trial_idx)
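The conversion can be sketched with base R's rle(), mapping runs of TRUE onto the supplied trial indices (function and column names are illustrative):

```r
# Convert a logical "above threshold" vector into start/end rows,
# keeping only the runs of TRUE.
rle_regions_sketch <- function(above, trial_idx) {
  r <- rle(above)
  ends <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  keep <- r$values                       # TRUE runs only
  data.frame(start = trial_idx[starts[keep]], end = trial_idx[ends[keep]])
}

rle_regions_sketch(c(FALSE, TRUE, TRUE, FALSE, TRUE), trial_idx = 1:5)
# regions 2-3 and 5-5
```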
Robust z-score (median / MAD); returns NULL if MAD = 0
Description
Robust z-score (median / MAD); returns NULL if MAD = 0
Usage
robust_zscore(x)
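A sketch of the idea: center by the median, scale by the MAD, and bail with NULL when the MAD is zero. Whether inough applies the 1.4826 consistency constant (as stats::mad does by default) is an assumption of this sketch:

```r
# Robust z-score: median-centered, MAD-scaled. Returns NULL for a
# degenerate (constant) signal, where no usable scale exists.
robust_zscore_sketch <- function(x) {
  s <- stats::mad(x)                 # includes the 1.4826 constant
  if (s == 0) return(NULL)
  (x - stats::median(x)) / s
}

robust_zscore_sketch(rep(5, 10))     # NULL: MAD is zero
```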
Centered weighted rolling mean
Description
weights is the full weight vector of length 2*k+1 (center at
position k+1). At the edges the weight vector is cropped and
renormalized so the SD formula stays valid.
Usage
rolling_wmean(x, weights)
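The cropping-and-renormalizing behavior described above can be sketched as follows (a plain-R illustration, not the package's implementation):

```r
# Centered weighted rolling mean: `weights` has length 2*k + 1; at the
# edges the weight vector is cropped to the valid window and
# renormalized so the result stays a proper weighted mean.
rolling_wmean_sketch <- function(x, weights) {
  n <- length(x)
  k <- (length(weights) - 1) / 2
  vapply(seq_len(n), function(i) {
    lo <- max(1, i - k); hi <- min(n, i + k)
    w <- weights[(lo - i + k + 1):(hi - i + k + 1)]  # crop at the edges
    sum(x[lo:hi] * w) / sum(w)                        # renormalize
  }, numeric(1))
}

rolling_wmean_sketch(1:5, rep(1, 3))
# 1.5 2.0 3.0 4.0 4.5 (uniform 3-trial window; edges average 2 values)
```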
Stage 1: Candidate Screening
Description
Flags regions where the absolute rolling mean of the lag-1 response
exceeds the per-participant threshold: |roll_resp| > threshold.
Catches both repetition (positive) and switching (negative) stereotypy
with a single track. Overlapping regions are merged; chunks shorter than
min_chunk are discarded. Boundary extension mode is controlled by
heuristics$boundary_mode.
Usage
screen(trial, participant, bail_ids, control, heuristics)
Arguments
trial |
Trial data frame from |
participant |
Participant data frame from |
bail_ids |
Character vector of bailed-out participant IDs. |
control |
|
heuristics |
|
Value
A list with $candidates (raw threshold crossings) and
$chunks (after merge + min-length filter + boundary extension).
Example dual-task data with diverse inattention profiles
Description
A minimal anonymized subset of the Dual Task (Gabor orientation under motor interference) dataset, intended for demonstrating the inough pipeline. Twenty participants were sampled to span a range of attention profiles: clean performers, two bail-out cases (one for response stereotypy, one for chance-level accuracy), participants with localized inattention chunks, and participants with extended inattention periods.
Usage
task_example
Format
A data frame with one row per trial and the following columns:
- participant: Anonymized participant identifier (factor, P01–P20).
- block: Block index within the session (integer, >= 1; practice block excluded).
- trial: Trial index within the participant's session (integer).
- stim: Stimulus identifier (integer).
- weight: Stimulus weight / contrast (numeric).
- orient: Gabor orientation code (integer).
- cue_type: Cue type code (integer).
- response: Participant's response (integer, two unique values).
- correct: Trial accuracy (integer, 0 or 1).
Details
Participant identifiers have been replaced with arbitrary codes
(P01–P20) and any session timing information has been
removed.
Source
A subset of the Dual Task (s_9) data collected in the
COST/Kraken consciousness study (Krakow site). Participant IDs have
been re-coded for anonymity.
Examples
data(task_example)
head(task_example)
signals <- inough_signals(
task_example,
correct = correct ~ stim + weight + orient + cue_type + block,
response = response ~ response,
id = "participant"
)
det <- inough_detect(signals)
summary(det)
Stage 3: Formal Test
Description
Welch t-test on accuracy residuals (chunk vs comparison set) with
LZ-informed local FDR. When test_start column is present (from
spurious-accuracy trimming), uses it for inside trials while still
excluding the full chunk from the comparison set.
Usage
test_chunks(chunks, trial, participant, fdr_alpha, comparison)
Arguments
chunks |
Data frame from Stage 2 ( |
trial |
Trial data frame from |
participant |
Participant data frame from |
fdr_alpha |
Significance threshold for local FDR (passed down from the
top-level call). A chunk is flagged when |
comparison |
|
Details
Local FDR is computed per chunk as

lfdr = pi_0 * f_0(t) / [pi_0 * f_0(t) + pi_1 * f_1(t)]

where pi_0 = min(1, LZ) is the LZ-informed null prior, f_0 is the
central t density (null), and f_1 is a non-central t density whose
noncentrality parameter reflects the expected accuracy drop to chance.
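The computation can be illustrated numerically with R's dt(). The noncentrality value ncp and the complement prior pi_1 = 1 - pi_0 are assumptions of this sketch, not the package's exact settings:

```r
# Local FDR for one chunk. ncp (expected drop to chance) is an assumed
# illustrative value; the alternative prior is taken as 1 - pi0.
local_fdr_sketch <- function(t_stat, df, lz, ncp = -3) {
  pi0 <- min(1, lz)                    # LZ-informed null prior
  f0  <- dt(t_stat, df)                # central t density (null)
  f1  <- dt(t_stat, df, ncp = ncp)     # non-central t (inattention)
  pi0 * f0 / (pi0 * f0 + (1 - pi0) * f1)
}

local_fdr_sketch(t_stat = -4, df = 30, lz = 0.9)  # strong negative t: small lfdr
local_fdr_sketch(t_stat =  0, df = 30, lz = 0.9)  # t near 0: lfdr near 1
```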
Value
Data frame with id, start, end, t_stat,
df, p_raw, p_adj, lfdr, effect_size,
significant.