---
title: "Attributes charts: p, np, c, u"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Attributes charts: p, np, c, u}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>",
                      fig.width = 7, fig.height = 4.2)
```

```{r setup, message = FALSE}
library(shewhartr)
```

When the quality characteristic is binary (defective / non-defective) or a count of defects per unit, classical variables charts are the wrong tool. Counts and proportions live on bounded supports and follow Binomial / Poisson distributions; pretending they are normal makes the chart limits wrong, sometimes badly.

| Data                                      | Distribution | Chart           |
|-------------------------------------------|--------------|-----------------|
| Proportion defective (variable n)         | Binomial     | `shewhart_p()`  |
| Number defective (constant n)             | Binomial     | `shewhart_np()` |
| Defect count per unit (constant exposure) | Poisson      | `shewhart_c()`  |
| Defect count per unit (variable exposure) | Poisson      | `shewhart_u()`  |

## p chart with variable n

`claims_p` records 30 days of insurance-claim quality control. Each day, a variable number of claims (`n`) is processed and a count of errors (`defects`) is observed.

```{r}
fit <- shewhart_p(claims_p, defects = defects, n = n, index = day)
broom::tidy(fit)
```

Because `n` varies day-to-day, the limits also vary day-to-day:

```{r}
broom::augment(fit) |> head(10)
```

The default `limits = "3sigma"` uses the normal approximation $\bar p \pm 3\sqrt{\bar p (1 - \bar p)/n_i}$. This is fine when $n_i \bar p \gtrsim 5$ and $n_i (1-\bar p) \gtrsim 5$. For small $n$ or extreme $\bar p$, switch to exact binomial limits:

```{r, eval = FALSE}
shewhart_p(claims_p, defects = defects, n = n, index = day,
           limits = "binomial")
```

## c chart and Poisson honesty

`pcb_solder` has 50 PCBs and a mean defect count of about 6.
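
For orientation, the textbook 3-sigma c chart rests on the Poisson property that the variance equals the mean, so the estimated standard deviation of a count is $\sqrt{\bar c}$. A sketch of the standard limits follows; truncating the lower limit at zero is the usual textbook convention and is an assumption here, not a confirmed detail of `shewhart_c()`:

$$
\mathrm{CL} = \bar c, \qquad
\mathrm{UCL} = \bar c + 3\sqrt{\bar c}, \qquad
\mathrm{LCL} = \max\left(0,\ \bar c - 3\sqrt{\bar c}\right)
$$

With $\bar c = 6$ (an illustrative round value), $\sqrt{6} \approx 2.449$, so $\mathrm{UCL} \approx 6 + 7.35 = 13.35$.
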
The default 3-sigma c-chart works fine here:

```{r}
fit_c <- shewhart_c(pcb_solder, defects = defects, index = board)
broom::tidy(fit_c)
```

But if `c_bar` were small (say 2 or 3), the lower limit under the normal approximation would be negative, which makes no sense for a count. The package warns when this is likely:

```{r}
set.seed(1)  # fix the seed so the simulated counts are reproducible
small_means <- data.frame(unit = 1:50, defects = rpois(50, lambda = 2))
# The low-mean warning is suppressed here to keep the rendered output short.
suppressWarnings(
  fit_low <- shewhart_c(small_means, defects = defects, index = unit)
)
broom::tidy(fit_low)
```

For low-mean Poisson processes, use exact quantile limits:

```{r}
fit_low_exact <- shewhart_c(small_means, defects = defects, index = unit,
                            limits = "poisson")
broom::tidy(fit_low_exact)
```

George Box's advice, *don't transform if you can model the right distribution*, applies. The exact Poisson limits are the $0.00135$ and $0.99865$ quantiles of $\mathrm{Poisson}(\bar c)$. They target the same nominal tail probabilities as classical 3-sigma limits, but without relying on the normal approximation.

## np chart for constant n

When the subgroup size is constant, the np chart plots the *count* of defectives rather than the proportion. This is convenient for direct interpretation when `n` is a round number:

```{r}
set.seed(1)  # reproducible simulated counts
fit_np <- shewhart_np(
  data.frame(day = 1:30, defects = rbinom(30, size = 200, prob = 0.04)),
  defects = defects, n = 200, index = day
)
broom::tidy(fit_np)
```

## u chart for variable exposure

When the inspection size differs between units (e.g. fabric rolls of different length, machine-hours of different duration), the right chart is the u chart, which plots defects per unit of exposure:

```{r}
set.seed(1)
# Draw the exposure first so the defect counts actually depend on it.
area <- runif(25, 0.5, 1.5)
df_u <- data.frame(
  roll    = 1:25,
  defects = rpois(25, lambda = 4 * area),  # about 4 defects per unit of exposure
  m2      = area
)
fit_u <- shewhart_u(df_u, defects = defects, exposure = m2, index = roll)
broom::tidy(fit_u)
```

## References

- Montgomery, D. C. (2019). *Introduction to Statistical Quality Control* (8th ed.). Wiley. Chapter 7.
- Ryan, T. P. (2011). *Statistical Methods for Quality Improvement* (3rd ed.). Wiley.
  (On the inadequacy of 3-sigma limits for low-mean Poisson counts.)
- Box, G. E. P., Hunter, W. G., & Hunter, J. S. (2005). *Statistics for Experimenters* (2nd ed.). Wiley.