---
title: "Getting Started with deprivateR"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with deprivateR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Overview

`deprivateR` provides a unified framework for calculating measures of area-level deprivation in the United States. These measures are commonly used in social determinants of health research to quantify neighborhood disadvantage.

The package supports the following indices:

- **Area Deprivation Index (ADI)** (`"adi"`) - a factor-based measure of socioeconomic deprivation (via `sociome`)
- **Gini Coefficient** (`"gini"`) - a measure of income inequality (via `tidycensus`)
- **Neighborhood Deprivation Index, Messer** (`"ndi_m"`) - a factor-based deprivation measure (via `ndi`)
- **Neighborhood Deprivation Index, Powell-Wiley** (`"ndi_pw"`) - an alternative NDI formulation (via `ndi`)
- **Social Vulnerability Index (SVI)** (`"svi10"`, `"svi14"`, `"svi20"`, `"svi20s"`) - the CDC's composite vulnerability measure, with four methodology variants

Data can be retrieved at the county, census tract, ZCTA5, or ZCTA3 level for years 2010 through 2022.

## Setup

### Installation

The easiest way to install `deprivateR` is from CRAN:

```{r install-cran, eval = FALSE}
install.packages("deprivateR")
```

Alternatively, you can install `deprivateR` from GitHub:

```{r install-gh, eval = FALSE}
# install.packages("remotes")
remotes::install_github("pfizer-opensource/deprivateR")
```

### Census API Key

To download data from the Census Bureau, you need a free API key. You can request one at <https://api.census.gov/data/key_signup.html>. Once you have your key, store it for use with `tidycensus`:

```{r api-key, eval = FALSE}
tidycensus::census_api_key("YOUR_KEY_HERE", install = TRUE)
```

This saves the key to your `.Renviron` file so it is available across sessions.
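
To confirm that the key is visible to R, you can read the environment variable directly. The sketch below assumes the key is stored under `CENSUS_API_KEY`, the variable name `tidycensus` uses, and that your `.Renviron` file lives in your home directory. A key installed during the current session is not picked up until `.Renviron` is re-read or R is restarted.

```{r check-key, eval = FALSE}
# re-read .Renviron so a newly installed key is available without restarting R
# (assumes the file is in your home directory)
readRenviron("~/.Renviron")

# tidycensus stores the key in the CENSUS_API_KEY environment variable
census_key <- Sys.getenv("CENSUS_API_KEY")

# nzchar() is FALSE for the empty string returned when the variable is unset
if (nzchar(census_key)) {
  message("Census API key found.")
} else {
  message("No Census API key found; see tidycensus::census_api_key().")
}
```

If no key is found, the sample data described in the next section still works, because it does not contact the Census API.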
## Quick Start with Sample Data

The package includes sample data so you can explore functionality without an API key. The sample data contains 2022 ACS 5-year estimates for all 115 counties in Missouri.

```{r load}
library(deprivateR)
```

### Load and Calculate an Index

```{r sample-calc}
# load sample data for the Messer NDI
ndi_data <- dep_sample_data(index = "ndi_m")

# calculate the index
ndi_results <- dep_calc_index(
  ndi_data,
  geography = "county",
  index = "ndi_m",
  year = 2022,
  return_percentiles = TRUE
)

# view the results
ndi_results[, c("GEOID", "NAME", "NDI_M")]
```

The `NDI_M` column contains the calculated Neighborhood Deprivation Index scores. Higher values indicate greater deprivation.

### Quantiles for Analysis

To use deprivation scores as categorical variables in statistical models, you can split them into quantiles:

```{r quantiles}
# split NDI into quartiles
ndi_results <- dep_quantiles(
  ndi_results,
  source_var = NDI_M,
  new_var = ndi_quartile,
  n = 4L,
  return = "label"
)

# view the distribution
table(ndi_results$ndi_quartile)
```

### Map Breaks for Visualization

To create choropleth maps, use `dep_map_breaks()` to calculate appropriate classification breaks:

```{r map-breaks}
# calculate Fisher-Jenks breaks with 5 classes
ndi_results <- dep_map_breaks(
  ndi_results,
  var = "NDI_M",
  new_var = "map_class",
  classes = 5,
  style = "fisher"
)

# view the break labels
levels(ndi_results$map_class)
```

You can also specify manual breaks:

```{r manual-breaks}
# define custom break points
my_breaks <- c(
  min(ndi_results$NDI_M, na.rm = TRUE),
  25, 50, 75,
  max(ndi_results$NDI_M, na.rm = TRUE)
)

# apply manual breaks
ndi_results <- dep_map_breaks(
  ndi_results,
  var = "NDI_M",
  new_var = "map_class_manual",
  breaks = my_breaks
)

levels(ndi_results$map_class_manual)
```

## Downloading Data with dep_get_index()

When you have a Census API key configured, `dep_get_index()` handles the full workflow of downloading raw data and computing indices in one step:

```{r get-index, eval = FALSE}
# download and calculate SVI for Missouri tracts
mo_svi <- dep_get_index(
  geography = "tract",
  index = "svi20",
  year = 2020,
  state = "MO"
)
```

### Multiple Indices at Once

You can request multiple indices in a single call:

```{r multi-index, eval = FALSE}
# calculate ADI and Gini together for Missouri counties
mo_multi <- dep_get_index(
  geography = "county",
  index = c("adi", "gini"),
  year = 2022,
  state = "MO"
)
```

### Spatial Output for Mapping

Set `output = "sf"` to get results as an `sf` object with geometry attached, ready for mapping with `ggplot2` or `leaflet`:

```{r sf-output, eval = FALSE}
# get SVI with geometry for mapping
mo_svi_sf <- dep_get_index(
  geography = "tract",
  index = "svi20",
  year = 2020,
  state = "MO",
  output = "sf"
)

# plot with ggplot2
library(ggplot2)

ggplot(mo_svi_sf) +
  geom_sf(aes(fill = SVI20), color = NA) +
  scale_fill_viridis_c(direction = -1) +
  theme_void() +
  labs(title = "Social Vulnerability Index, Missouri Tracts (2020)")
```

### Subscales and Components

For deeper analysis, you can retain subscales and the underlying component variables:

```{r subscales, eval = FALSE}
# keep SVI theme subscales and all component variables
mo_detailed <- dep_get_index(
  geography = "county",
  index = "svi20",
  year = 2020,
  state = "MO",
  keep_subscales = TRUE,
  keep_components = TRUE
)
```

## Two-Step Workflow

For more control, you can separate data retrieval from calculation.
This is useful when you want to inspect or modify the raw data before computing scores:

```{r two-step, eval = FALSE}
# step 1: build the variable list and download data
library(tidycensus)

vars <- dep_build_varlist(
  geography = "county",
  index = "ndi_m",
  year = 2022
)

raw_data <- get_acs(
  geography = "county",
  variables = vars,
  year = 2022,
  state = "MO",
  output = "wide"
)

# step 2: calculate the index on your data
results <- dep_calc_index(
  raw_data,
  geography = "county",
  index = "ndi_m",
  year = 2022
)
```

## Summary of Key Functions

| Function | Purpose |
|----------|---------|
| `dep_get_index()` | Download data and calculate indices (one step) |
| `dep_calc_index()` | Calculate indices on existing data |
| `dep_build_varlist()` | Get the Census variable names needed for an index |
| `dep_sample_data()` | Load bundled sample data (no API key required) |
| `dep_quantiles()` | Split scores into quantile categories |
| `dep_percentiles()` | Calculate percentile ranks |
| `dep_map_breaks()` | Create classification breaks for choropleth maps |

## Further Resources

- [Package documentation site](https://pfizer-opensource.github.io/deprivateR/)
- [GitHub repository](https://github.com/pfizer-opensource/deprivateR)
- [Census API key signup](https://api.census.gov/data/key_signup.html)