---
title: "Getting started with aieconindex"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with aieconindex}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

# Getting started

`aieconindex` provides tidy R access to the [Anthropic Economic Index](https://www.anthropic.com/economic-index) (AEI) dataset hosted on Hugging Face. The AEI is a recurring release from Anthropic that maps usage of the Claude family of large language models to occupations and tasks using the O*NET taxonomy and the Standard Occupational Classification (SOC) system.

## Installation

```{r}
# install.packages("remotes")
remotes::install_github("charlescoverdale/aieconindex")
library(aieconindex)
```

## Discovering releases

The Anthropic Economic Index is released as dated snapshots (typically every few months). `aei_releases()` lists the snapshots available on Hugging Face.

```{r}
aei_releases()
```

You can list the file tree of any single release with `aei_files()`. This is useful when you want to know exactly what is available before downloading anything.

```{r}
aei_files("2025-03-27", recursive = FALSE)
```

## Fetching a usage table

`aei_index()` is the convenience wrapper for the canonical usage table of a release. The shape and exact filename of that table vary across releases (the AEI restructured its directory layout in late 2025); this function papers over that variation.

```{r}
df <- aei_index("latest", source = "claude_ai", variant = "raw")
head(df)
```

The `source` argument selects between Claude.ai consumer traffic (`"claude_ai"`) and first-party API traffic (`"1p_api"`). The `variant` argument selects between raw counts (`"raw"`) and tables already enriched with O*NET and SOC metadata (`"enriched"`).

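
The two arguments combine into four possible tables. The sketch below is one way to request all of them in a single pass. It assumes `library(aieconindex)` has already been run, and it does not assume every combination exists for the chosen release: any call that fails is recorded as `NULL` instead of stopping the loop. The names `combos` and `tables` are illustrative only.

```{r}
# The four source/variant combinations described above.
combos <- expand.grid(
  source = c("claude_ai", "1p_api"),
  variant = c("raw", "enriched"),
  stringsAsFactors = FALSE
)

# Assumption: not every combination is published for every release, so a
# failed call is stored as NULL rather than aborting the whole run.
tables <- Map(
  function(source, variant) {
    tryCatch(
      aei_index("latest", source = source, variant = variant),
      error = function(e) NULL
    )
  },
  combos$source,
  combos$variant
)

# Label each result, e.g. "claude_ai_raw" or "1p_api_enriched".
names(tables) <- paste(combos$source, combos$variant, sep = "_")

# Keep only the combinations that were actually returned.
available <- Filter(Negate(is.null), tables)
```

Swallowing errors keeps the sketch short; in real analysis code you would probably want to keep the error message so that a network failure is not mistaken for a missing table.
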
## Fetching arbitrary files

For files that aren't covered by `aei_index()`, use `aei_download()`:

```{r}
soc <- aei_download("2025-03-27", "SOC_Structure.csv")
hierarchy <- aei_download(
  "2025-09-15",
  "data/output/request_hierarchy_tree_claude_ai.json"
)
```

CSV files come back as data frames; JSON files come back as parsed lists.

## The aei_tbl class

All data-returning functions emit an object of class `aei_tbl`: a `data.frame` with provenance metadata stored in the `aei_query` attribute. Inspect it directly:

```{r}
attr(df, "aei_query")
```

The class also has custom `print()`, `summary()`, and `[` methods, so the metadata is preserved when the table is subset.

## Caching

Downloaded files are cached under `tools::R_user_dir("aieconindex", "cache")`. To use a different location, set `options(aieconindex.cache_dir = "/your/path")` before the first call. Inspect the cache with `aei_cache_info()` and clear it with `aei_cache_clear()`.

## Citing the data

The Anthropic Economic Index dataset is released under Creative Commons Attribution 4.0 International (CC-BY-4.0). When you use the data in published work, cite it.

```{r}
aei_cite("2026-03-24", format = "bibtex")
```

`aei_cite()` takes either a release id or `"all"` (the default), which cites the project as a whole, and a `format` of `"text"`, `"bibtex"`, or `"bibentry"`.
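
To make the subsetting behaviour described under "The aei_tbl class" more concrete, here is a minimal base R sketch of how a `data.frame` subclass can carry an attribute through `[`. It is not the package's implementation: the class `demo_tbl`, the constructor `new_demo_tbl()`, and the example data are all hypothetical stand-ins. Only the attribute name `aei_query` is borrowed from the text above.

```{r}
# Hypothetical stand-in class "demo_tbl"; NOT the aieconindex implementation.
new_demo_tbl <- function(x, query) {
  stopifnot(is.data.frame(x), is.list(query))
  attr(x, "aei_query") <- query
  class(x) <- c("demo_tbl", class(x))
  x
}

# Subset like a plain data.frame, then put the metadata back if the result
# is still a data.frame (a single extracted column is returned unchanged).
`[.demo_tbl` <- function(x, ...) {
  query <- attr(x, "aei_query")
  out <- NextMethod()
  if (is.data.frame(out)) {
    attr(out, "aei_query") <- query
    class(out) <- union("demo_tbl", class(out))
  }
  out
}

# Made-up rows and provenance, purely for illustration.
demo <- new_demo_tbl(
  data.frame(task = c("a", "b", "c"), share = c(0.5, 0.3, 0.2)),
  query = list(release = "2025-03-27", source = "claude_ai")
)

# The provenance attribute is still attached after a row subset.
subset_demo <- demo[demo$share > 0.25, ]
attr(subset_demo, "aei_query")$release
```

Restoring the attribute explicitly, instead of relying on what `[.data.frame` happens to keep, gives the same result for row-style (`x[i, ]`) and list-style (`x["col"]`) subsetting.
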