Understanding the IBGE Aggregate Data API

Introduction

The IBGE Aggregate Data API (version 3) is the programmatic interface behind SIDRA, IBGE’s automatic data retrieval system. It covers every survey and census produced by the Brazilian Institute of Geography and Statistics.

This vignette explains the API’s data model so you can make the most of ibger. If you’re familiar with OLAP terminology: variables = measures, classifications = dimensions, and categories = members.

Core concepts

Aggregates

An aggregate is a specific table of results from an IBGE survey. Each aggregate has a numeric ID that is stable over time. To browse the available aggregates:

library(ibger)

# Search for aggregates
ibge_aggregates()

You can filter by periodicity, geographic level, subject, or classification:

# Only quarterly aggregates
ibge_aggregates(periodicity = "P10")

# Aggregates that have state-level data
ibge_aggregates(level = "N3")

Variables

Each aggregate exposes one or more variables — the measures being reported. Aggregate 1712 (crop production), for example, has:

ID    Variable
214   Quantidade produzida (Production qty)
215   Valor da produção (Production value)
216   Área colhida (Harvested area)
1982  Quantidade vendida (Sold qty)

meta <- ibge_metadata(1712)
meta$variables

When calling ibge_variables(), you can request specific variables by ID:

# Two specific variables
ibge_variables(1712, variable = c(214, 1982), localities = "BR")

Use variable = NULL (default) for all standard variables, or variable = "all" to include API-generated percentage variables when available.

Classifications and categories

Besides being linked to a locality and a period, each observation can be further broken down by classifications (dimensions). Each classification contains categories (members).

For aggregate 1712, the classifications are things like “type of product” (226), “producer condition” (218), “economic activity group”, etc. Classification 226 has categories like “pineapple” (4844), “garlic” (96608), “potato” (96609), and hundreds more.

meta <- ibge_metadata(1712)
meta$classifications

# Unnest to see all categories
tidyr::unnest(meta$classifications, categories)

When you don’t specify a classification, the API returns results for the Total category (ID = 0). This is a special category that sums across all the other categories of that classification.

# Default: Total category (aggregated across all products)
ibge_variables(1712, localities = "BR")

# Specific products
ibge_variables(
  1712,
  localities     = "BR",
  classification = list("226" = c(4844, 96608))
)

# All products (can be large)
ibge_variables(
  1712,
  periods        = -1,
  localities     = "BR",
  classification = list("226" = "all")
)
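The classification argument is a named list, while the parameter table in this vignette gives the API format as 226[4844,96608]|218[4780]. The sketch below shows one way such a list could be serialized. format_classification() is a hypothetical base-R helper, not an ibger function, and it only illustrates the notation rather than how ibger builds the request internally.

```r
# Hypothetical helper, not part of ibger: turn a named list of
# classification ID -> category IDs into the API's bracket notation.
format_classification <- function(classification) {
  stopifnot(is.list(classification), !is.null(names(classification)))
  parts <- vapply(
    names(classification),
    function(id) {
      # format() avoids scientific notation for large numeric IDs
      categories <- format(classification[[id]], scientific = FALSE, trim = TRUE)
      paste0(id, "[", paste(categories, collapse = ","), "]")
    },
    character(1)
  )
  # Separate classifications with "|"
  paste(parts, collapse = "|")
}

format_classification(list("226" = c(4844, 96608), "218" = 4780))
#> [1] "226[4844,96608]|218[4780]"
```

Each list element becomes one ID[categories] group, so adding a second classification to the list narrows the request along a second dimension.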

Geographic levels and localities

IBGE organizes Brazil into a hierarchy of geographic levels. Each aggregate supports a specific subset of these levels:

Code  Level                Count    Example
N1    Brazil               1        BR
N2    Major region         5        1 (North), 3 (Southeast)
N3    State (UF)           27       33 (RJ), 35 (SP)
N6    Municipality         5,570+   3550308 (São Paulo city)
N7    Metropolitan area    varies   3501 (RM São Paulo)
N9    Immediate region     varies
N15   Intermediate region  varies

Important: municipality IDs (N6) and metropolitan area IDs (N7) use different numbering. São Paulo city is 3550308 (N6), while the São Paulo metropolitan area is 3501 (N7). Don’t confuse them.

The available levels for each aggregate are in the metadata:

meta <- ibge_metadata(1705)
meta$territorial_level
#> $administrative
#> [1] "N1" "N2" "N3"

You can request all localities at a level, or pick specific ones:

# All states
ibge_variables(1705, localities = "N3")

# Specific states
ibge_variables(1705, localities = list(N3 = c(33, 35)))

The API also supports contextual queries — filtering municipalities by their parent state or region. For example, N6[N3[33,35],N2[1]] means “all municipalities in RJ, SP, or the North region”. ibger passes this through directly:

ibge_variables(
  512,
  variable   = 216,
  periods    = -6,
  localities = "N6[N3[33,35],N2[1]]"
)
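The localities strings used above follow a small bracket grammar: a bare level, a level with IDs, or a level with parent levels nested inside it. As a rough illustration, the hypothetical helper below (format_localities() is not an ibger function) assembles the three forms with base R. The argument names ids and within are assumptions made for this sketch.

```r
# Hypothetical helper, not part of ibger: build a localities string.
# `level`  is the level being requested (for example "N6").
# `ids`    are locality IDs at that level.
# `within` is a named list of parent levels used as a contextual filter.
format_localities <- function(level, ids = NULL, within = NULL) {
  bracket <- function(lv, x) {
    x <- format(x, scientific = FALSE, trim = TRUE)
    paste0(lv, "[", paste(x, collapse = ","), "]")
  }
  if (!is.null(within)) {
    parents <- mapply(bracket, names(within), within)
    return(paste0(level, "[", paste(parents, collapse = ","), "]"))
  }
  if (is.null(ids)) {
    return(level) # every locality at this level
  }
  bracket(level, ids)
}

format_localities("N3")
#> [1] "N3"
format_localities("N6", ids = c(3550308, 3304557))
#> [1] "N6[3550308,3304557]"
format_localities("N6", within = list(N3 = c(33, 35), N2 = 1))
#> [1] "N6[N3[33,35],N2[1]]"
```

The last call reproduces the contextual query shown earlier, which could then be passed as the localities argument.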

Periods and periodicities

Each aggregate has a fixed periodicity:

Code  Periodicity
P5    Monthly
P10   Quarterly
P13   Annual
P58   Semi-annual

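Period codes encode both the year and the position within the year, so how to read them depends on the aggregate’s periodicity. The code 202001, for instance, is January 2020 in a monthly aggregate but the first quarter of 2020 in a quarterly one.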
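The dependence on periodicity can be made concrete with a small decoder. The helper below is hypothetical (describe_period() is not part of ibger). It assumes six-digit codes whose first four digits are the year and whose last two digits index the month, quarter, or semester. Annual codes are left out because they would not need decoding under that assumption.

```r
# Hypothetical helper, not part of ibger: describe a six-digit period code.
# Assumes YYYY followed by a two-digit index within the year.
describe_period <- function(code, periodicity = c("P5", "P10", "P58")) {
  periodicity <- match.arg(periodicity)
  code <- as.character(code)
  stopifnot(nchar(code) == 6)
  year  <- substr(code, 1, 4)
  index <- as.integer(substr(code, 5, 6))
  unit <- switch(periodicity,
    P5  = month.name[index],   # monthly: index is the month
    P10 = paste0("Q", index),  # quarterly: index is the quarter
    P58 = paste0("H", index)   # semi-annual: index is the half-year
  )
  paste(unit, year)
}

describe_period("202001", "P5")
#> [1] "January 2020"
describe_period("202001", "P10")
#> [1] "Q1 2020"
```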

The metadata tells you the valid range:

meta <- ibge_metadata(7060)
meta$periodicity
#> $frequency
#> [1] "mensal"
#>
#> $start
#> [1] "202001"
#>
#> $end
#> [1] "202512"

ibger’s ibge_periods() lists every individual period:

ibge_periods(7060)

Request limits

The API allows at most 100,000 values per request. The formula is:

categories × periods × localities ≤ 100,000

For example, a request for aggregate 2654 in which the counts being multiplied are 1, 2, 2, 1, 6, and 4 produces 1 × 2 × 2 × 1 × 6 × 4 = 96 values, well within the limit.

If your request exceeds 100,000 values, the API returns HTTP 500. Reduce the number of localities, periods, or categories and retry.
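Checking the size before sending a request is plain arithmetic. The sketch below estimates the number of values and splits a long list of periods into batches that each stay under the limit. Both functions are hypothetical helpers written in base R, not ibger functions. How a batch of periods is then passed to ibge_variables() is deliberately left out.

```r
# Hypothetical helpers, not part of ibger.
api_limit <- 100000

# Values in a request: the product of the counts selected in each dimension.
request_size <- function(...) {
  prod(...)
}

# Split periods into batches so that each request stays within the limit.
# `other_size` is the product of all counts except the number of periods.
split_periods <- function(periods, other_size, limit = api_limit) {
  per_request <- floor(limit / other_size)
  stopifnot(per_request >= 1)
  batch_id <- ceiling(seq_along(periods) / per_request)
  unname(split(periods, batch_id))
}

request_size(1, 2, 2, 1, 6, 4)
#> [1] 96

# 12 monthly periods at 30,000 values per period: 3 periods fit per request
batches <- split_periods(202001:202012, other_size = 30000)
length(batches)
#> [1] 4
```

Splitting by period is only one option. The same idea applies to localities or categories when those are the largest dimension.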

View modes

The API supports three view modes for the response format. ibger uses the default JSON mode, but you can also pass view = "OLAP" or view = "flat":

# OLAP notation
ibge_variables(1705, localities = "BR", view = "OLAP")

# Flat mode (first element is metadata, data starts at second)
ibge_variables(1705, localities = "BR", view = "flat")

In most cases, the default mode with ibger’s tidy output is the most convenient.

How ibger maps to the API

Here is a quick reference showing how ibger functions correspond to API endpoints:

ibger function     API endpoint
ibge_aggregates()  GET /agregados
ibge_metadata()    GET /agregados/{id}/metadados
ibge_periods()     GET /agregados/{id}/periodos
ibge_localities()  GET /agregados/{id}/localidades/{nivel}
ibge_variables()   GET /agregados/{id}/periodos/{p}/variaveis/{v}

The ibger parameters map to URL path segments and query parameters:

ibger parameter  API parameter          Format
aggregate        {agregado} (path)      Numeric ID
variable         {variavel} (path)      214|1982 or all or allxp
periods          {periodos} (path)      -6 or 201701-201706 or 201701|201702
localities       localidades (query)    BR or N3 or N6[3550308,3304557]
classification   classificacao (query)  226[4844,96608]|218[4780]
view             view (query)           OLAP or flat
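To see how the path segments and query parameters fit together, the sketch below assembles a request URL by hand. build_variables_url() is a hypothetical helper, not an ibger function, and the base URL is an assumption about where version 3 of the service is hosted. Special characters are left unencoded for readability, whereas a real client would percent-encode them.

```r
# Hypothetical helper, not part of ibger.
# The base URL is an assumption, not taken from this vignette.
base_url <- "https://servicodados.ibge.gov.br/api/v3"

build_variables_url <- function(aggregate, variable, periods, localities,
                                classification = NULL, view = NULL) {
  # Path segments: /agregados/{id}/periodos/{p}/variaveis/{v}
  path <- paste(
    base_url, "agregados", aggregate,
    "periodos", periods,
    "variaveis", variable,
    sep = "/"
  )
  # c() drops NULL entries, so only supplied parameters are kept
  query <- c(
    localidades   = localities,
    classificacao = classification,
    view          = view
  )
  paste0(path, "?", paste0(names(query), "=", query, collapse = "&"))
}

build_variables_url(
  aggregate      = 1712,
  variable       = "214|1982",
  periods        = -6,
  localities     = "BR",
  classification = "226[4844,96608]"
)
#> [1] "https://servicodados.ibge.gov.br/api/v3/agregados/1712/periodos/-6/variaveis/214|1982?localidades=BR&classificacao=226[4844,96608]"
```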

Further reading