---
title: "Organization"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Organization}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```
HDF5 files are best understood as "file systems within a file." Just as your computer has folders and files, an HDF5 file has **Groups** (folders) and **Datasets** (files). This hierarchical structure allows you to organize complex experimental data, metadata, and configuration settings into a single, self-describing package.

This vignette explains how to create, manage, and modify this structure using `h5lite`.

```{r setup}
library(h5lite)
file <- tempfile(fileext = ".h5")
```
## The Hierarchical Model

HDF5 identifies objects with POSIX-style paths, like those used on Linux or macOS. The root of the file is `/`.

* `/` : The Root Group
* `/experiment_1` : A Group (folder)
* `/experiment_1/data` : A Dataset (file) inside the group

## Creating Groups

### Implicit Creation (Recommended)

In most cases, you do not need to create groups manually. When you write a dataset to a path like `"data/experiment/run1"`, `h5lite` automatically creates the parent groups `"data"` and `"data/experiment"` if they do not exist.
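
As a minimal sketch of implicit creation, the chunk below writes a single dataset to a nested path and then lists the result. It uses a separate temporary file (the name `demo_file` is illustrative) and assumes that `h5_ls()` lists objects recursively by default, as in the examples later in this vignette.

```{r}
library(h5lite)

# Illustrative scratch file, kept separate from `file` used elsewhere
demo_file <- tempfile(fileext = ".h5")

# Only the dataset is written; the parent groups are not created beforehand
h5_write(1:5, demo_file, "data/experiment/run1")

# The listing should include the parent groups as well as the dataset
implicit_paths <- h5_ls(demo_file)
implicit_paths

unlink(demo_file)
```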

### Explicit Creation

If you need an empty group structure (for example, to attach attributes to it), use `h5_create_group()`. Like `mkdir -p`, it creates any missing parent groups along the path.

```{r}
# Create a deep hierarchy
h5_create_group(file, "project_A/simulation/run_01")
# Verify
h5_str(file)
```
## Using Lists as Groups
The most powerful way to organize data in `h5lite` is by mapping R **lists** to HDF5 **groups**.

When you pass a named list to `h5_write()`, `h5lite` recursively writes the list structure to the file.

* **Named Lists** become **Groups**.
* **Atomic Vectors/Matrices** inside the list become **Datasets**.

This allows you to organize your entire data structure in R and save it to disk in one command.

```{r}
# Define a complex structure in R
# (I() marks a length-1 value to be stored as a scalar)
experiment_data <- list(
  metadata = list(
    id         = I(101),
    technician = I("Dr. Smith"),
    timestamp  = I("2023-10-27")
  ),
  measurements = list(
    raw         = runif(10),
    calibration = c(0.1, 0.9)
  ),
  status = I("complete")
)
# Write the entire structure to a group named "exp_101"
h5_write(experiment_data, file, "exp_101")
```
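
To check how a list was mapped, individual elements can be read back by path. The sketch below repeats the idea on a small list in a separate temporary file. It assumes `h5_read(file, name)` is the reading counterpart of `h5_write()`; the names `list_file` and `small_exp` are illustrative.

```{r}
library(h5lite)

list_file <- tempfile(fileext = ".h5")

# A small nested list: one sub-group and one scalar dataset
small_exp <- list(
  measurements = list(calibration = c(0.1, 0.9)),
  status       = I("complete")
)
h5_write(small_exp, list_file, "exp_demo")

# A nested list element is addressed by its full path inside the file
calibration <- h5_read(list_file, "exp_demo/measurements/calibration")
calibration

unlink(list_file)
```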
## Inspecting Structure
You can visualize the organization of your file using `h5_ls()` and `h5_str()`.

* `h5_ls()`: Returns a character vector of names. Useful for programmatic checks.
* `h5_str()`: Prints a tree diagram. Useful for interactive exploration.

```{r}
# List all objects recursively
h5_ls(file)
# Visualize the tree
h5_str(file)
```
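
For a programmatic check, the character vector returned by `h5_ls()` can be tested with ordinary R operators. This sketch uses a separate temporary file and a hypothetical helper, `has_path()`, which is not part of `h5lite`. The helper strips any leading slash before comparing, because the exact form of the returned names is an assumption here.

```{r}
library(h5lite)

check_file <- tempfile(fileext = ".h5")
h5_write(list(metadata = list(id = I(101))), check_file, "exp_demo")

# Hypothetical helper: TRUE if `path` appears in the recursive listing
has_path <- function(h5_file, path) {
  listed <- sub("^/", "", h5_ls(h5_file))
  sub("^/", "", path) %in% listed
}

found_id      <- has_path(check_file, "exp_demo/metadata/id")
found_missing <- has_path(check_file, "exp_demo/metadata/operator")
c(found_id, found_missing)

unlink(check_file)
```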
## Moving and Renaming
Data organization often changes. You can rename objects or move them to different groups using `h5_move()`.

This operation only updates links in the file's metadata. The data itself is not rewritten, so it is fast even for large datasets.

```{r}
# Rename 'exp_101' to 'archive_101'
h5_move(file, "exp_101", "archive_101")
# Move 'project_A' inside 'archive_101'
h5_move(file, "project_A", "archive_101/project_A")
h5_ls(file)
```
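
One way to confirm that a move leaves the data untouched is to read a dataset before and after renaming its parent group. The sketch below does this in a separate temporary file. It assumes `h5_read(file, name)` is available for reading; `move_file`, `run_old`, and `run_new` are illustrative names.

```{r}
library(h5lite)

move_file <- tempfile(fileext = ".h5")
h5_write(list(values = c(1.5, 2.5, 3.5)), move_file, "run_old")

# Read the dataset, rename its parent group, then read it at the new path
before <- h5_read(move_file, "run_old/values")
h5_move(move_file, "run_old", "run_new")
after <- h5_read(move_file, "run_new/values")

# The contents are unchanged; only the path differs
all.equal(as.numeric(before), as.numeric(after))

moved_paths <- sub("^/", "", h5_ls(move_file))
unlink(move_file)
```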
## Deleting Objects
You can remove groups or datasets using `h5_delete()`.

* Deleting a dataset removes the data.
* Deleting a group removes the group **and all of its children** (recursively).
* Deleting does not shrink the file on disk, but the freed space can be reused.

```{r}
# Delete the entire archive group
h5_delete(file, "archive_101")
# The file is now empty (except for the root)
h5_ls(file)
```
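
The points above can be checked with a short sketch that deletes one group while keeping a sibling dataset, and records the on-disk size with base R's `file.size()` before and after. It uses a separate temporary file; `delete_file`, `bulk`, and `keep` are illustrative names.

```{r}
library(h5lite)

delete_file <- tempfile(fileext = ".h5")
h5_write(list(a = runif(1000), b = runif(1000)), delete_file, "bulk")
h5_write(1:3, delete_file, "keep")

# Record the file size, delete the group, and record the size again
size_before <- file.size(delete_file)
h5_delete(delete_file, "bulk")
size_after <- file.size(delete_file)

# Children of the deleted group are gone; the sibling dataset remains
remaining <- sub("^/", "", h5_ls(delete_file))
remaining

# Compare the on-disk size before and after the deletion
c(before = size_before, after = size_after)

unlink(delete_file)
```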
```{r, include=FALSE}
unlink(file)
```