---
title: "Introduction to otTensor"
author:
- name: Koki Tsuyuzaki
  affiliation: Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research
  email: k.t.the-answer@hotmail.co.jp
date: "`r Sys.Date()`"
bibliography: bibliography.bib
package: otTensor
output: rmarkdown::html_vignette
vignette: |
  %\VignetteIndexEntry{Introduction to otTensor}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, fig.width = 6, fig.height = 4)
```

# What is Optimal Transport?

Imagine you have two piles of sand with different shapes, and you want to reshape one pile to match the other. **Optimal transport (OT)** finds the cheapest way to move the sand: the plan that minimizes the total cost of moving mass from one distribution to another.

In the simplest case, the two distributions are one-dimensional histograms. The "transport plan" is a matrix that tells you how much mass to move from each bin of the source to each bin of the target.

```{r ot-intuition, fig.height=5}
# Two simple 1D distributions (each sums to 1)
source_dist <- c(0.4, 0.1, 0.4, 0.1)
target_dist <- c(0.1, 0.3, 0.1, 0.3, 0.2)

# Draw the two histograms side by side
oldpar <- par(mfrow = c(1, 2), mar = c(4, 4, 3, 1))
barplot(source_dist, col = "steelblue", main = "Source distribution",
    names.arg = seq_along(source_dist), ylim = c(0, 0.5),
    xlab = "Bin", ylab = "Mass")
barplot(target_dist, col = "tomato", main = "Target distribution",
    names.arg = seq_along(target_dist), ylim = c(0, 0.5),
    xlab = "Bin", ylab = "Mass")
# Restore the previous graphical parameters
par(oldpar)
```

OT finds a transport plan (a matrix) that optimally maps the source to the target. Each cell of the matrix represents how much mass moves from a source bin to a target bin.

# What is a Tensor?
A **tensor** is simply a generalization of familiar data structures:

- **Order 1 (vector)**: a list of numbers, e.g., temperature readings over time
- **Order 2 (matrix)**: a table of numbers, e.g., pixels in a grayscale image
- **Order 3 and higher**: a "cube" or higher-dimensional array, e.g., a color image (height x width x RGB channels)

```{r tensor-illustration, fig.height=4}
oldpar <- par(mfrow = c(1, 3), mar = c(2, 2, 3, 1))

# Vector (order 1)
barplot(c(3, 1, 4, 1, 5), col = "steelblue", main = "Order 1: Vector")

# Matrix (order 2)
mat <- matrix(c(1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6), nrow = 3)
image(mat, col = gray((0:255) / 255), axes = FALSE,
    main = "Order 2: Matrix")

# 3D tensor (show one slice)
arr <- array(0, dim = c(3, 4, 2))
arr[,,1] <- matrix(c(1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6), nrow = 3)
arr[,,2] <- matrix(c(6, 5, 4, 3, 5, 4, 3, 2, 4, 3, 2, 1), nrow = 3)
image(arr[,,1], col = gray((0:255) / 255), axes = FALSE,
    main = "Order 3: Tensor\n(slice 1)")

# Restore the previous graphical parameters
par(oldpar)
```

# The Problem OTT Solves

Standard OT works well for vectors and matrices, but what if your data is a higher-order tensor?

**Optimal Tensor Transport (OTT)** [@ott] extends OT to tensors of any order. Given two tensors $X$ and $Y$ of the same order, OTT finds transport plans: one or more matrices that describe how to map each dimension of $X$ to the corresponding dimension of $Y$.

# The Key Concept: the `f` Parameter

The `f` parameter is the core idea that makes OTT flexible. It is a vector with one entry per tensor dimension, and each entry assigns that dimension to a **transport plan group**. Dimensions with the same value in `f` share a single transport plan.

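
The grouping that `f` encodes can be sketched in base R without calling otTensor. The helper `plan_groups()` and the variable `f_example` below are hypothetical names introduced only for illustration and are not part of the package API; the sketch simply lists which dimension indices fall into each transport plan group.

```{r f-groups}
# Hypothetical helper (not part of otTensor): collect the dimension
# indices that belong to each transport plan group defined by f.
plan_groups <- function(f) {
    split(seq_along(f), f)
}

# Dimensions 1 and 2 share a plan; dimension 3 has its own
f_example <- c(1, 1, 2)
groups <- plan_groups(f_example)

# Number of distinct transport plans implied by f
length(groups)

# Dimension indices in each group
groups
```

With `f = c(1, 2, 3)` the same helper returns three groups of one dimension each, which corresponds to every dimension having its own plan.
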

| Setting | Meaning | Analogy |
|---------|---------|---------|
| `f = c(1, 2)` | Each dimension gets its own transport plan | Co-Optimal Transport |
| `f = c(1, 1)` | Both dimensions share the same plan | Gromov-Wasserstein-like |
| `f = c(1, 1, 2)` | Dims 1 & 2 share a plan; dim 3 has its own | GW collections |

For example, with a 3D tensor (e.g., subjects x genes x time):

- `f = c(1, 2, 3)` learns separate transport plans for subjects, genes, and time
- `f = c(1, 1, 2)` forces subjects and genes to share a plan, while time has its own

# Quick Start Example

Here we walk through a minimal example step by step.

```{r quickstart, message=FALSE}
library("otTensor")
library("rTensor")
```

## Step 1: Create two tensors

We create two small matrices (order-2 tensors) as source and target.

```{r create-tensors}
# Source: a 4 x 5 matrix with entries i + j
arrX <- matrix(0, nrow = 4, ncol = 5)
for (i in 1:4) {
    for (j in 1:5) {
        arrX[i, j] <- i + j
    }
}

# Target: a 6 x 7 matrix (different size is OK)
arrY <- matrix(0, nrow = 6, ncol = 7)
for (i in 1:6) {
    for (j in 1:7) {
        arrY[i, j] <- i + j
    }
}

# Convert to Tensor objects
X <- as.tensor(arrX)
Y <- as.tensor(arrY)
```

## Step 2: Choose the `f` parameter

Since `X` and `Y` are order-2 tensors (two dimensions each), we set `f = c(1, 2)` so that each dimension gets its own transport plan.

```{r set-f}
f <- c(1, 2)
```

## Step 3: Run OTT

```{r run-ott}
result <- OTT(X = X, Y = Y, f = f, num.sample = 500, num.iter = 100)
```

## Step 4: Inspect the results

The result contains a list of transport plan matrices `Ts`.
Since `f = c(1, 2)`, there are two plans:

- `Ts[[1]]`: maps rows of X (size 4) to rows of Y (size 6)
- `Ts[[2]]`: maps columns of X (size 5) to columns of Y (size 7)

```{r inspect-results}
# Transport plan dimensions
cat("Transport plan 1:", dim(result$Ts[[1]]), "\n")
cat("Transport plan 2:", dim(result$Ts[[2]]), "\n")
```

```{r visualize-results, fig.height=5, fig.width=6}
.show_matrix <- function(mat, main = "") {
    # Reorient so that row 1 is drawn at the top, as in a printed matrix
    mat_rev <- t(apply(mat, 2, rev))
    image(mat_rev, col = gray((0:255) / 255), xaxt = "n", yaxt = "n",
        xlab = "", ylab = "", axes = FALSE, main = main)
}

oldpar <- par(mfrow = c(2, 2), mar = c(2, 2, 3, 1))
.show_matrix(arrX, main = "Source (X)")
.show_matrix(arrY, main = "Target (Y)")
.show_matrix(result$Ts[[1]], main = "Transport Plan 1\n(rows)")
.show_matrix(result$Ts[[2]], main = "Transport Plan 2\n(columns)")
# Restore the previous graphical parameters
par(oldpar)
```

Each transport plan is a matrix where brighter cells indicate more mass being transported between the corresponding indices.

# Parameter Reference

| Parameter | Description | Default |
|-----------|-------------|---------|
| `X` | Source tensor (`rTensor::Tensor` object) | (required) |
| `Y` | Target tensor (same order as X, sizes may differ) | (required) |
| `f` | Integer vector assigning each dimension to a transport plan group | (required) |
| `ps` | List of source marginal distributions (one per unique value in `f`) | Uniform |
| `qs` | List of target marginal distributions | Uniform |
| `loss` | Loss function for computing costs | Absolute error |
| `num.sample` | Number of Monte Carlo samples for gradient estimation | 1000 |
| `num.iter` | Number of optimization iterations | 200 |
| `epsilon` | Convergence threshold | 1e-10 |

**Tips:**

- A larger `num.sample` gives more accurate gradients but is slower
- Start with a smaller `num.iter` (e.g., 50) for exploration, then increase it for final results
- Custom loss functions can be passed, e.g., `loss = function(x, y) (x - y)^2` for squared error

# What's Next?
The next vignette (**otTensor-2: Optimal Tensor Transport**) reproduces the experiments from the original paper [@ott], demonstrating OTT under all six `f` configurations:

- OTT_1: standard OT (order-1 tensors)
- OTT_12: Co-OT (each dimension independent)
- OTT_11: Gromov-Wasserstein-like (shared plan)
- OTT_111: Triplets (order-3, single shared plan)
- OTT_123: tri-Co-OT (order-3, all independent)
- OTT_112: GW collections (mixed sharing)

# Session Information {.unnumbered}

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References