scROSHI

scROSHI identifies cell types based on expression profiles of single cell analysis by utilizing previously obtained cell type specific gene sets. It takes into account the hierarchical nature of cell type relationship and does not require training or annotated data. A detailed description of the method can be found at: Prummer et al. 2022. “scROSHI - robust supervised hierarchical identification of single cells”. bioRxiv. https://www.biorxiv.org/content/10.1101/2022.04.05.487176v1

Installation

You can install the development version from GitHub (required R version >= 3.6) with:

# install.packages("devtools")
devtools::install_github("ETH-NEXUS/scROSHI")

Example

This is a basic example for the scROSHI function

scROSHI requires three input objects:

sce_data

A SingleCellExperiment object containing the expression profiles of the single cell analysis

celltype_lists

Marker gene list for all cell types. It can be provided as a list of genes with cell types as names or as a path to a file containing the marker genes. Supported file formats are .gmt or .gmx files.

type_config

Config file to define major cell types and hierarchical subtypes. It should be provided as a two-column data.frame where the first column are the major cell types and the second column are the subtypes. If several subtypes exists they should be separated by comma.

library(scROSHI)
data("test_sce_data")
data("config")
data("marker_list")

results <- scROSHI(sce_data = test_sce_data,
                  celltype_lists = marker_list,
                  type_config = config)
table(results$celltype_final)
#> 
#>                      B.cells                B.cells.naive 
#>                            4                          180 
#>            B.cells.precursor              Dendritic.cells 
#>                           40                           40 
#>                    Monocytes                     NK.cells 
#>                          233                          219 
#>                 Plasma.cells Plasmacytoid.dendritic.cells 
#>                           13                           11 
#>                      T.cells                  T.cells.CD4 
#>                           60                          415 
#>                  T.cells.CD8                    uncertain 
#>                           76                           25