---
title: "Getting Started"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting_started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(siera)
```

We will use two examples to get started:

1. Using the original *CDISC Common Safety Displays* ARS metadata (JSON format, with example operations handled in the `readARS` function)
2. Using a modified *CDISC Common Safety Displays* ARS metadata (xlsx format, with example dynamic AnalysisMethodCodeTemplate R code handling operations, based on functions from the `cards` package)

In order to facilitate the examples, _siera_ includes several example files, which we use throughout the documentation.  These include a JSON ARS file, as well as some csv ADaMs (ADSL and ADAE) which can be run with the R scripts produced by readARS function. Use the helper ARS_example() with no arguments to list them or call it with an example filename to get the path.

```{r example files}
# To see a list of example files:
ARS_example()

# A temporary path to a specific file:
ARS_example("ARS_V1_Common_Safety_Displays.json")
```


## CDISC example: JSON format

To get started with an example of ingesting the ARS JSON metadata,we will ingest the example JSON ARS file to meta-programme ready-to-run R scripts, 
which will produce the ARDs. 
```{r example, message=FALSE}
# Path to the the ARS JSON File. 
json_path <- ARS_example("ARS_V1_Common_Safety_Displays.json")

# Path to a folder which will contain the meta-programmed R scripts (feel free to update 
# to a more suitable path)
output_folder <- tempdir()

# this folder contains ADaM datasets to produce ARD (we will use temporary 
# directory tempdir(), but feel free to download the ADaMs required and use the location they are stored in.
# This can be done with e.g. dirname(ARS_example("ADSL.csv"))
ADaM_folder <- tempdir()

# run the readARS function with these 3 parameters.  This creates R scripts (1 for each output in output_folder)
readARS(json_path, output_folder, ADaM_folder, example = TRUE)
```

Once the R programs are created, they can be individually run, provided that the ADaM datasets are in the location as provided to the readARS function.  

## CDISC example: xlsx format, with `cards` functions to handle operations

Before starting out with this example, it would be helpful to first have a discussion on the composition of the ARS metadata (the xlsx representation, in particular), and the use of `cards` functions.  

### Reading in Excel ARS metadata - what to expect

The ARS metadata represented in Excel contains all ARS information for the Reporting Event.  It is divided into Excel Worksheets, roughly mapped/equivalent to the official JSON ARS representation and its classes (an example can be found [here](https://github.com/cdisc-org/analysis-results-standard/blob/main/workfiles/examples/ARS%20v1/Common%20Safety%20Displays.json)).  For the purpose of automating Analysis Results using 
_siera_, a subset of the multiple Excel Worksheets contain the key information.  These include:

- _MainListOfContents_ (links outputs to their respective analyses)
- _OtherListsOfContents_ (contains Output metadata)
- _DataSubsets_ (filters data for individual analyses)
- _AnalysisSets_ (filters data to get the Population Set for the Output)
- _AnalysisGroupings_ (groups data according to analysis specifications)
- _Analyses_ (contains linking information to all components required for calculation of results)
- _AnalysisMethods_ (describes operations to be performed on data to get results)
- _AnalysisMethodCodeTemplate_ (contains dynamically written R code to perform AnalysisMethod)
- _AnalysisMethodCodeParameters_ (populates AnalysisMethodCodeTemplate with Analysis information)

### A note on example code used in AnalysisMethodCodeTemplate: using "cards" package

A powerful aspect of the ARS model, is that dynamic code can be supplied as part of the metadata and be evaluated on either 1. the Output level, 2. the Analysis level, or 3. the AnalysisMethod level (for more information on this, see [this](https://www.lexjansen.com/phuse-us/2025/ds/PAP_DS09.pdf) PHUSE paper).  The latter case is implemented by making use of the _AnalysisMethodCodeTemplate_ and _AnalysisMethodCodeParameters_ classes in the metadata.  The user supplies dynamic R code (with the option of making use of parameters referring to metadata pieces), which will be executed to perform the relevant AnalysisMethod operations.  

For the example ARS metadata in Excel shipped with this package, the R code used in the _AnalysisMethodCodeTemplate_ is aligned with the _cards_ R package (more information [here](https://insightsengineering.github.io/cards/latest-tag/)).  The _cards_ package is a reputable R package consisting of wrappers for various statistical functions, and producing results in a consistent ARD format. Below is a simple example of how a function _ard_continuous_ from the _cards_ package is used in the _AnalysisMethodCodeTemplate_ class in the accompanying Excel ARS metadata example shipped with this package:

```{r cards example, eval=FALSE}
# example of 'cards' code in AnalysisMethodCodeTemplate, using the ard_continuous function:

Analysis_ARD = ard_continuous(
  data = filtered_ADSL,
  by = c(byvariables_here),
  variables = analysisvariable_here
)
```

In the above example of _AnalysisMethodCodeTemplate_ code for a particular AnalysisMethod (summary statistics for a continuous variable), the _ard_continuous_ function from the _cards_ package is used.  Notice, however, that two parameters are used in the code: 

1. 'byvariables_here', and 
2. 'analysisvariable_here'.  

These are example placeholder names, to be replaced by actual variables, depending on the context for the Analysis.  For example, 'byvariables_here' could be replaced by TRT01A, and 'analysisvariable_here' could be replaced by AGE, for an ARD with summary statistics on AGE by Treatment.  The same code, however, will be reused for summary statistics on e.g. HEIGHT for another Analysis, and the 'analysisvariable_here' will be replaced with HEIGHT.  All this is dynamically applied within the readARS_xl function when generating R scripts using _siera_. 

In order to facilitate the use of placeholders and their replacement with actual values from the metadata, several "constructs" have been defined in the _readARS_xl_ function, and are available to the user to call from the _AnalysisMethodCodeTemplate_ and _AnalysisMethodCodeParameters_ classes in the metadata.  These constructs are either simple variables or datasets defined in the metadata, or complex, dynamic constructs from various metadata pieces, to be inserted in specific _cards_ functions, or pre-processing steps.  A list of defined constructs and examples is shipped with this package, and can be accessed with:

```{r cardsconstructs, eval=FALSE}
ARS_example("cards_constructs.xlsx")
```


The _readARS_ function reads in a completed ARS metadata Excel file, and generates R scripts for each Output defined in the file.  Below is an example of how this can be done using a ready-to-use ARS metadata Excel file:

```{r Excel ARS metadata example, message=FALSE}

# Path to the Excel ARS metadata file: 
ARS_path <- ARS_example("Common_Safety_Displays_cards.xlsx")

# Path to a folder which will contain the Output meta-programmed R scripts (recommended to update
# to a more suitable local path)
output_folder <- tempdir()

# Path to the folder containing ADaM datasets in csv format (we will use temporary 
# directory tempdir() to make the code run in this vignette, but it's recommended to 
# 1. download the ADaMs required (csv ADSL and ADAE available using e.g. ARS_example("ADSL.csv"))
# 2. store it in a folder (the directly downloaded location can be found using dirname(ARS_example("ADSL.csv")), for example.  Use this location, or manually store the ADaM somewhere else)
# 3. use the folder path instead of tempdir() below.
ADaM_folder <- tempdir()

# run the readARS function with these 3 parameters.  This creates R scripts 
# (1 for each Output in output_folder)
readARS(ARS_path, output_folder, ADaM_folder)
```

If you've followed along so far, you should now have 5 R scripts (named ARD_Out14-1-1.R, ARD_Out14-3-1-1.R, ARD_Out14-3-2-1.R, ARD_Out14-3-3-1a.R, and ARD_Out14-3-3-1b.R) in the folder specified as _output_folder_.  If this is successful, you can execute any of these 5 R scripts as-is (assuming the ADaM required for this script is available in the _ADaM_folder_), and the result will be an ARD (one result per row format) for each of the scripts.  An example of such an auto-generated ARD script is included in the package, and can be used by using the `ARD_script_exampe` function. Running this script will look like this:

```{r run ARD_xxx, message=FALSE, warning=FALSE, eval=FALSE}

# Step 1: open ARD_xxx.R file
# Step 2: Confirm the location of ADaM dataset(s) is correct in the code section "Load ADaM".  
# For the sake of simplicity, the only update made to the ARD_Out14-1-1.R 
# script was to point to the ADaM folder to ARS_example("ADSL.csv")
# Step 3: Run the code and enjoy automated analysis results generation. 
# Note: the ARD is contained in the object "df4" (which contains the appended)
# mini-ARDs from all individual Analyses ARDs.

example_ARD_script = ARD_script_example("ARD_Out14-1-1.R")
source(example_ARD_script)
head(df4)
```