--- title: "Dates, Times, and Datetimes" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{date_time_datetime} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, echo=FALSE} library(datasetjson) library(knitr) ``` Dataset JSON Version 1.1 provides a significant improvement of the handling of dates and times. In version 1.0, there wasn't a clear instruction around anchoring the origin of numeric dates and date times, and given that SAS uses 1960-01-01 and R uses the POSIX date of 1970-01-01, this created a slightly complex discrepancy. Version 1.1 instead opts to use ISO8601 formatted dates, times, and date times, and the target data type is clearly stated within the column metadata. This makes the true date value unambiguous. Starting in **{datasetjson}** v0.3.0 we've introduce support for Dataset JSON v1.1.0. As such, we automatically handle date, time, and date time conversions. There are a few considerations you need to make when dealing with these types to make things work properly. # Metadata Settings Version 5 SAS Transport Files didn't have a notion of a "date", "time" or "datetime" type. Instead, using the SAS convention these were just Integer values with a display format attached. Dataset JSON Version 1.1 explicitly clarifies numeric date types using the `dataType` and `targetDataType` fields in the columns metadata. Consider these variables. ```{r, echo=FALSE} tibble::tribble( ~itemOID, ~name, ~label, ~dataType, ~length, ~targetDataType, ~displayFormat, ~keySequence, 'IT.DF.CHARDT', 'CHARDT', 'Character Date', 'date', 8L, NA_character_, NA_character_, NA_integer_, 'IT.DF.CHARTM', 'CHARTM', 'Character Time', 'time', 10L, NA_character_, NA_character_, NA_integer_, 'IT.DF.CHARDTM', 'CHARDTM', 'Character Datetime', 'datetime', 19L, NA_character_, NA_character_, NA_integer_, 'IT.DF.NUMDT', 'NUMDT', 'Numeric date', 'time', NA_integer_, "integer", "TIME8", NA_integer_, 'IT.DF.NUMTM', 'NUMTM', 'Numeric time', 'time', NA_integer_, "integer", "TIME8", NA_integer_, 'IT.DF.NUMDTM', 'NUMDTM', 'Numeric datetime', 'datetime', NA_integer_, "integer", "E8601DT", NA_integer_ ) |> kable() ``` In the table above, we have the metadata for both character and numeric dates, times, and date times. Both sets of variables have the same values within `dataType`. The difference is the optional field of `targetDataType`, where the value for the numeric variables is set to `integer`. Both `read_dataset_json()` and `write_dataset_json()` rely on these fields and as such they must be set properly. This comes with a few assumption and requirements. - Numeric dates will be converted into the type of `Date` (see `help("Date", package="base")`) - Numeric times will be converted to the **{hms}** type of `hms` - R doesn't have a specific built in type of time. We decided to take on **{hms}** as a dependency given that this is the type using by the **{haven}** package when reading SAS Version 5 Transport files. As such, similar behavior can be expected when importing an XPT or a Dataset JSON file. - Numeric date times will be converted to the base R type of `POSIXct` and anchored to the UTC timezone. - CDISC dates are generally not timezone qualified, though for character dates, this is optional. Unless a timezone is explicitly specified systems may default to the user's current timezone. To decrease ambiguity, we've introduced a hard requirement that datetimes are anchored to UTC. If the datetime variable is found to be using a different timezone, an error will be thrown. If any of these assumption don't work for your purpose or if you find other situations we need to handle, please leave an issue on Github as we want to make sure we support the community as best we can.