Introduction to datetimeoffset

Overview

{datetimeoffset} provides support for datetimes with optional UTC offsets and/or (possibly heterogeneous) time zones. Strengths compared to other R datetime objects:

  1. Import/export for a number of datetime string standards, often including lossless re-export of any original reduced precision.
  2. Datetimes can be augmented with optional UTC offsets and/or (possibly heterogeneous) time zones.
  3. Supports up to nanosecond precision.

The motivating use case for this package was the need for a datetime class that can losslessly import/export pdf metadata datetimes for {xmpdf}. pdf metadata datetimes are local times with a wide range of legal precisions, an unknown time zone, and a possibly known UTC offset. Pre-existing R datetime classes generally either assume knowledge of a (usually single) time zone or assume it is acceptable to fully convert to UTC time.

Examples

Importing/exporting datetime string formats

{datetimeoffset} can import/export a number of datetime formats. It supports lossless re-export of any original reduced precision for a number of formats, such as pdfmark datetime strings and ISO 8601 datetime strings.

library("datetimeoffset")

ISO 8601 datetimes

as_datetimeoffset("2020-05") |> format_iso8601()
## [1] "2020-05"
as_datetimeoffset("2020-05-10T20:10") |> format_iso8601()
## [1] "2020-05-10T20:10"
as_datetimeoffset("2020-05-10T20:10:15.003-07") |> format_iso8601()
## [1] "2020-05-10T20:10:15.003-07"
as_datetimeoffset("2020-05-10 20:10:15Z") |> format_iso8601()
## [1] "2020-05-10T20:10:15Z"
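
The calls above each handle a single string. As a minimal sketch, assuming `as_datetimeoffset()` and `format_iso8601()` are vectorized over character input, the same round trip can be attempted on a vector whose elements have different precisions. The input strings are illustrative only, and no output is shown because it was not run here.

```r
library("datetimeoffset")

# Illustrative input: ISO 8601 strings recorded at different precisions
x <- c("2020-05", "2020-05-10T20:10", "2020-05-10T20:10:15.003-07")
dts <- as_datetimeoffset(x)

# One element per input string
length(dts)

# Re-export every element and compare against the original strings
iso <- format_iso8601(dts)
print(iso)
identical(iso, x)
```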

pdfmark datetimes

as_datetimeoffset("D:202005") |> format_pdfmark()
## [1] "D:202005"
as_datetimeoffset("D:20200510201015+00'00'") |> format_pdfmark()
## [1] "D:20200510201015+00'00'"
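
Because import and export are separate steps, a datetime parsed from one format can be exported in another. The sketch below converts between ISO 8601 and pdfmark strings using only the functions shown above; the input strings are illustrative.

```r
library("datetimeoffset")

# Parse an ISO 8601 string, then export it as a pdfmark datetime string
dt_iso <- as_datetimeoffset("2020-05-10T20:10:15-07:00")
pdfmark_string <- format_pdfmark(dt_iso)
print(pdfmark_string)

# Parse a pdfmark string, then export it as an ISO 8601 string
dt_pdfmark <- as_datetimeoffset("D:20200510201015-07'00'")
iso_string <- format_iso8601(dt_pdfmark)
print(iso_string)
```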

RFC 3339 with de facto time zone extension datetimes

as_datetimeoffset("2020-05-10T20:10:15.003[America/Los_Angeles]") |>
    format()
## [1] "2020-05-10T20:10:15.003-07:00[America/Los_Angeles]"
as_datetimeoffset("2020-05-10T20:10-07:00[America/Los_Angeles]") |>
    format()
## [1] "2020-05-10T20:10-07:00[America/Los_Angeles]"

SQL Server/ODBC datetime string literals

# SQL Server Date / ODBC SQL_TYPE_DATE / SQL_DATE
as_datetimeoffset("2020-05-10") |>
    format_nanotime("%F")
## [1] "2020-05-10"
# SQL Server Smalldatetime / ODBC SQL_TYPE_TIMESTAMP / SQL_TIMESTAMP
as_datetimeoffset("2020-05-10 20:10:15") |>
    format_nanotime("%F %T")
## [1] "2020-05-10 20:10:15"
# SQL Server Datetime / ODBC SQL_TYPE_TIMESTAMP / SQL_TIMESTAMP
as_datetimeoffset("2020-05-10 20:10:15.123") |>
    format_nanotime("%F %H:%M:%E3S")
## [1] "2020-05-10 20:10:15.123"
# SQL Server Datetime2 / ODBC SQL_TYPE_TIMESTAMP / SQL_TIMESTAMP
as_datetimeoffset("2020-05-10 20:10:15.1234567") |>
    format_nanotime("%F %H:%M:%E7S")
## [1] "2020-05-10 20:10:15.1234567"
# SQL Server Datetimeoffset / ODBC SQL_SS_TIMESTAMPOFFSET
as_datetimeoffset("2020-05-10 20:10:15.1234567 -07:00") |>
    format_nanotime("%F %H:%M:%E7S %Ez")
## [1] "2020-05-10 20:10:15.1234567 -07:00"

Extended Date Time Format (EDTF)

as_datetimeoffset("2020-10-05T10:10:10") |> format_edtf()
## [1] "2020-10-05T10:10:10"
as_datetimeoffset("2020-XX-05") |> format_edtf()
## [1] "2020-XX-05"
# Lossy EDTF import situations
as_datetimeoffset("20XX-10-10") |> format_edtf()
## [1] "XXXX-10-10"
as_datetimeoffset("2020-10-XX") == as_datetimeoffset("2020-10")
## [1] TRUE
# Extensions to EDTF format
as_datetimeoffset("2020-XX-19T10:XX:10") |>
    format_edtf(precision = "nanosecond", usetz = TRUE)
## [1] "2020-XX-19T10:XX:10.XXXXXXXXX+XX:XX[X]"

Miscellaneous datetimes

as_datetimeoffset("1918/11/11 11:11") |>
    format_strftime(usetz = TRUE)
## [1] "1918-11-11 11:11:00 PST"

Heterogeneous time zones

datetimeoffset() objects support heterogeneous time zones:

# Current time in a number of time zones
datetimeoffset_now(c("America/Los_Angeles", "America/New_York",
                     "Europe/London", "Asia/Shanghai"))
## <datetimeoffset[4]>
## [1] 2022-12-21T18:09:02.748841822-08:00[America/Los_Angeles]
## [2] 2022-12-21T21:09:02.748841822-05:00[America/New_York]   
## [3] 2022-12-22T02:09:02.748841822+00:00[Europe/London]      
## [4] 2022-12-22T10:09:02.748841822+08:00[Asia/Shanghai]
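
Since `datetimeoffset_now()` returns a different value on every call, a reproducible variant may be easier to experiment with. The sketch below builds one vector whose elements share the same local clock time but carry different time zones. The date and zones are arbitrary illustrations, and it assumes the constructor recycles length-one arguments against the longer `tz` argument.

```r
library("datetimeoffset")

# Same local clock time in three different time zones, held in one vector
dts <- datetimeoffset(year = 2020, month = 5, day = 10,
                      hour = 20, minute = 10,
                      tz = c("America/Los_Angeles", "Europe/London",
                             "Asia/Shanghai"))
length(dts)
format(dts)
```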

Augmenting pdf datetime metadata

By default grDevices::pdf() stores the local datetime without any UTC offset information:

library("grid")
library("xmpdf") # remotes::install_github("trevorld/r-xmpdf")

creation_date <- datetimeoffset_now()
print(creation_date)
## <datetimeoffset[1]>
## [1] 2022-12-21T18:09:02.855085140-08:00[America/Los_Angeles]
# Create a two page pdf using `pdf()`
f <- tempfile(fileext = ".pdf")
pdf(f, onefile = TRUE)
grid.text("Page 1")
grid.newpage()
grid.text("Page 2")
Sys.sleep(5L) # sleep to confirm time matches start of `pdf()` call
invisible(dev.off())

di <- xmpdf::get_docinfo(f)[[1]]
print(di)
## Author: NULL
## CreationDate: 2022-12-21T18:09:02
## Creator: R
## Producer: R 4.2.2
## Title: R Graphics Output
## Subject: NULL
## Keywords: NULL
## ModDate: 2022-12-21T18:09:02

We can use {datetimeoffset} with {xmpdf} to augment the embedded datetime metadata to also include the UTC offset information:

di$creation_date <- di$creation_date |>
    set_hour_offset(get_hour_offset(creation_date)) |>
    set_minute_offset(get_minute_offset(creation_date))
di$mod_date <- datetimeoffset_now() # Last modified metadata now
di$subject <- "Augmenting pdf metadata with UTC offsets"

xmpdf::set_docinfo(di, f)
di <- xmpdf::get_docinfo(f)[[1]]
print(di)
## Author: NULL
## CreationDate: 2022-12-21T18:09:02-08:00
## Creator: R
## Producer: GPL Ghostscript 9.55.0
## Title: R Graphics Output
## Subject: Augmenting pdf metadata with UTC offsets
## Keywords: NULL
## ModDate: 2022-12-21T18:09:08-08:00
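
The offset-copying step can also be tried without a pdf file or {xmpdf}. In the sketch below, `local_dt` stands in for an embedded datetime that lacks an offset, and `reference` is a hypothetical datetime whose UTC offset is known. Both strings are illustrative.

```r
library("datetimeoffset")

# A local datetime with no UTC offset information
local_dt <- as_datetimeoffset("2022-12-21T18:09:02")

# Hypothetical reference datetime that carries the offset to copy
reference <- as_datetimeoffset("2022-12-21T18:09:02-08:00")

# Copy the hour and minute offsets from the reference
augmented <- local_dt |>
    set_hour_offset(get_hour_offset(reference)) |>
    set_minute_offset(get_minute_offset(reference))

format_iso8601(local_dt)   # no offset
format_iso8601(augmented)  # offset now included
```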

Features

Comparison with {clock}

Note: Please feel free to open a pull request to fix any {clock} misunderstandings or statements that are now out of date.

{datetimeoffset} is most similar to the excellent {clock} (which {datetimeoffset} uses internally):

Things {clock} can do that {datetimeoffset} can’t do

Things {datetimeoffset} can do that {clock} can’t do

Comparison with {parttime}

Note: Please feel free to open a pull request to fix any {parttime} misunderstandings or statements that are now out of date.

A {datetimeoffset} is also similar to the excellent {parttime}:

Serializing

dts <- datetimeoffset(year = c(2020, 1980), month = c(NA, 10), day = c(15, NA))
format_edtf(dts)
## [1] "2020-XX-15" "1980-10"
# serialize via data frame
df <- vctrs::vec_data(dts)
print(df)
##   year month day hour minute second nanosecond subsecond_digits hour_offset
## 1 2020    NA  15   NA     NA     NA         NA               NA          NA
## 2 1980    10  NA   NA     NA     NA         NA               NA          NA
##   minute_offset   tz
## 1            NA <NA>
## 2            NA <NA>
dts_df <- do.call(datetimeoffset, as.list(df))
all.equal(dts, dts_df)
## [1] TRUE
# serialize via base::serialize() or base::saveRDS()
x <- serialize(dts, NULL) # raw binary vector
dts_x <- unserialize(x)
all.equal(dts, dts_x)
## [1] TRUE
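
The `base::saveRDS()` route mentioned in the comment above is not shown in the example. A minimal sketch of it, using a temporary file whose name is only illustrative:

```r
library("datetimeoffset")

dts <- datetimeoffset(year = c(2020, 1980), month = c(NA, 10), day = c(15, NA))

# Write to an .rds file, read it back, and compare with the original
f <- tempfile(fileext = ".rds")
saveRDS(dts, f)
dts_rds <- readRDS(f)
all.equal(dts, dts_rds)

unlink(f) # remove the temporary file
```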