fastymd is a package for working with Year-Month-Day (YMD) style date objects. It provides extremely fast passing of character strings and numeric values to date objects as well as fast decomposition of these in to their year, month and day components. The underlying algorithms follow the approach of Howard Hinnant for calculating days from the UNIX Epoch of Gregorian Calendar dates and vice versa.
The API won’t give any surprises:
library(fastymd)
cdate <- c("2025-04-16", "2025-04-17")
(res <- fymd(cdate))
#> [1] "2025-04-16" "2025-04-17"
res == as.Date(cdate)
#> [1] TRUE TRUE
get_ymd(res)
#> year month day
#> 1 2025 4 16
#> 2 2025 4 17
fymd(2025, 4, 16) == res[1L]
#> [1] TRUE
Invalid dates will return NA
and a warning:
fymd(2021, 02, 29) # not a leap year
#> NAs introduced due to invalid month and/or day combinations.
#> [1] NA
More interesting is the handling of output after a valid date. Consider the following timestamp:
timelt <- as.POSIXlt(Sys.time(), tz = "UTC")
(timestamp <- strftime(timelt , "%Y-%m-%dT%H:%M:%S%z"))
#> [1] "2025-05-12T19:11:06+0000"
By default the time element is ignored:
(res <- fymd(timestamp))
#> [1] "2025-05-12"
res == as.Date(timestamp, tz = "UTC")
#> [1] TRUE
This ignoring of the timestamp is both good and bad. For timestamps it makes
perfect sense, but perhaps you have simple dates and a concern that some are
corrupted. For these we can use the strict
argument:
cdate <- "2025-04-16nonsense "
fymd(cdate)
#> [1] "2025-04-16"
fymd(cdate, strict = TRUE)
#> NAs introduced due to invalid date strings.
#> [1] NA
The character method of fymd()
parses input strings in a fixed, year, month
and day order. These values must be digits but can be separated by any non-digit
character. This is similar in spirit to the fastDate()
function in Simon
Urbanek’s fasttime package, using
pure text parsing and no system calls for maximum speed.
For extremely fast passing of POSIX style timestamps you will struggle to beat the performance of fasttime. This works fantastically for timestamps that do not need validation and are within the date range supported by the package (currently 1970-01-01 through to the year 2199).
fymd()
fills the, admittedly small, niche where you want fast parsing of YMD
strings along with date validation and support for a wider range of dates from
the Proleptic Gregorian calendar
(currently we support years in the range [-9999, 9999]
). This additional
capability does come with a small performance penalty but, hopefully, this has
been kept to a minimum and the implementation remains competitive.
library(microbenchmark)
# 1970-01-01 (UNIX epoch) to "2199-01-01"
dates <- seq.Date(from = .Date(0), to = fymd("2199-01-01"), by = "day")
# comparison timings for fymd (character method)
cdates <- format(dates)
(res_c <- microbenchmark(
fasttime = fasttime::fastDate(cdates),
fastymd = fymd(cdates),
ymd = ymd::ymd(cdates),
lubridate = lubridate::ymd(cdates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fasttime 528.702 533.631 560.6187 536.527 540.419 1939.710 100
#> fastymd 775.045 780.199 794.6313 784.697 788.530 1130.412 100
#> ymd 4444.390 4487.957 4607.3698 4507.409 4555.168 5968.801 100
#> lubridate 4956.522 5065.686 6075.4734 5176.645 6562.420 37051.120 100
# comparison timings for fymd (numeric method)
ymd <- get_ymd(dates)
(res_n <- microbenchmark(
fastymd = fymd(ymd[[1]], ymd[[2]], ymd[[3]]),
lubridate = lubridate::make_date(ymd[[1]], ymd[[2]], ymd[[3]]),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 373.440 375.765 425.1393 378.861 381.897 2289.366 100
#> lubridate 534.272 542.949 660.0808 547.262 552.196 2261.223 100
# comparison timings for year getter
(res_get_year <- microbenchmark(
fastymd = get_year(dates),
ymd = ymd::year(dates),
lubridate = lubridate::year(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 483.708 497.4185 567.4244 501.1405 506.2355 2148.081 100
#> ymd 498.245 505.6640 521.0937 510.2580 514.2100 775.766 100
#> lubridate 7593.239 7605.0920 7891.8327 7618.5575 7650.4620 9812.684 100
# comparison timings for month getter
(res_get_month <- microbenchmark(
fastymd = get_month(dates),
ymd = ymd::month(dates),
lubridate = lubridate::month(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 449.674 465.243 569.8989 468.5995 473.1580 2142.421 100
#> ymd 532.589 536.547 550.2292 539.6225 543.1995 805.552 100
#> lubridate 8202.122 8243.695 8996.7768 8268.0460 9351.0135 40174.112 100
# comparison timings for mday getter
(res_get_mday <- microbenchmark(
fastymd = get_mday(dates),
ymd = ymd::mday(dates),
lubridate = lubridate::day(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 451.738 462.253 522.9238 465.584 469.496 1871.983 100
#> ymd 535.996 539.823 558.3011 541.927 545.193 1798.335 100
#> lubridate 7530.001 7550.941 7771.7465 7564.346 7592.257 9609.052 100