-
Notifications
You must be signed in to change notification settings - Fork 5
Open
Labels
Ranything related to R programming languageanything related to R programming languageoptimizationThis works, but could've been betterThis works, but could've been better
Description
Hi all,
I have created a function that performs a last observation carried forward that takes account of how long ago the last observation was recorded. Code for the function below.
library(tibble)
library(lubridate)
library(zoo)
library(testthat)
locf_window <- function(x, by, date, window, unit = "hours"){
# Perform last observation carried forward (LOCF) based on the time difference
# to the last measured observation. Allow for stratification by identifier
# (e.g.patient ID)
#
# Parameters
# x : character or numeric
# vector of measurements on which to perform LOCF
# by : character or numeric
# vector indicating the group to stratify by
# date : datetime
# vector of times at which the value in 'x' was attempted to be measured.
# window : numeric
# length of the time window
# units : character
# units of the time window
if (is.character(x)) {
placeholder <- "-Infinity"
} else if (is.numeric(x)) {
placeholder <- -Inf
} else {
stop("vector 'x' must either be character or numeric")
}
x <- if_else(is.na(x) & by != lag(by), placeholder, x)
date_measure <- as_datetime(ifelse(!is.na(x), date, NA))
date_measure <- zoo::na.locf(date_measure)
n_measure <- unlist(tapply(!is.na(x), by, cumsum))
date_measure <- as_datetime(ifelse(n_measure != 0, date_measure, NA))
x <- if_else(is.na(x) & !is.na(date_measure) &
time_length(lag(date_measure) %--% date, unit = unit) < window,
zoo::na.locf(x, na.rm = FALSE), x, x)
x[x == placeholder] <- NA
x
}
test_locf <- tribble(
~patid, ~start_date , ~value,
1, ymd_hms("2010-01-05 12:00:00"), 5,
1, ymd_hms("2010-01-05 13:00:00"), NA,
1, ymd_hms("2010-01-05 15:59:59"), NA,
1, ymd_hms("2010-01-05 17:00:00"), NA,
1, ymd_hms("2010-01-05 18:00:00"), 10,
2, ymd_hms("2010-01-05 13:00:00"), NA,
2, ymd_hms("2010-01-05 14:00:00"), NA,
2, ymd_hms("2010-01-05 15:00:00"), 2,
2, ymd_hms("2010-01-05 15:31:01"), NA,
2, ymd_hms("2010-01-06 16:00:00"), NA
) %>% as.data.table()
with(test_locf, {
expect_identical(locf_window(value, start_date, 4, by = patid), c(5, 5, 5, NA, 10, NA, NA, 2, 2, NA))
expect_error(locf_window(list(2), ymd_hms("2010-01-06 16:00:00"), 4, by = patid))
})
The function generally seems to work, but it seems overly complicated. Does anyone have an idea of how to simplify or speed up the function?
P
Metadata
Metadata
Assignees
Labels
Ranything related to R programming languageanything related to R programming languageoptimizationThis works, but could've been betterThis works, but could've been better