Skip to content

Conversation

@Gero1999
Copy link
Contributor

@Gero1999 Gero1999 commented Jun 10, 2025

Makes PKNCAdata$units table able to include any group variable stratifying the units in PKNCAconc / PKNCAdose .

Implementation Summary:

  • Added pknca_units_table_from_pknca. Creates a default units table based on a PKNCAconc object (and optionally PKNCAdose as well).

  • Introduced helper functions ensure_column_unit_exists and select_minimal_grouping_cols to support unit column validation and minimal grouping column selection.

  • Updated PKNCAdata.default to replace manual unit table construction with the new pknca_units_table_from_pknca function.

  • Added tests for pknca_units_table_from_pknca to verify its behavior with various scenarios, including:

  • Added new imports (dplyr, rlang, tidyr, utils) and exported pknca_units_table_from_pknca to make the new functionality accessible.

@Gero1999
Copy link
Contributor Author

hey @billdenney I adapted some things we did on aNCA to include it as default behaviour in PKNCAdata. This should flexibilize the reading of unit columns in PKNCAconc (and PKNCAdose) related to #416. I added also tests to make sure this is translated into the results, but let me know if I miss something!

I also added some #TODO comments of the main changes I needed to do to adapt the function to PKNCA

@Gero1999 Gero1999 marked this pull request as ready for review June 16, 2025 12:34
@Gero1999 Gero1999 changed the title Allow custom non-unique units (PPORRESU) based on PKNCAconc & PKNCAdose obj Allow non-unique units columns (PKNCAconc & PKNCAdose) Jun 27, 2025
@billdenney
Copy link
Member

Instead of creating a new pknca_units_table_from_pknca() function, can you please make pknca_units_table() to be S3 generic with something like:

pknca_units_table <- function(concu, ...) {
  UseMethod("pknca_units_table")
}

pknca_units_table.default <- function(concu, doseu, amountu, timeu,
                              concu_pref = NULL, doseu_pref = NULL, amountu_pref = NULL, timeu_pref = NULL,
                              conversions = data.frame(), ...) {
  # the current function
}

pknca_units_table.PKNCAdata <- function(concu, ... conversions = data.frame()) {
  # Your function with conversions handled, too.
}

all_unit_cols <- c(concu_col, amountu_col, timeu_col, doseu_col)

# Join dose units with concentration group columns and units
d_concu <- o_conc$data %>%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use base R for readability:

unit_cols <- c(group_conc_cols, concu_col, amountu_col, timeu_col)
d_concu <- as.data.frame(o_conc)
d_concu <- unique(d_concu[, intersect(names(d_concu), unit_cols)])
unit_cols <- c(group_conc_cols, doseu_col)
d_doseu <- as.data.frame(o_dose)
d_doseu <- unique(d_doseu[, intersect(names(d_doseu), unit_cols)])

groups_units_tbl <- merge(d_concu, d_doseu, all.x = TRUE) %>%
dplyr::mutate(dplyr::across(dplyr::everything(), ~ as.character(.))) %>%
dplyr::group_by(!!!rlang::syms(group_conc_cols)) %>%
tidyr::fill(!!!rlang::syms(all_unit_cols), .direction = "downup") %>%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is fill necessary here? It seems like it could go against the user's intent as they don't have units in their source data.


groups_units_tbl <- merge(d_concu, d_doseu, all.x = TRUE) %>%
dplyr::mutate(dplyr::across(dplyr::everything(), ~ as.character(.))) %>%
dplyr::group_by(!!!rlang::syms(group_conc_cols)) %>%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use dplyr::grouped_df() instead of indirection via rlang::syms().

unique()

groups_units_tbl <- merge(d_concu, d_doseu, all.x = TRUE) %>%
dplyr::mutate(dplyr::across(dplyr::everything(), ~ as.character(.))) %>%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to convert everything to character? I think that it should all be character from the original input. If it is not (perhaps it's a factor), then we should likely handle factor conversion elsewhere. This seems like it is possibly hiding a data problem from the user (possibly causing a hard-to-trace bug in the future) and should be corrected in source data rather than here.

stop(
"Units should be uniform at least across concentration groups. ",
"Review the units for the next group(s):\n",
paste(utils::capture.output(print(mismatching_units_groups)), collapse = "\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than capture.output, please directly generate the desired string like:

do.call(paste, lapply(X = names(d), FUN = function(x) paste(x, d[[x]], sep = "=")))

units.are.all.na <- all(is.na(groups_units_tbl[,all_unit_cols]))
if (units.are.all.na) return(NULL)

groups_units_tbl %>%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please convert this to a loop for readability (it will only likely be called a few times, so it should not be a performance issue).

#' 2. If a column does not exist, it creates the column and assigns default values.
#' 3. If not default values are provided, it assigns NA to the new column.
#' @keywords Internal
ensure_column_unit_exists <- function(pknca_obj, unit_name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is already handled as part of the pknca_set_units() function. Can this function be eliminated, and if not, can you please explain why not?

}

# Add globalVariables for NSE/dplyr/rlang/tidyr usage
utils::globalVariables(c(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use double colon notation or rlang::.data rather than globalVariables()

@billdenney
Copy link
Member

Please also merge the current main branch upon making the above changes.

@Gero1999 Gero1999 marked this pull request as draft November 24, 2025 21:31
…PKNCAdata (conflicts with no real issue in unit-support.R & NAMESPACE)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants