Hi @alyst,
Thank you very much for your PR, and sorry for the delayed reaction. I agree with your fix for the alpha filtering. However, it is not clear to me what problem you are trying to solve by preserving the matrix dimensions (see also my comment on the diff below). Could you provide an example of the issue that you are solving?
R/topFeatures.R (outdated)
        contrast <- contrast[rowSums(contrast) != 0, , drop = FALSE]
    }

    contrast <- contrast[contrast != 0]
I don't understand what you are trying to solve here. contrast can only contain 1 column, so why does it matter whether you use drop = FALSE?
Also, you moved the filtering of zeros inside the if statement, meaning that zeros are now removed only if contrast is a matrix. I'm not sure whether removing zeros is really necessary, but if it is, it should also happen when contrast is a vector.
> I don't understand what you attempt to solve here.
I can provide the exact errors we get before the fix, but I thought it was quite obvious.
Essentially, the change keeps the row filtering introduced in #55, but keeps contrast a matrix, as it was before #55.
R automatically drops singleton dimensions, so without drop = FALSE the contrast matrix becomes a vector and gets stripped of its row names.
The latter is critical: while methods like getContrast() attempt to convert the L vector back to a matrix, they still fail, because the names are lost.
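The effect described above can be reproduced in base R without msqrob2. The following is a minimal sketch, assuming a hand-built one-column contrast matrix; the extra conditiond row is hypothetical and only there so that the zero filter has something to remove.

```r
## Hand-built 1-column contrast matrix (illustrative; not produced by makeContrast())
L <- matrix(
    c(-1, 1, 0),
    ncol = 1,
    dimnames = list(
        c("conditionb", "conditionc", "conditiond"),
        "conditionc - conditionb"
    )
)

## Indexing with a logical matrix treats L as a plain vector:
## the result has no dimensions and no names
as_vector <- L[L != 0]

## Row filtering with drop = FALSE keeps the matrix shape and the row names
as_matrix <- L[rowSums(L) != 0, , drop = FALSE]

print(names(as_vector))
print(rownames(as_matrix))
```

The first result no longer carries the coefficient names, which is why a later conversion back to a matrix cannot match the contrast to the model coefficients.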
Thanks, this is clear. I can reproduce your issue:
data(pe)
pe <- aggregateFeatures(pe, i = "peptide", fcol = "Proteins", name = "protein")
pe <- msqrob(pe, i = "protein", formula = ~condition)
getCoef(rowData(pe[["protein"]])$msqrobModels[[1]])
## Generate a contrast matrix with only 1 column
L <- makeContrast("conditionc - conditionb=0", c("conditionb", "conditionc"))
topDeProteins <- topFeatures(rowData(pe[["protein"]])$msqrobModels, L)
It returns a table filled with NAs. So, good catch! 🪲
However, regarding your solution, I would rather convert the matrix into a vector first and then perform the filtering, regardless of whether the contrast was originally a vector or a matrix. So here is my suggestion:
...
if (is(contrast, "matrix")) {
    if (ncol(contrast) > 1) {
        stop("Argument contrast is matrix with more than one column, only one contrast is allowed")
    }
    ## convert 1-column matrix into a vector (preserving names)
    contrast <- contrast[, 1]
}
## remove unused coefficients
contrast <- contrast[contrast != 0]
...
What do you think?
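As a quick check of the "preserving names" comment in this suggestion, here is a minimal base-R sketch. It assumes a hand-built contrast matrix with more than one row; the conditiond row is hypothetical.

```r
## Hand-built 1-column contrast matrix (illustrative only)
L <- matrix(
    c(-1, 1, 0),
    ncol = 1,
    dimnames = list(
        c("conditionb", "conditionc", "conditiond"),
        "conditionc - conditionb"
    )
)

## Selecting the single column turns the row names into vector names
v <- L[, 1]

## Filtering a named vector keeps the names of the retained elements
v <- v[v != 0]

print(v)
```

With this ordering (column selection first, zero filtering second) the coefficient names survive, unlike filtering the matrix directly with a logical matrix index.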
I think it's quite standard in R to specify contrasts either as a matrix or as formula(s), so your suggestion would diverge from that practice.
Also, getContrast() and varContrast(), which are called for each model later in the code, will convert the contrast vector back to a matrix, which generates some overhead.
Actually, I was going to suggest that it would be easy for topFeatures() to support multiple contrasts
(every model-contrast pair generates a row in the unfiltered output; p-value adjustments are done per contrast or globally).
That's quite a frequent use case for more complex experimental designs, and handling it in topFeatures() would reduce the overhead of calling getContrast() and varContrast() for each contrast, and save users from writing boilerplate code.
But it would require reverting the matrix-to-vector conversion.
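To make the "per contrast or globally" distinction concrete, here is a minimal base-R sketch. The p-values, feature names, contrast names and the choice of the BH method are all made up for illustration; this is not how topFeatures() is implemented.

```r
## Hypothetical p-values: one row per feature, one column per contrast
pvals <- matrix(
    c(0.001, 0.020, 0.300,
      0.040, 0.500, 0.010),
    ncol = 2,
    dimnames = list(
        c("prot1", "prot2", "prot3"),
        c("contrastA", "contrastB")
    )
)

## Per-contrast adjustment: each column is corrected on its own
adj_per_contrast <- apply(pvals, 2, p.adjust, method = "BH")

## Global adjustment: all feature-contrast pairs are corrected together;
## assigning into adj_global[] keeps the matrix shape and dimnames
adj_global <- pvals
adj_global[] <- p.adjust(pvals, method = "BH")
```

The two schemes generally give different adjusted values, because the global variant corrects for the total number of feature-contrast pairs rather than the number of features in one contrast.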
Ok, again I agree with your reasoning, but then we should enforce contrast to be a matrix. Something like:
...
if (!inherits(contrast, "matrix")) {
    contrast <- as.matrix(contrast)
}
if (ncol(contrast) > 1) {
    stop("Argument contrast is matrix with more than one column, only one contrast is allowed")
}
## remove unused coefficients
contrast <- contrast[contrast != 0, , drop = FALSE]
...
My point is that it doesn't make sense to me to treat a vector and a matrix differently. If we remove zeros, that should happen in both cases. Therefore, I moved the subsetting of zeros outside the if statement.
Regarding testing multiple contrasts, hypothesisTest() (which internally calls topFeatures()) already allows this. The different tables are then stored in the rowData of your input object. But let's discuss this in a dedicated issue if that's not what you meant.
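The matrix-enforcing approach can be tried outside the package with a small stand-alone sketch. The helper name as_contrast_matrix() is hypothetical and not part of msqrob2, the error message is shortened, and the row filter uses the single column explicitly; the conditiond coefficient is made up so that one row is dropped.

```r
## Hypothetical helper (not part of msqrob2): always return a 1-column matrix
as_contrast_matrix <- function(contrast) {
    if (!is.matrix(contrast)) {
        ## a named vector becomes a 1-column matrix with the names as row names
        contrast <- as.matrix(contrast)
    }
    if (ncol(contrast) > 1) {
        stop("only one contrast is allowed")
    }
    ## remove unused coefficients, keeping the matrix shape and row names
    contrast[contrast[, 1] != 0, , drop = FALSE]
}

vec <- c(conditionb = -1, conditionc = 1, conditiond = 0)
mat <- matrix(vec, ncol = 1,
              dimnames = list(names(vec), "conditionc - conditionb"))

from_vec <- as_contrast_matrix(vec)
from_mat <- as_contrast_matrix(mat)
```

Both inputs end up as matrices with the same row names, so the vector and matrix cases are treated identically, which is the point made above.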
Makes sense, I've updated the PR, so that contrast is converted to a matrix.
Once more, thanks a lot @alyst for your contribution!
@cvanderaa Happy to contribute to msqrob2, thank you for the careful in-depth review!
contrast matrix filtering, so that it does not lose dimensions.