log 0 and multiplicity of the p-value #3

@antpiron

Hi,

I have observed that, on large lists, the p-value underflows to exactly zero. In this case, the package takes log 0, which gives an infinite result.
Here is code to reproduce the issue:

library(RRHO)

df <- data.frame(gene = 1:5000, a = 1:5000, b = 1:5000)

RRHO_obj <-  RRHO::RRHO(df[, c("gene", "a")], df[, c("gene", "b")],
                        labels = c("a", "b"), alternative = "two.sided", plots = TRUE,
                        outputdir = "/tmp/")

The overlap map RRHOMapa_VS_b.jpg shows a large white patch where the values are infinite. I think it would be reasonable to replace log 0 with the log of 4.94065645841247E-324, the smallest double greater than 0 (see https://learn.microsoft.com/en-us/dotnet/api/system.double.epsilon?view=net-7.0 ).
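A minimal sketch of the clamping idea, assuming the p-values are available as a plain numeric vector or matrix before the log is taken. The helper name `safe_log_p` and its `floor` argument are hypothetical and not part of RRHO; `2^-1074` is the smallest positive (subnormal) double, about 4.94e-324.

```r
# Hypothetical helper, not part of RRHO: floor p-values before taking the log
# so that an underflowed p-value of exactly 0 no longer produces -Inf.
safe_log_p <- function(p, floor = 2^-1074) {
  # pmax() works elementwise and keeps matrix dimensions
  log(pmax(p, floor))
}

p <- c(0.05, 1e-300, 0)
plain   <- log(p)         # last element is -Inf
clamped <- safe_log_p(p)  # last element is finite, about -744.44
```

Values above the floor are unchanged, so only the cells that underflowed are affected.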

Another issue is that the minimal p-value is sometimes not unique. In this case, the coordinate returned by

maxind.lr  <- which(
    max(hypermat.signed[1:(ceiling(nrow(hypermat.signed)/2)-1), 
                   1:(ceiling(ncol(hypermat.signed)/2)-1)],
        na.rm=TRUE) == hypermat.signed, arr.ind=TRUE)

is wrong because maxind.lr is then a matrix with one row per tied cell (not a single coordinate pair), and maxind.lr[2] is the x of the second coordinate (not the y of the first).
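The indexing problem can be reproduced on a toy matrix. This is only an illustration with made-up values (`m`, `idx` and `first` are names chosen here), and keeping the first tied cell is one possible choice, not necessarily the one the package should make.

```r
# Toy matrix with a tied maximum at [1, 1] and [2, 2]
m <- matrix(c(9, 1,
              1, 9), nrow = 2)

idx <- which(m == max(m, na.rm = TRUE), arr.ind = TRUE)
# idx has one row per tied cell, with columns "row" and "col":
#      row col
# [1,]   1   1
# [2,]   2   2

# Linear indexing walks down the first column of idx,
# so idx[2] is the row of the SECOND match, not the column of the first.
wrong_col <- idx[2]

# Selecting a whole row keeps the (row, col) pair together.
first <- idx[1, ]
right_row <- first[["row"]]
right_col <- first[["col"]]
```

Indexing by row and column name also works when there is a single match, because `which(..., arr.ind = TRUE)` on a matrix returns a two-column matrix in that case as well.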

Thank you for your package.
