Chauvenet criterion wrong results #8

@baptistelabat-syroco

Describe the bug
Chauvenet's criterion should flag around one sample of a normally distributed array as an outlier.
Here it flags 30 to 40% of the points as outliers.

To Reproduce
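from pythresh.thresholds.chau import CHAU
import numpy as np
# 99 samples drawn from a standard normal distribution
normal_array = np.random.randn(99)
# Labels returned by the thresholder (1 = outlier, 0 = inlier)
outlier_array = CHAU().eval(normal_array)
print(np.vstack([normal_array, outlier_array]).T)
# Number of points flagged as outliers
print(np.sum(outlier_array))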

Expected behavior
Since the test data is drawn from a normal distribution, only a few points, or none, should be flagged as outliers.
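For comparison, here is a minimal sketch of how I understand the textbook criterion: a point is rejected when the expected number of samples at least that far from the mean, among n draws, is below 0.5. The function name `chauvenet_outliers` is my own and is not part of pythresh; the use of the sample mean and the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np
from scipy import stats


def chauvenet_outliers(values):
    """Return a 0/1 array marking points rejected by Chauvenet's criterion.

    Hypothetical helper for illustration only, not the pythresh implementation.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    # Absolute deviation from the sample mean, in units of the sample std.
    z = np.abs(values - values.mean()) / values.std(ddof=1)
    # Two-tailed probability of a deviation at least this large.
    prob = 2.0 * stats.norm.sf(z)
    # Reject when fewer than 0.5 such points are expected among n samples.
    return (n * prob < 0.5).astype(int)


rng = np.random.default_rng(0)
normal_array = rng.standard_normal(99)
labels = chauvenet_outliers(normal_array)
print("flagged:", labels.sum(), "of", labels.size)
```

With this rule the expected number of flagged points in a normal sample is about 0.5, which is nowhere near the 30 to 40% reported above.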

Desktop (please complete the following information):

  • OS: linux
  • Version 24.04 LTS

Additional context
https://www.statisticshowto.com/chauvenets-criterion/
The table on that page can be reproduced with the following code:
from scipy import stats as scipy_stats
n = 99  # sample size
prob_threshold = 1.0 / (2.0 * n)
number_of_tails = 2
threshold = -scipy_stats.norm.isf(1 - prob_threshold/number_of_tails)
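As a quick sanity check, the same formula can be wrapped in a small function and evaluated for a few sample sizes. The helper name `chauvenet_threshold` is hypothetical, and the sample sizes are just the ones I picked for illustration; the formula itself is the one above, rewritten using the symmetry of the normal distribution.

```python
from scipy import stats as scipy_stats


def chauvenet_threshold(n, number_of_tails=2):
    """Critical |z| beyond which a point is rejected for a sample of size n.

    Hypothetical helper; equivalent to
    -norm.isf(1 - prob_threshold / number_of_tails).
    """
    prob_threshold = 1.0 / (2.0 * n)
    return scipy_stats.norm.isf(prob_threshold / number_of_tails)


for n in (5, 10, 50, 100):
    print(n, round(chauvenet_threshold(n), 2))
```

For n = 10 the per-tail probability is 0.025, which gives the familiar 1.96; for n = 100 the threshold is about 2.81.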
