Description
Enter the chapter number
3.3
Enter the page number
184 to 202
What is the cell's number in the notebook
No response
Enter the environment you are using to run the notebook
None
Question
Hello
Second question:
It seems there are two schools of thought in statistics about how to validate a model. The kind of "traditional" approach I was taught at school: make hypotheses about the data (most often normality), deduce the distribution of the predictor, and from that deduce the 9x% confidence interval. This approach has never really convinced me, because the normality hypothesis is often made without serious justification, simply because it is the only way to compute the confidence interval.
The "machine learning" approach described in the book is, on the contrary, completely empirical: you say "my model works, because I checked that it works on a test set". So you don't need any debatable hypothesis about the distribution of the data. At first glance this looks more rigorous than the first approach. But when you do this, you are actually estimating the unknown performance score your model would get on your entire universe (in the statistical sense) by computing the score on your test sample. So you are still doing basic statistical inference. And whoever says "statistical inference" says "confidence interval": you have to be sure that your test sample is representative of your universe.
That means that when you compute a ROC AUC or a precision/recall curve, these curves should come with a confidence interval around them. When you say "with this threshold, the precision is p%", you should give a CI around that p%.
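For instance, for the precision at a fixed threshold I imagine something like a binomial normal-approximation interval; here is a rough sketch with made-up counts (this is just my illustration, not something taken from the book):

```python
# Rough sketch of the kind of interval I mean for a single number:
# precision at a fixed threshold is a proportion TP / (TP + FP), so a
# normal-approximation (Wald) interval can be computed directly.
# The TP/FP counts below are made-up placeholders.
import math

tp, fp = 420, 80                 # placeholder confusion-matrix counts
n_pred_pos = tp + fp             # number of predicted positives
p_hat = tp / n_pred_pos          # observed precision on the test set
z = 1.96                         # ~95% two-sided normal quantile
half_width = z * math.sqrt(p_hat * (1 - p_hat) / n_pred_pos)
print(f"precision = {p_hat:.3f} +/- {half_width:.3f}")
```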
How do you deal with that? Is there a way to compute this CI?
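For example, is something like a bootstrap over the test set the right way to do it? A minimal sketch of what I have in mind (the data below is synthetic and only a placeholder; in practice `y_test` / `y_scores` / `y_pred` would come from the real model):

```python
# Minimal sketch: percentile bootstrap over the test set to get a ~95% CI
# around a test metric (ROC AUC, precision). Not the book's method, just an
# illustration of the question.
import numpy as np
from sklearn.metrics import roc_auc_score, precision_score

rng = np.random.default_rng(42)

def bootstrap_ci(metric_fn, y_true, y_pred, n_boot=1000, alpha=0.05):
    """Percentile bootstrap CI for a metric evaluated on a test set."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample the test set with replacement
        if len(np.unique(y_true[idx])) < 2:    # skip one-class resamples (AUC undefined)
            continue
        scores.append(metric_fn(y_true[idx], y_pred[idx]))
    lo, hi = np.quantile(scores, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Placeholder test set, just so the sketch runs end to end.
y_test = rng.integers(0, 2, size=1000)
y_scores = np.clip(y_test * 0.6 + rng.normal(0.2, 0.3, size=1000), 0, 1)
y_pred = (y_scores >= 0.5).astype(int)

print("ROC AUC 95% CI:", bootstrap_ci(roc_auc_score, y_test, y_scores))
print("Precision 95% CI:", bootstrap_ci(precision_score, y_test, y_pred))
```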