Owner
|
Hi, I am just going through these older pull requests. Sorry that I hadn't responded before; I hadn't noticed all these pull requests. Unless you disagree, I think this one is still relevant and should be merged into master. Sorry about this; the time you have spent on it is much appreciated. |
Contributor
Author
|
I think it is still relevant, though I implemented it for a project that is long since finished. If it seems like a useful feature, feel free to merge. |
Added an option to calculate the mutual information (MI) instead of Fst, as a choice for `realSFS fst`. MI has nice properties (it is a true metric, it is additive, it satisfies the triangle inequality, etc.) which Fst does not. Some prefer MI to Fst for selection scans, demographics, etc. A reference is here.

The MI is calculated with the `-whichFst 2` flag to `realSFS fst index`. Here, the numerator is the mutual information and the denominator is the joint entropy, so the "global" result with `realSFS fst stats` is the normalized mutual information (aka Shannon differentiation), a metric that is bounded in [0,1]. Like with vanilla Fst, the `print` option prints out the numerator (MI) and denominator (joint entropy) per site. Because of the additivity property, the weighted vs. unweighted distinction is moot. To avoid redundant code, `realSFS fst stats` still labels the output as "Fst" even though it is not (and, with three populations, gives meaningless population branch statistics). However, the initial `realSFS fst index` prints a warning to this effect.

I also changed the formatting for `realSFS fst print` to output numbers with higher precision and to switch to scientific notation if necessary. This avoids annoying round-off errors where both numerator and denominator are small.

Finally, I updated the help message for `realSFS fst` to reflect these updates and also the other options, copying from the wiki where possible.
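
To make the numerator/denominator description concrete, here is a minimal Python sketch of the quantities involved. It is an illustration only and not the code in this pull request: it assumes known allele frequencies at biallelic sites and equal weight for two populations, and the function names (`site_mi_and_joint_entropy`, `global_normalized_mi`) are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def site_mi_and_joint_entropy(f1, f2):
    """Return (MI, joint entropy) for one biallelic site.

    f1, f2 are the frequencies of one allele in populations 1 and 2.
    Assumption for this sketch: both populations get weight 0.5.
    """
    # Joint distribution over (population, allele).
    joint = [
        [0.5 * (1 - f1), 0.5 * f1],
        [0.5 * (1 - f2), 0.5 * f2],
    ]
    h_joint = entropy([p for row in joint for p in row])
    h_pop = entropy([sum(row) for row in joint])
    h_allele = entropy([sum(col) for col in zip(*joint)])
    # MI = H(pop) + H(allele) - H(pop, allele)
    return h_pop + h_allele - h_joint, h_joint


def global_normalized_mi(sites):
    """Ratio of summed numerators to summed denominators across sites."""
    pairs = [site_mi_and_joint_entropy(f1, f2) for f1, f2 in sites]
    return sum(mi for mi, _ in pairs) / sum(h for _, h in pairs)


if __name__ == "__main__":
    # One fixed difference plus one site with identical frequencies.
    print(global_normalized_mi([(0.0, 1.0), (0.5, 0.5)]))
```

In this toy setting the ratio is 0 when the populations have identical frequencies and 1 at a fixed difference, which matches the [0,1] bound described above.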