-
Notifications
You must be signed in to change notification settings - Fork 22
Open
Description
Hi! I really like MICOM! But I am having a problem when trying to perform the Taxa Knockout Analysis
Problem description
I want to calculate the log fold change of the growth rates when performing the Taxa Knockouts. Currently, one can obtain one of three possible outputs when using com.knockout_taxa().
- using
method='raw'-> new: the new growth rates for the taxa in the community without the knocked taxon. - using
method=change` -> new - old: the absolute change in the growth rates for the new community without the knocked taxon with respect to the original community. - using
method=relative change` -> (new - old)/old: the relative change in the growth rates for the new community without the knocked taxon with respect to the original community.
What I actually want for the log fold change is something like log(new) - log(old) (or log(new/old), which is the same). So I attempted two things for trying to achieve this:
- Run
com.knockout_taxa()withmethodequal torawandchange, and then calculateold(the growth rates for the original community) as follows old = change - new (now that I know bothnewandoldI can easily calculate the log fold change using either of the expressions above). The problem is that, when I checked the values foroldfor all my samples, I consistently found some troubling values for some of the knocked taxa (always in one of the first three taxa in my matrix).
com = load_pickle('mouse_01'.pickle)
diet = load_qiime_medium("vmh_high_fiber.qza")
com.medium = diet.flux
ko_raw = com.knockout_taxa(fraction=0.6, method='raw', diag=False)
ko_change = com.knockout_taxa(fraction=0.6, method='change', diag=False)
old = ko_raw - ko_change
print(old) B_caccae B_cellulosilyticus_WH2 B_ovatus \
B_caccae NaN 0.045237 -0.023992
B_cellulosilyticus_WH2 0.000556 NaN 0.000966
B_ovatus 0.000556 0.037356 NaN
B_thetaiotaomicron 0.000556 0.037356 0.000986
B_uniformis 0.000556 0.037356 0.000986
B_vulgatus 0.000556 0.037356 0.000986
C_aerofaciens 0.000556 0.037356 0.000986
C_scindens 0.000556 0.037356 0.000986
C_spiroforme 0.000556 0.037356 0.000986
P_distasonis 0.000556 0.037356 0.000986
R_obeum 0.000556 0.037356 0.000986
B_thetaiotaomicron B_uniformis B_vulgatus \
B_caccae 0.066630 -0.002823 0.106982
B_cellulosilyticus_WH2 0.067358 0.000809 0.106279
B_ovatus 0.067358 0.000795 0.106279
B_thetaiotaomicron NaN 0.000795 0.106279
B_uniformis 0.067358 NaN 0.106279
B_vulgatus 0.067358 0.000795 NaN
C_aerofaciens 0.067358 0.000795 0.106279
C_scindens 0.067358 0.000795 0.106279
C_spiroforme 0.067358 0.000795 0.106279
P_distasonis 0.067358 0.000795 0.106279
R_obeum 0.067358 0.000795 0.106279
...
C_scindens 0.001470
C_spiroforme 0.001470
P_distasonis 0.001470
R_obeum NaN
As you can see, the results for old should be a matrix where the values for all the rows in a column should be the same, and it it true for most of the taxa but it fails for the first 3 taxa. This happens consistently across my 15 samples and even when using different media.
- So what I tried to do to verify what was happening was to clone the repo and modify it locally, so that when I ran
com.knockout_taxa()it returns me not only the new growth rates (when usingmethod=raw) but also the old values. The and, something that is also strange for me is that the values I am obtaining are quite different to the previous calculation (1.).
def knockout_taxa(community, taxa, fraction, method, progress, diag=True):
"""Knockout a taxon from the community."""
with community as com:
check_modification(com)
min_growth = _format_min_growth(0.0, com.taxa)
_apply_min_growth(com, min_growth)
com.objective = com.scale * com.variables.community_objective
community_min_growth = (
optimize_with_retry(com, "could not get community growth rate.") / com.scale
)
regularize_l2_norm(com, fraction * community_min_growth)
old = com.optimize().members["growth_rate"]
results = []
iter = track(taxa, description="Knockouts") if progress else taxa
for sp in iter:
with com:
logger.info("getting growth rates for %s knockout." % sp)
[
r.knock_out()
for r in com.reactions.query(lambda ri: ri.community_id == sp)
]
sol = optimize_with_fraction(com, fraction)
new = sol.members["growth_rate"]
if "change" in method:
new = new - old
if "relative" in method:
new /= old
results.append(new)
old = old.drop("medium", axis=1)
ko = pd.DataFrame(results, index=taxa).drop("medium", axis=1)
ko = ko.loc[ko.index.sort_values(), ko.columns.sort_values()]
if not diag:
np.fill_diagonal(ko.values, np.NaN)
return ko, oldcom = load_pickle('mouse_01'.pickle)
diet = load_qiime_medium("vmh_high_fiber.qza")
com.medium = diet.flux
new, old = com.knockout_taxa(fraction=0.6, method='raw', diag=False)
print(old) growth_rate
B_caccae 0.0135695707228535
B_cellulosilyticus_WH2 0.0272505030167579
B_ovatus 0.0114618034691539
B_thetaiotaomicron 0.119902654224808
B_uniformis 0.0132031895100882
B_vulgatus 0.105075269656302
C_aerofaciens 0.00680013487305695
C_scindens 0.00386220655742489
C_spiroforme 0.0250118398731469
P_distasonis 0.058633920577937
R_obeum 0.0226188024707752
Context
Details
Package Information
-------------------
micom 0.33.2
Dependency Information
----------------------
cobra 0.29.0
highspy missing
jinja2 3.1.3
osqp 0.6.5
scikit-learn 1.2.2
scipy 1.11.4
symengine 0.11.0
Build Tools Information
-----------------------
pip 23.3
setuptools 68.0.0
wheel 0.41.2
Platform Information
--------------------
Linux 6.5.0-26-generic-x86_64
CPython 3.10.13
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels