Thermodynamics integration¶
Geckopy provides an integration layer with pytfa to run Thermodynamic Flux Analysis with an enzyme constraint model.
from pathlib import Path
import geckopy
import numpy as np
import pandas as pd
import pytfa
from cobra.flux_analysis.variability import flux_variability_analysis
from geckopy.integration.pytfa import (
adapt_gecko_to_thermo,
translate_model_mnx_to_seed,
)
from geckopy.experimental import from_copy_number
from geckopy.experimental.relaxation import apply_proteomics_relaxation
from geckopy.experimental.molecular_weights import extract_proteins
from pytfa.io import load_thermoDB
from pytfa.io.plotting import plot_fva_tva_comparison
from pytfa.optim.variables import DeltaG,DeltaGstd,ThermoDisplacement
from pytfa.analysis import variability_analysis
DATA = Path.cwd().parent / "tests" / "data"
Load the model¶
Load the enzyme constraint model and transform it to a thermo model (taking into account the proteins).
ec_model = geckopy.io.read_sbml_ec_model(DATA / "ec_coli_core.xml", hardcoded_rev_reactions=False)
We fix the glucose exchange rate so that the FVA runs are more comparable between each other.
ec_model.slim_optimize()
ec_model.reactions.EX_glc__D_e.bounds = ec_model.reactions.EX_glc__D_e.flux, ec_model.reactions.EX_glc__D_e.flux
Geckopy to pytfa adapter. There is a translation of ids although in this case the model already contains seed identifiers.
thermodb = load_thermoDB(DATA / "thermo_data.thermodb")
compartment_data = pytfa.io.read_compartment_data(
str(DATA / "compartment_data.json")
)
# ranslate the mnx ids in the model to seed ids
translate_model_mnx_to_seed(
ec_model, thermodb, DATA / "chem_xref_seedset.tsv"
)
tmodel = adapt_gecko_to_thermo(
ec_model, thermodb, compartment_data, solver="optlang-glpk"
)
tmodel.print_info()
2021-05-21 17:30:42,353 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:30:42,358 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:30:42,406 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:30:42,434 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:30:43,134 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:30:43,135 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:30:43,139 - thermomodel_ - INFO - # cobra_model variables are up-to-date
value
key
name
description
num constraints 631
num variables 763
num metabolites 72
num reactions 95
value
key
num metabolites(thermo) 127
num reactions(thermo) 73
pct metabolites(thermo) 176.389
pct reactions(thermo) 76.8421
Check that the model works.
solution = tmodel.optimize()
solution
| fluxes | reduced_costs | |
|---|---|---|
| PFK | 9.627754 | None |
| PFL | 0.000000 | None |
| PGI | 9.923283 | None |
| PGK | -19.005185 | None |
| PGL | 0.000000 | None |
| ... | ... | ... |
| NADH16 | 30.034558 | None |
| NADTRHD | 0.000000 | None |
| NH4t | 2.040601 | None |
| O2t | 15.017279 | None |
| PDH | 16.118563 | None |
95 rows × 2 columns
Flux variability analysis¶
Calculate variability analysis on all continuous variables
Plain FVA¶
We use the FVA from pytfa for consistency here (which requires the model
to have and logger property).
Since this model does not have splitted reactions (so that proteins are
reactants in both directions), we don’t need to take additional measures
to make the different runs comparable. See
flux_variability_analysis() that takes
that into account if that is not the case.
import logging
# avoid loading the screen with logs
logging.basicConfig(filename='docs03.log', level=logging.DEBUG)
ec_model.logger = logging.getLogger(__name__)
fva_fluxes = variability_analysis(ec_model.copy(), kind="reaction")
minimizing: 100%|██████████| 95/95 [00:00<00:00, 810.72it/s]
maximizing: 100%|██████████| 95/95 [00:00<00:00, 859.12it/s]
Thermo FVA¶
tva_fluxes = variability_analysis(tmodel, kind="reaction")
2021-05-21 17:30:45,685 - thermomodel_ - INFO - Beginning variability analysis for variable of type reaction
minimizing: 100%|██████████| 95/95 [00:27<00:00, 3.42it/s]
maximizing: 100%|██████████| 95/95 [00:22<00:00, 4.26it/s]
Thermo concentration proteomics FVA¶
We can apply both proteomics and thermodynamics (which may include metabolomics) constraints.
This will usually require a relaxation of the experimental constraints or the thermodynamic constraint, see the relaxation chapter for a detailed explanation.
raw_proteomics = pd.read_csv(DATA / "ecoli_proteomics_schmidt2016S5.tsv")
ec_model_constrained = from_copy_number(
ec_model.copy(),
index=raw_proteomics["uniprot"],
cell_copies=raw_proteomics["copies_per_cell"],
stdev=raw_proteomics["stdev"],
vol=2.3,
dens=1.105e-12,
water=0.3,
)
tmodel_prot = adapt_gecko_to_thermo(
ec_model_constrained, thermodb, compartment_data, solver="optlang-glpk"
)
# constrain the model objective so that the feashibility relaxation recovers growth
tmodel_prot.reactions.BIOMASS_Ecoli_core_w_GAM.lower_bound = solution.objective_value
iis, status = geckopy.integration.relax_thermo_proteins(
tmodel_prot,
prot_candidates=[prot.id for prot in tmodel_prot.proteins],
objective_rule=geckopy.experimental.relaxation.Objective_rule.MIN_ELASTIC_SUM
)
print(f"IIS (status '{status}'): {iis}")
2021-05-21 17:31:36,086 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:31:36,090 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:31:36,138 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:31:36,165 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:31:36,867 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:31:36,868 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:31:36,871 - thermomodel_ - INFO - # cobra_model variables are up-to-date
adding thermo slacks: 100%|██████████| 73/73 [00:00<00:00, 308.11it/s]
adding protein slacks: 100%|██████████| 55/55 [00:00<00:00, 421.43it/s]
IIS (status 'optimal'): {'prot_P0AB71', 'prot_P00370', 'prot_P0A9P0', 'prot_P0A9N4', 'prot_P25516', 'prot_P21599', 'prot_P0A9C5', 'prot_P0A9B2', 'prot_P06999', 'prot_P37689', 'prot_P06959', 'prot_P0AFG8', 'prot_P0A6P9', 'prot_P33221', 'prot_P0A867'}
Since this relaxation is an in-place operation, we need to first rebuild the model.
In this case, all of the problematic constraints are proteins, so we will simply remove their concentrations for illustration purposes.
ec_model_constrained = from_copy_number(
ec_model.copy(),
index=raw_proteomics["uniprot"],
cell_copies=raw_proteomics["copies_per_cell"],
stdev=raw_proteomics["stdev"],
vol=2.3,
dens=1.105e-12,
water=0.3,
)
tmodel_prot = adapt_gecko_to_thermo(
ec_model_constrained, thermodb, compartment_data, solver="optlang-glpk"
)
for prot_id in iis:
tmodel_prot.proteins.get_by_id(prot_id).add_concentration(None)
2021-05-21 17:31:49,759 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:31:49,763 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:31:49,810 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:31:49,838 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:31:50,545 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:31:50,546 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:31:50,548 - thermomodel_ - INFO - # cobra_model variables are up-to-date
tmodel_prot.slim_optimize()
0.3742298749331099
# Perform variability analysis again
tva_fluxes_prot = variability_analysis(tmodel_prot, kind='reactions')
2021-05-21 17:31:52,185 - thermomodel_ - INFO - Beginning variability analysis for variable of type reactions
minimizing: 100%|██████████| 95/95 [00:40<00:00, 2.35it/s]
maximizing: 100%|██████████| 95/95 [08:00<00:00, 5.06s/it]
Plotting¶
Similar to Figure 5 B of Sánchez et al., 2017.
import plotly.express as px
dfs = (fva_fluxes, tva_fluxes, tva_fluxes_prot)
Tidy up the data to plot it with plotly express.
df_plot = pd.concat([abs(df.maximum - df.minimum).apply(lambda x: x if x < 2000 else 2000) for df in dfs], axis=1).melt(var_name="Variability method", value_name="Flux")
df_plot.loc[df_plot["Variability method"] == 0, "Variability method"] = "FBA"
df_plot.loc[df_plot["Variability method"] == 1, "Variability method"] = "Thermo"
df_plot.loc[df_plot["Variability method"] == 2, "Variability method"] = "Thermo + proteins"
fig = px.histogram(
df_plot,
x="Flux", color="Variability method",
cumulative=True, nbins=200,
color_discrete_sequence=["rgba(26, 200, 26, 0.8)", "rgba(200, 26, 80, 0.5)", "rgba(26, 26, 200, 0.5)"],
marginal="violin", barmode="overlay",
)
fig.show(renderer="notebook_connected")