Thermodynamics integration

Geckopy provides an integration layer with pytfa to run Thermodynamic Flux Analysis with an enzyme constraint model.

from pathlib import Path

import geckopy
import numpy as np
import pandas as pd
import pytfa

from cobra.flux_analysis.variability import flux_variability_analysis
from geckopy.integration.pytfa import (
    adapt_gecko_to_thermo,
    translate_model_mnx_to_seed,
)
from geckopy.experimental import from_copy_number
from geckopy.experimental.relaxation import apply_proteomics_relaxation
from geckopy.experimental.molecular_weights import extract_proteins

from pytfa.io import load_thermoDB
from pytfa.io.plotting import plot_fva_tva_comparison
from pytfa.optim.variables import DeltaG,DeltaGstd,ThermoDisplacement
from pytfa.analysis import  variability_analysis
DATA = Path.cwd().parent / "tests" / "data"

Load the model

Load the enzyme constraint model and transform it to a thermo model (taking into account the proteins).

ec_model = geckopy.io.read_sbml_ec_model(DATA / "ec_coli_core.xml", hardcoded_rev_reactions=False)

We fix the glucose exchange rate so that the FVA runs are more comparable between each other.

ec_model.slim_optimize()
ec_model.reactions.EX_glc__D_e.bounds = ec_model.reactions.EX_glc__D_e.flux, ec_model.reactions.EX_glc__D_e.flux

Geckopy to pytfa adapter. There is a translation of ids although in this case the model already contains seed identifiers.

thermodb = load_thermoDB(DATA / "thermo_data.thermodb")
compartment_data = pytfa.io.read_compartment_data(
    str(DATA / "compartment_data.json")
)
# ranslate the mnx ids in the model to seed ids
translate_model_mnx_to_seed(
    ec_model, thermodb, DATA / "chem_xref_seedset.tsv"
)
tmodel = adapt_gecko_to_thermo(
    ec_model, thermodb, compartment_data, solver="optlang-glpk"
)
tmodel.print_info()
2021-05-21 17:30:42,353 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:30:42,358 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:30:42,406 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:30:42,434 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:30:43,134 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:30:43,135 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:30:43,139 - thermomodel_ - INFO - # cobra_model variables are up-to-date
                value
key
name
description
num constraints   631
num variables     763
num metabolites    72
num reactions      95
                           value
key
num metabolites(thermo)      127
num reactions(thermo)         73
pct metabolites(thermo)  176.389
pct reactions(thermo)    76.8421

Check that the model works.

solution = tmodel.optimize()
solution
Optimal solution with objective value 0.374
fluxes reduced_costs
PFK 9.627754 None
PFL 0.000000 None
PGI 9.923283 None
PGK -19.005185 None
PGL 0.000000 None
... ... ...
NADH16 30.034558 None
NADTRHD 0.000000 None
NH4t 2.040601 None
O2t 15.017279 None
PDH 16.118563 None

95 rows × 2 columns

Flux variability analysis

Calculate variability analysis on all continuous variables

Plain FVA

We use the FVA from pytfa for consistency here (which requires the model to have and logger property).

Since this model does not have splitted reactions (so that proteins are reactants in both directions), we don’t need to take additional measures to make the different runs comparable. See flux_variability_analysis() that takes that into account if that is not the case.

import logging
# avoid loading the screen with logs
logging.basicConfig(filename='docs03.log', level=logging.DEBUG)
ec_model.logger = logging.getLogger(__name__)
fva_fluxes = variability_analysis(ec_model.copy(), kind="reaction")
minimizing: 100%|██████████| 95/95 [00:00<00:00, 810.72it/s]
maximizing: 100%|██████████| 95/95 [00:00<00:00, 859.12it/s]

Thermo FVA

tva_fluxes = variability_analysis(tmodel, kind="reaction")
2021-05-21 17:30:45,685 - thermomodel_ - INFO - Beginning variability analysis for variable of type reaction
minimizing: 100%|██████████| 95/95 [00:27<00:00,  3.42it/s]
maximizing: 100%|██████████| 95/95 [00:22<00:00,  4.26it/s]

Thermo concentration proteomics FVA

We can apply both proteomics and thermodynamics (which may include metabolomics) constraints.

This will usually require a relaxation of the experimental constraints or the thermodynamic constraint, see the relaxation chapter for a detailed explanation.

raw_proteomics = pd.read_csv(DATA / "ecoli_proteomics_schmidt2016S5.tsv")
ec_model_constrained = from_copy_number(
    ec_model.copy(),
    index=raw_proteomics["uniprot"],
    cell_copies=raw_proteomics["copies_per_cell"],
    stdev=raw_proteomics["stdev"],
    vol=2.3,
    dens=1.105e-12,
    water=0.3,
)
tmodel_prot = adapt_gecko_to_thermo(
    ec_model_constrained, thermodb, compartment_data, solver="optlang-glpk"
)
# constrain the model objective so that the feashibility relaxation recovers growth
tmodel_prot.reactions.BIOMASS_Ecoli_core_w_GAM.lower_bound = solution.objective_value
iis, status = geckopy.integration.relax_thermo_proteins(
    tmodel_prot,
    prot_candidates=[prot.id for prot in tmodel_prot.proteins],
    objective_rule=geckopy.experimental.relaxation.Objective_rule.MIN_ELASTIC_SUM
)
print(f"IIS (status '{status}'): {iis}")
2021-05-21 17:31:36,086 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:31:36,090 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:31:36,138 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:31:36,165 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:31:36,867 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:31:36,868 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:31:36,871 - thermomodel_ - INFO - # cobra_model variables are up-to-date
adding thermo slacks: 100%|██████████| 73/73 [00:00<00:00, 308.11it/s]
adding protein slacks: 100%|██████████| 55/55 [00:00<00:00, 421.43it/s]
IIS (status 'optimal'): {'prot_P0AB71', 'prot_P00370', 'prot_P0A9P0', 'prot_P0A9N4', 'prot_P25516', 'prot_P21599', 'prot_P0A9C5', 'prot_P0A9B2', 'prot_P06999', 'prot_P37689', 'prot_P06959', 'prot_P0AFG8', 'prot_P0A6P9', 'prot_P33221', 'prot_P0A867'}

Since this relaxation is an in-place operation, we need to first rebuild the model.

In this case, all of the problematic constraints are proteins, so we will simply remove their concentrations for illustration purposes.

ec_model_constrained = from_copy_number(
    ec_model.copy(),
    index=raw_proteomics["uniprot"],
    cell_copies=raw_proteomics["copies_per_cell"],
    stdev=raw_proteomics["stdev"],
    vol=2.3,
    dens=1.105e-12,
    water=0.3,
)
tmodel_prot = adapt_gecko_to_thermo(
    ec_model_constrained, thermodb, compartment_data, solver="optlang-glpk"
)
for prot_id in iis:
    tmodel_prot.proteins.get_by_id(prot_id).add_concentration(None)
2021-05-21 17:31:49,759 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:31:49,763 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:31:49,810 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:31:49,838 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:31:50,545 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:31:50,546 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:31:50,548 - thermomodel_ - INFO - # cobra_model variables are up-to-date
tmodel_prot.slim_optimize()
0.3742298749331099
# Perform variability analysis again
tva_fluxes_prot = variability_analysis(tmodel_prot, kind='reactions')
2021-05-21 17:31:52,185 - thermomodel_ - INFO - Beginning variability analysis for variable of type reactions
minimizing: 100%|██████████| 95/95 [00:40<00:00,  2.35it/s]
maximizing: 100%|██████████| 95/95 [08:00<00:00,  5.06s/it]

Plotting

Similar to Figure 5 B of Sánchez et al., 2017.

import plotly.express as px
dfs = (fva_fluxes, tva_fluxes, tva_fluxes_prot)

Tidy up the data to plot it with plotly express.

df_plot = pd.concat([abs(df.maximum - df.minimum).apply(lambda x: x if x < 2000 else 2000) for df in dfs], axis=1).melt(var_name="Variability method", value_name="Flux")
df_plot.loc[df_plot["Variability method"] == 0, "Variability method"] = "FBA"
df_plot.loc[df_plot["Variability method"] == 1, "Variability method"] = "Thermo"
df_plot.loc[df_plot["Variability method"] == 2, "Variability method"] = "Thermo + proteins"
fig = px.histogram(
    df_plot,
    x="Flux", color="Variability method",
    cumulative=True, nbins=200,
    color_discrete_sequence=["rgba(26, 200, 26, 0.8)", "rgba(200, 26, 80, 0.5)", "rgba(26, 26, 200, 0.5)"],
    marginal="violin", barmode="overlay",
)
fig.show(renderer="notebook_connected")