Thermodynamics integration¶

Geckopy provides an integration layer with pytfa to run Thermodynamic Flux Analysis with an enzyme constraint model.

from pathlib import Path

import geckopy
import numpy as np
import pandas as pd
import pytfa

from cobra.flux_analysis.variability import flux_variability_analysis
from geckopy.integration.pytfa import (
    adapt_gecko_to_thermo,
    translate_model_mnx_to_seed,
)
from geckopy.experimental import from_copy_number
from geckopy.experimental.relaxation import apply_proteomics_relaxation
from geckopy.experimental.molecular_weights import extract_proteins

from pytfa.io import load_thermoDB
from pytfa.io.plotting import plot_fva_tva_comparison
from pytfa.optim.variables import DeltaG,DeltaGstd,ThermoDisplacement
from pytfa.analysis import  variability_analysis

DATA = Path.cwd().parent / "tests" / "data"

Load the model¶

Load the enzyme constraint model and transform it to a thermo model (taking into account the proteins).

ec_model = geckopy.io.read_sbml_ec_model(DATA / "ec_coli_core.xml", hardcoded_rev_reactions=False)

We fix the glucose exchange rate so that the FVA runs are more comparable between each other.

ec_model.slim_optimize()
ec_model.reactions.EX_glc__D_e.bounds = ec_model.reactions.EX_glc__D_e.flux, ec_model.reactions.EX_glc__D_e.flux

Geckopy to pytfa adapter. There is a translation of ids although in this case the model already contains seed identifiers.

thermodb = load_thermoDB(DATA / "thermo_data.thermodb")
compartment_data = pytfa.io.read_compartment_data(
    str(DATA / "compartment_data.json")
)
# ranslate the mnx ids in the model to seed ids
translate_model_mnx_to_seed(
    ec_model, thermodb, DATA / "chem_xref_seedset.tsv"
)
tmodel = adapt_gecko_to_thermo(
    ec_model, thermodb, compartment_data, solver="optlang-glpk"
)
tmodel.print_info()

2021-05-21 17:30:42,353 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:30:42,358 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:30:42,406 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:30:42,434 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:30:43,134 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:30:43,135 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:30:43,139 - thermomodel_ - INFO - # cobra_model variables are up-to-date

                value
key
name
description
num constraints   631
num variables     763
num metabolites    72
num reactions      95
                           value
key
num metabolites(thermo)      127
num reactions(thermo)         73
pct metabolites(thermo)  176.389
pct reactions(thermo)    76.8421

Check that the model works.

solution = tmodel.optimize()
solution

Optimal solution with objective value 0.374

	fluxes	reduced_costs
PFK	9.627754	None
PFL	0.000000	None
PGI	9.923283	None
PGK	-19.005185	None
PGL	0.000000	None
...	...	...
NADH16	30.034558	None
NADTRHD	0.000000	None
NH4t	2.040601	None
O2t	15.017279	None
PDH	16.118563	None

95 rows × 2 columns

Flux variability analysis¶

Calculate variability analysis on all continuous variables

Plain FVA¶

We use the FVA from pytfa for consistency here (which requires the model to have and logger property).

Since this model does not have splitted reactions (so that proteins are reactants in both directions), we don’t need to take additional measures to make the different runs comparable. See flux_variability_analysis() that takes that into account if that is not the case.

import logging

# avoid loading the screen with logs
logging.basicConfig(filename='docs03.log', level=logging.DEBUG)
ec_model.logger = logging.getLogger(__name__)
fva_fluxes = variability_analysis(ec_model.copy(), kind="reaction")

minimizing: 100%|██████████| 95/95 [00:00<00:00, 810.72it/s]
maximizing: 100%|██████████| 95/95 [00:00<00:00, 859.12it/s]

Thermo FVA¶

tva_fluxes = variability_analysis(tmodel, kind="reaction")

2021-05-21 17:30:45,685 - thermomodel_ - INFO - Beginning variability analysis for variable of type reaction
minimizing: 100%|██████████| 95/95 [00:27<00:00,  3.42it/s]
maximizing: 100%|██████████| 95/95 [00:22<00:00,  4.26it/s]

Thermo concentration proteomics FVA¶

We can apply both proteomics and thermodynamics (which may include metabolomics) constraints.

This will usually require a relaxation of the experimental constraints or the thermodynamic constraint, see the relaxation chapter for a detailed explanation.

raw_proteomics = pd.read_csv(DATA / "ecoli_proteomics_schmidt2016S5.tsv")
ec_model_constrained = from_copy_number(
    ec_model.copy(),
    index=raw_proteomics["uniprot"],
    cell_copies=raw_proteomics["copies_per_cell"],
    stdev=raw_proteomics["stdev"],
    vol=2.3,
    dens=1.105e-12,
    water=0.3,
)
tmodel_prot = adapt_gecko_to_thermo(
    ec_model_constrained, thermodb, compartment_data, solver="optlang-glpk"
)
# constrain the model objective so that the feashibility relaxation recovers growth
tmodel_prot.reactions.BIOMASS_Ecoli_core_w_GAM.lower_bound = solution.objective_value
iis, status = geckopy.integration.relax_thermo_proteins(
    tmodel_prot,
    prot_candidates=[prot.id for prot in tmodel_prot.proteins],
    objective_rule=geckopy.experimental.relaxation.Objective_rule.MIN_ELASTIC_SUM
)
print(f"IIS (status '{status}'): {iis}")

2021-05-21 17:31:36,086 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:31:36,090 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:31:36,138 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:31:36,165 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:31:36,867 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:31:36,868 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:31:36,871 - thermomodel_ - INFO - # cobra_model variables are up-to-date
adding thermo slacks: 100%|██████████| 73/73 [00:00<00:00, 308.11it/s]
adding protein slacks: 100%|██████████| 55/55 [00:00<00:00, 421.43it/s]

IIS (status 'optimal'): {'prot_P0AB71', 'prot_P00370', 'prot_P0A9P0', 'prot_P0A9N4', 'prot_P25516', 'prot_P21599', 'prot_P0A9C5', 'prot_P0A9B2', 'prot_P06999', 'prot_P37689', 'prot_P06959', 'prot_P0AFG8', 'prot_P0A6P9', 'prot_P33221', 'prot_P0A867'}

Since this relaxation is an in-place operation, we need to first rebuild the model.

In this case, all of the problematic constraints are proteins, so we will simply remove their concentrations for illustration purposes.

ec_model_constrained = from_copy_number(
    ec_model.copy(),
    index=raw_proteomics["uniprot"],
    cell_copies=raw_proteomics["copies_per_cell"],
    stdev=raw_proteomics["stdev"],
    vol=2.3,
    dens=1.105e-12,
    water=0.3,
)
tmodel_prot = adapt_gecko_to_thermo(
    ec_model_constrained, thermodb, compartment_data, solver="optlang-glpk"
)
for prot_id in iis:
    tmodel_prot.proteins.get_by_id(prot_id).add_concentration(None)

2021-05-21 17:31:49,759 - thermomodel_ - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2021-05-21 17:31:49,763 - thermomodel_ - INFO - # Model preparation starting...
2021-05-21 17:31:49,810 - thermomodel_ - INFO - # Model preparation done.
2021-05-21 17:31:49,838 - thermomodel_ - INFO - # Model conversion starting...
2021-05-21 17:31:50,545 - thermomodel_ - INFO - # Model conversion done.
2021-05-21 17:31:50,546 - thermomodel_ - INFO - # Updating cobra_model variables...
2021-05-21 17:31:50,548 - thermomodel_ - INFO - # cobra_model variables are up-to-date

tmodel_prot.slim_optimize()

0.3742298749331099

# Perform variability analysis again
tva_fluxes_prot = variability_analysis(tmodel_prot, kind='reactions')

2021-05-21 17:31:52,185 - thermomodel_ - INFO - Beginning variability analysis for variable of type reactions
minimizing: 100%|██████████| 95/95 [00:40<00:00,  2.35it/s]
maximizing: 100%|██████████| 95/95 [08:00<00:00,  5.06s/it]

Plotting¶

import plotly.express as px

dfs = (fva_fluxes, tva_fluxes, tva_fluxes_prot)

Tidy up the data to plot it with plotly express.

df_plot = pd.concat([abs(df.maximum - df.minimum).apply(lambda x: x if x < 2000 else 2000) for df in dfs], axis=1).melt(var_name="Variability method", value_name="Flux")
df_plot.loc[df_plot["Variability method"] == 0, "Variability method"] = "FBA"
df_plot.loc[df_plot["Variability method"] == 1, "Variability method"] = "Thermo"
df_plot.loc[df_plot["Variability method"] == 2, "Variability method"] = "Thermo + proteins"

fig = px.histogram(
    df_plot,
    x="Flux", color="Variability method",
    cumulative=True, nbins=200,
    color_discrete_sequence=["rgba(26, 200, 26, 0.8)", "rgba(200, 26, 80, 0.5)", "rgba(26, 26, 200, 0.5)"],
    marginal="violin", barmode="overlay",
)
fig.show(renderer="notebook_connected")