## Abstract

The implementation of model-based designs in metabolic engineering and synthetic biology may fail. One of the reasons for this failure is that only a part of the real-world complexity is included in models. Still, some knowledge can be simplified and taken into account in the form of optimization constraints to improve the feasibility of model-based designs of metabolic pathways in organisms. Some constraints (mass balance, energy balance, and steady-state assumption) serve as a basis for many modelling approaches. Others (the total enzyme activity constraint and the homeostatic constraint) were proposed decades ago but are frequently ignored in design development. Several new approaches of cellular analysis have made possible the application of constraints like cell size, surface, and resource balance. Constraints for kinetic and stoichiometric models are grouped according to their applicability preconditions into (1) general constraints, (2) organism-level constraints, and (3) experiment-level constraints. General constraints are universal and applicable to any system. Organism-level constraints are applicable to biological systems and are usually organism-specific, but they can be applied without information about experimental conditions. To apply experiment-level constraints, peculiarities of the organism and the experimental set-up have to be taken into account to calculate the values of constraints. The limitations of applicability of particular constraints for kinetic and stoichiometric models are addressed.

## Introduction

Mathematical modelling is used in metabolic engineering to predict organism behaviour in response to implementation of the designed changes. Optimization of a model aims to suggest improvements of organisms for application in an industrial environment. A variety of optimization aims can be set when defining the objective function: (1) an increase in titre, rate, yield, or biomass; (2) an increase in substrate consumption in bioremediation tasks; and (3) more case-specific features that can be calculated from parameters or variables involved in the model. Each model of a biological system is a simplification of reality, taking into account only a part of real-life constraints. Therefore, model predictions used for design purposes in biotechnology may not always be accurate.

Some of the fundamental features (constraints) of nature are successfully used in modelling and serve as a basis for some modelling approaches. Different types of models use various constraints, thus contributing to better predictions. Some constraints are very specific to the process of interest, while others are more universal and can be applied to most models. In our review, we focus on the implementation of universal constraints in two popular and very different approaches of metabolism modelling: (1) kinetic (dynamic) modelling, where metabolite concentrations and reaction fluxes are modelled as a function of time [1], and (2) stoichiometric modelling, where the feasibility of steady states is assessed ignoring time and metabolite concentrations [2].

*Kinetic models* require more details. Hence, they usually cover up to some tens of metabolic reactions or transport processes within one or a few pathways [1]. Kinetic models contain information about reaction mechanisms like the Michaelis–Menten reaction, mass action, details of different types of inhibition and others, and their parameters like the catalytic constant (*k*_{cat}), maximal rate of reaction (*V*_{max}), Michaelis–Menten constant (*K*_{m}), and others. This type of model makes it possible to simulate quantitatively the changes of metabolite concentrations and flux values in time. Some examples of detailed pathway-scale metabolic models are models of glycolysis [3], the Entner–Doudoroff pathway [4], and the relatively larger model of central carbon metabolism [5].
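As a minimal sketch of what such a kinetic description looks like in practice, the snippet below simulates the depletion of a single substrate by one irreversible Michaelis–Menten reaction. The parameter values, the function names, and the choice of explicit Euler integration are assumptions made for illustration only; they are not taken from any of the cited models.

```python
def michaelis_menten_rate(substrate, v_max, k_m):
    """Irreversible Michaelis-Menten rate v = V_max * S / (K_m + S)."""
    return v_max * substrate / (k_m + substrate)


def simulate_substrate(s0, v_max, k_m, dt=0.01, steps=1000):
    """Explicit Euler integration of dS/dt = -v(S); returns the S time course."""
    course = [s0]
    s = s0
    for _ in range(steps):
        # Clamp at zero so that numerical error cannot produce a negative concentration.
        s = max(s - dt * michaelis_menten_rate(s, v_max, k_m), 0.0)
        course.append(s)
    return course


# Hypothetical parameter values, chosen only for illustration.
trajectory = simulate_substrate(s0=5.0, v_max=1.0, k_m=0.5)
```

Real pathway-scale models couple many such rate laws and are normally integrated with an adaptive ODE solver rather than a fixed-step scheme.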

*Stoichiometric models* require fewer details for individual reactions and can be applied at genome scale [2,6,7]. The stoichiometric approach can be used for analysis of feasible steady states (metabolite concentrations are not changing in time) and minimally needs information just about reaction stoichiometry. Optional parameters (constraints) like lower and upper bounds of fluxes, reaction directionality, and others can make the model predictions more accurate. A smaller amount of information per reaction enables the development of large models, but the disadvantage is that stoichiometric models cannot be used to simulate any changes in time and cannot calculate metabolite concentrations. Stoichiometric models have been applied at genome scale for many organisms including *Saccharomyces cerevisiae* [8], *Escherichia coli* [9], and human [10]. Stoichiometric models are also useful at a smaller scale, for instance, concentrating on central carbon metabolism [11,12].

Popular metabolic modelling constraints along with the ones that are just becoming practically applicable are classified according to specificity of their applicability (Figure 1): some constraints can be applied for all models, while others are valid just for a specific organism, or for the analysis of a particular organism under specific experimental conditions.

## General (universal) constraints

There are constraints that function in any system, not just biological ones. The **mass conservation** principle is applied to limit metabolic network behaviour in models [13] and as a basis for both kinetic and stoichiometric modelling approaches. The **energy balance** is another general constraint derived from the law of conservation of energy in an isolated system.

**Steady-state assumption** is an extremely important constraint in stoichiometric modelling [14]. The steady state of metabolism (concentrations of internal metabolites are constant while at least one flux has a non-zero value) is also frequently required in kinetic modelling and optimization when looking for biotechnologically applicable strain designs: organisms should be able to sustain the designed process and not just reach it for a moment in a transition phase. Another issue is the **stability of the steady state**, which can be set as a constraint: all eigenvalues of the Jacobian matrix must have negative real parts [15]. At the same time, the **duration of the transition process** to a steady state might be very long. If the transition time exceeds the lifespan of the organism, the feasibility of a particular steady state should be analyzed more carefully.
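The stability criterion can be illustrated with a small sketch. It assumes a hypothetical two-metabolite system whose Jacobian at the steady state is already known, and it uses the closed-form eigenvalues of a 2 × 2 matrix so that no numerical library is needed. The function names and the numerical values are illustrative assumptions. The last helper uses the fact that, for a linearized system, relaxation towards the steady state is governed by the eigenvalue whose real part is closest to zero.

```python
import cmath


def eigenvalues_2x2(jacobian):
    """Closed-form eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = jacobian
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def is_stable(jacobian):
    """Stable steady state: every eigenvalue has a negative real part."""
    return all(ev.real < 0.0 for ev in eigenvalues_2x2(jacobian))


def slowest_time_constant(jacobian):
    """Time constant 1/|Re(lambda)| of the slowest mode (stable systems only)."""
    slowest = max(ev.real for ev in eigenvalues_2x2(jacobian))
    return 1.0 / abs(slowest)


# Hypothetical linear chain: dx1/dt = v_in - k1*x1, dx2/dt = k1*x1 - k2*x2
k1, k2 = 2.0, 0.5
stable_jacobian = [[-k1, 0.0], [k1, -k2]]
unstable_jacobian = [[0.5, 0.0], [1.0, -1.0]]  # one positive eigenvalue
```

Comparing `slowest_time_constant` with the lifespan or cultivation time of the organism gives a first indication of whether a stable steady state can actually be reached.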

The mass conservation principle combined with the steady-state assumption is applied as the basis for the flux balance analysis approach [16]. This approach enables analysis of large systems with no need for the kinetic details of individual reactions.
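The two ingredients named above can be sketched for a toy network. At steady state the stoichiometric matrix S and the flux vector v must satisfy S·v = 0, and each flux must stay within its bounds. The snippet only checks whether a given flux distribution is feasible; it does not perform the linear optimization step of flux balance analysis. The network, bounds, and function names are hypothetical.

```python
def mass_balance_residuals(stoichiometry, fluxes):
    """Return S*v: the net production rate of every internal metabolite."""
    return [sum(coeff * v for coeff, v in zip(row, fluxes)) for row in stoichiometry]


def is_feasible_steady_state(stoichiometry, fluxes, bounds, tol=1e-9):
    """True if S*v = 0 (within tol) and every flux lies inside its bounds."""
    balanced = all(abs(r) <= tol for r in mass_balance_residuals(stoichiometry, fluxes))
    bounded = all(lo - tol <= v <= hi + tol for v, (lo, hi) in zip(fluxes, bounds))
    return balanced and bounded


# Hypothetical toy network: uptake -> A (v1), A -> B (v2), B -> secretion (v3).
# Rows are the internal metabolites A and B; columns are reactions v1..v3.
S = [[1, -1, 0],
     [0, 1, -1]]
flux_bounds = [(0.0, 10.0), (0.0, 10.0), (0.0, 10.0)]

balanced_fluxes = [2.0, 2.0, 2.0]    # every metabolite is produced and consumed equally
unbalanced_fluxes = [2.0, 1.0, 1.0]  # metabolite A accumulates
```

In a full flux balance analysis, a linear programming solver searches this feasible space for the flux vector that maximizes a chosen objective, such as biomass flux.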

In addition, the application of **thermodynamic constraints** limits the direction of reactions [17,18], thus reducing the solution space of a model.
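One common way to express such a thermodynamic constraint is through the Gibbs free energy of reaction, ΔG = ΔG°′ + RT ln Q, where Q is the reaction quotient: a reaction can proceed in the forward direction only if ΔG is negative. The sketch below assumes this standard relation; the function names and the numerical values are illustrative and not taken from the cited studies.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)


def reaction_gibbs_energy(dg0_prime, reaction_quotient, temperature=298.15):
    """Gibbs free energy of reaction, dG = dG0' + R*T*ln(Q), in kJ/mol."""
    return dg0_prime + GAS_CONSTANT * temperature * math.log(reaction_quotient)


def forward_direction_allowed(dg0_prime, reaction_quotient, temperature=298.15):
    """The forward direction is thermodynamically feasible only if dG < 0."""
    return reaction_gibbs_energy(dg0_prime, reaction_quotient, temperature) < 0.0


# Hypothetical reaction with dG0' = +5 kJ/mol: it runs forward only while the
# product-to-substrate ratio (Q) is kept low.
low_q_allowed = forward_direction_allowed(5.0, 0.01)    # products kept scarce
high_q_allowed = forward_direction_allowed(5.0, 100.0)  # products accumulated
```

In a stoichiometric model, the result of such a check would typically be encoded by setting the lower or upper flux bound of the reaction to zero.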

Mass conservation and energy balance constraints serve as the basis for kinetic and stoichiometric modelling while some other constraints have been derived from general rules of mass balance and energy balance. While the steady state is the enabling assumption for stoichiometric modelling, it finds frequent use also in kinetic modelling. In contrast to stoichiometric models, the steady state metabolite concentrations, the stability of steady state and duration of transition period can be assessed only for kinetic ones.

## Organism-level constraints

Organism-level constraints are based on properties that are unique to a specific organism while being consistent for all environmental or experimental conditions. The metabolic network of an organism (determined by DNA sequence) itself can be seen as an organism-level constraint. Organism-level constraints are mostly based on knowledge about physiological limitations and peculiarities of a particular organism or the assumption that the modified organism design is feasible if resources and/or parameters of the existing organism will not be exceeded. No information about experimental conditions is needed to determine values of constraints, as they are applicable for all experimental conditions.

To take into account limited enzyme-building resources, the **total enzyme activity constraint** [19,20] can be implemented by setting limits for the sum of enzyme concentrations without detailed experiment-specific analysis. This type of constraint is based on the assumption that the modified organism should be able to produce as much protein as the initial one (or a small fraction more). The total enzyme activity constraint is used in several kinetic model studies [15,21–24] and is also implemented in stoichiometric models [25,26].
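A minimal sketch of this constraint is shown below, assuming that enzyme levels are expressed in comparable units and that the limit is the summed level of the reference (initial) organism, optionally increased by a small allowed fraction. The enzyme names, values, and the function name are hypothetical.

```python
def satisfies_total_enzyme_constraint(reference, candidate, allowed_increase=0.0):
    """True if the summed enzyme level of the candidate design does not exceed
    the summed level of the reference organism times (1 + allowed_increase)."""
    limit = sum(reference.values()) * (1.0 + allowed_increase)
    return sum(candidate.values()) <= limit + 1e-12


# Hypothetical enzyme levels in arbitrary but consistent units.
reference_enzymes = {"E1": 1.0, "E2": 2.0, "E3": 1.0}
redistributed = {"E1": 2.5, "E2": 1.0, "E3": 0.5}   # same total, different allocation
overexpressed = {"E1": 5.0, "E2": 2.0, "E3": 1.0}   # total doubled
```

Under this constraint, an optimizer can still raise one enzyme level, but only by lowering others.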

The application of steady-state constraints in both kinetic and stoichiometric models enables the synergy between two model types. Most kinetic models are relatively small but include kinetic parameters and take into account metabolite concentrations. The steady-state fluxes found in kinetic models can be put into stoichiometric models as constraints to test the feasibility of the kinetic model steady-state flux distribution at genome scale where mass and energy balance can be taken into account at a larger scale [27]. The range of metabolite concentrations in kinetic models can be also used to calculate lower and upper constraints of reaction fluxes in stoichiometric models where concentrations cannot be directly applied.
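The last idea can be sketched for a single reaction. The example assumes an irreversible Michaelis–Menten rate law, which increases monotonically with substrate concentration, so the flux range follows directly from the ends of the concentration range. For other rate laws the extremes would have to be searched for. All names and values are hypothetical.

```python
def michaelis_menten_rate(substrate, v_max, k_m):
    """Irreversible Michaelis-Menten rate v = V_max * S / (K_m + S)."""
    return v_max * substrate / (k_m + substrate)


def flux_bounds_from_concentration_range(s_min, s_max, v_max, k_m):
    """Lower and upper flux bounds implied by a substrate concentration range.

    Valid because this rate law increases monotonically with S, so the
    extremes of the flux occur at the ends of the concentration range.
    """
    return (michaelis_menten_rate(s_min, v_max, k_m),
            michaelis_menten_rate(s_max, v_max, k_m))


# Hypothetical values: substrate allowed between 0.5 and 2.0 concentration units.
lower_bound, upper_bound = flux_bounds_from_concentration_range(
    s_min=0.5, s_max=2.0, v_max=10.0, k_m=0.5)
```

The resulting pair could then be used as the lower and upper flux bound of the corresponding reaction in a stoichiometric model.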

The ability of kinetic models to calculate metabolite concentrations enables the application of **metabolite concentration-related constraints**. In most cases, these are organism-specific and not related to particular experimental conditions. Each organism may have specific metabolites that are **cytotoxic** above some concentration, which can be used as an upper limit constraint for metabolite concentrations [28]. Similarly, there can be **unrealistic or unfeasible upper or lower levels of metabolite concentration** that are prohibited as new steady-state concentrations or even as concentrations during the transition process [29].

Another metabolite concentration-related constraint is the **homeostatic** one [30,31], used to limit the impact of large changes of internal metabolite concentrations in pathway-scale kinetic models on other reactions outside the model's scope via gene expression, reaction flux, reaction directionality, and other potential effects [15,28]. The homeostatic constraint limits optimized steady-state concentrations of metabolites to some range around the steady-state concentrations of the initial model. It can be applied to (1) a pool of internal metabolites [15,21,24,32], (2) each metabolite separately [23,29,33], or (3) a combination of both [22]. The homeostatic constraint can heavily reduce objective function values [23]. It might be useful to choose an acceptable concentration range for each metabolite separately, taking into account its degree of involvement in other cellular processes outside of the model's scope [23]. A variety of homeostatic constraints is used by Magnus et al. [15], applying metabolite concentration-dependent Gibbs-free energy calculations to assure feasibility of reactions.
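The per-metabolite and pool variants of the homeostatic constraint can be sketched as follows. The ±20% default, the metabolite names, and the concentrations are illustrative assumptions; in practice a separate range may be chosen for each metabolite.

```python
def within_homeostatic_range(reference, candidate, max_relative_change=0.2):
    """Per-metabolite variant: every concentration must stay within
    +/- max_relative_change of its reference steady-state value."""
    return all(
        abs(candidate[name] - reference[name]) <= max_relative_change * reference[name]
        for name in reference
    )


def pool_within_homeostatic_range(reference, candidate, max_relative_change=0.2):
    """Pool variant: only the summed concentration is restricted."""
    ref_pool = sum(reference.values())
    cand_pool = sum(candidate.values())
    return abs(cand_pool - ref_pool) <= max_relative_change * ref_pool


# Hypothetical steady-state concentrations in arbitrary units.
reference_state = {"M1": 1.0, "M2": 2.0, "M3": 4.0}
mild_design = {"M1": 1.1, "M2": 2.2, "M3": 3.6}      # every metabolite changes by 10%
shifted_design = {"M1": 1.5, "M2": 1.6, "M3": 3.9}   # M1 rises by 50%, pool unchanged
```

The second design shows why the two variants differ: a pool constraint accepts it, while a per-metabolite constraint rejects the large change in `M1`.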

A minimal set of adjustable parameters is used as a design constraint, assuming that it reduces the number of unpredictable potential side effects not included in the model [32–34]. It can be applied to both kinetic and stoichiometric models. In the case of kinetic models, a comparison of optimization results for different numbers of adjustable parameters may be required [35]. The optimization runs for many adjustable parameter combinations can also be done in an automated way [36].
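The search over parameter combinations can be sketched as an enumeration of subsets in order of increasing size, stopping at the smallest subset that reaches a chosen share of the full optimization potential. The `toy_objective` below is a hypothetical additive stand-in for a real optimization run of a kinetic model, and the parameter names, gains, and the 75% threshold are assumptions for illustration.

```python
from itertools import combinations


def smallest_sufficient_subset(parameters, objective, target):
    """Scan subsets from the smallest size upwards and return the best subset
    of the first size at which the objective reaches the target."""
    for size in range(1, len(parameters) + 1):
        candidates = [c for c in combinations(parameters, size) if objective(c) >= target]
        if candidates:
            return max(candidates, key=objective)
    return None


# Hypothetical gain (in percent of the full potential) contributed by each
# adjustable parameter; a real study would run an optimization per subset.
toy_gain = {"E1": 50, "E2": 30, "E3": 10, "E4": 5, "E5": 5}


def toy_objective(subset):
    return sum(toy_gain[name] for name in subset)


all_parameters = tuple(toy_gain)
full_potential = toy_objective(all_parameters)
best_small_set = smallest_sufficient_subset(
    all_parameters, toy_objective, 0.75 * full_potential)
subset_count = sum(1 for size in range(1, 6) for _ in combinations(all_parameters, size))
```

With five adjustable parameters there are 31 non-empty combinations, so exhaustive enumeration is cheap; the number grows exponentially with the number of parameters, which is why automation matters.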

### Application example of organism-level constraints

The impact of the total enzyme activity constraint and the homeostatic constraint when looking for a minimal set of adjustable parameters (total optimization potential (TOP) approach) is demonstrated in detail in [23] (Figure 2), where a model is optimized to increase sucrose accumulation in sugarcane culm [37]. The aim is to maximize the objective function: the proportion of sucrose accumulation in the vacuole relative to sucrose hydrolysis by invertase [23]. The best objective function value for a combination of all five adjustable parameters without constraints was 2.6 × 10^{6} (Figure 2A), requiring an unrealistic 1500-fold increase in glucose concentration and a 5-fold increase in the enzyme concentrations that are used as adjustable parameters. Introduction of the total enzyme activity constraint did not allow an increase in the total concentration of enzymes and reduced the objective function value about 16-fold to 0.16 × 10^{6} (Figure 2B), still relying on an unrealistic 118-fold increase in fructose concentration. Introducing the homeostatic constraint, allowing changes of metabolite concentrations by just ±20%, reduced the objective function to 4.7 (Figure 2C), a dramatic decrease compared with the previous cases. Despite that, it still represents a 34% increase over the objective function value of the original model. Implementation of both the total enzyme activity and homeostatic constraints (Figure 2D) did not further reduce the objective function value of the full set of five adjustable parameters (concentrations of five enzymes) [23]. This illustrates how highly promising but unrealistic designs become less promising in terms of the objective function value but more biologically viable when additional constraints are implemented.

The same study also demonstrated application of the TOP approach, revealing that homeostatic and total enzyme activity constraints heavily influence the rank of the best adjustable parameter combinations when a minimal set of adjustable parameters is sought (Figure 2). When the homeostatic constraint is applied, the objective function increase is smaller, but almost all optimization potential can be reached by manipulating the values of just two adjustable parameters (Figure 2C and 2D).
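The size of these reductions can be checked with simple arithmetic on the objective function values quoted above. The variable names are illustrative, and the original-model value is only back-calculated here from the stated 34% increase; it is not a reported number.

```python
# Objective function values quoted in the text for the sugarcane example [23].
unconstrained = 2.6e6
with_total_enzyme_constraint = 0.16e6
with_homeostatic_constraint = 4.7

# Fold reduction caused by the total enzyme activity constraint.
fold_reduction_enzyme = unconstrained / with_total_enzyme_constraint

# Fold difference between the enzyme-constrained and homeostatic cases.
fold_reduction_homeostatic = with_total_enzyme_constraint / with_homeostatic_constraint

# Original-model value implied by "a 34% increase" (back-calculated, not reported).
implied_original_value = with_homeostatic_constraint / 1.34

print(f"total enzyme constraint: {fold_reduction_enzyme:.2f}-fold lower")
print(f"homeostatic constraint: {fold_reduction_homeostatic:.0f}-fold lower again")
print(f"implied original objective value: {implied_original_value:.2f}")
```

The homeostatic constraint therefore removes roughly four orders of magnitude from the objective function value, yet the constrained design remains better than the original model.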

Metabolite concentration-related constraints can be applied directly only in kinetic models, while the total enzyme activity constraint can also be implemented in stoichiometric models. Although the organism-level constraints described above were proposed and applied several decades ago, they are still frequently ignored. The calculation of constraints for stoichiometric models using kinetic models has not been applied practically, as far as the authors are aware.

## Experiment-level constraints

Experiment-level constraints are environmental condition-dependent constraints, applied to force the model to perform the way a particular organism does in these specific conditions. Therefore, these constraints have to be assessed specifically for each experimental condition. Usually, the determination of experiment-level constraints requires more complicated calculations than that of general or organism-level constraints.

**Biomass composition** (often defined as biomass flux) can be used in stoichiometric models to assess the feasibility of the growth as cells have to be able to produce all the necessary components required for the biomass synthesis from the available substrates. The organism is not viable if any of the biomass components cannot be produced [38]. This feature is applied also for testing gene essentiality.

Commonly, the stoichiometry of biomass flux is treated as a constant, making the biomass composition the same regardless of the conditions. This simplification ignores facts like: (1) faster growing cells are bigger [39], hence the ratio of the cytosolic mass to the total mass of the cell increases; (2) the RNA-to-protein ratio increases in faster growing cells as more ribosomes are needed to produce all the necessary enzymes; and (3) the expression profile of enzymes changes as different enzymes are needed in various conditions.

The subject of changing biomass composition has received relatively little attention from the constraint-based modelling community. It has been shown that precise biomass composition is a very important parameter when reproducing experimental data, as the use of incorrect biomass composition increased the average error from 17% up to 80% [40]. In contrast, it has also been shown that overall biomass yield is not overly sensitive to changes in the biosynthetic need of any single biomass precursor or cofactor [41]. Biomass yield was most sensitive to changes in the requirements for ATP and NADH. For example, if the need for these components increases by 10%, the biomass yield drops by only 2.0% and 1.7%, respectively.

The **cellular resources** (mass, energy, surface, volume, and others) are limited. They depend on the experimental environment and available nutrients. Models incorporating cellular features and resources are needed to determine constraints applicable in different ways both in kinetic and stoichiometric models.

The constraint-based modelling method known as **Resource Balance Analysis (RBA)** [42] links metabolic fluxes to necessary cellular resources required to carry these fluxes. Cellular resources according to RBA are split between translational apparatus, metabolic enzymes, transporters, and house-keeping proteins. It has also been shown that RBA calibrated with the genome-wide absolute quantitative proteome data can predict resource allocation in a *Bacillus subtilis* model for a wide range of growth conditions [43]. Other possibilities of RBA-based methods are reviewed by Goelzer and Fromion [44].

A further step is to constrain the cell mass, membrane area, and doubling time [45]. Enzymatic resources are limited, as (1) the enzymatic apparatus including the ribosomes has to fit inside the cytosolic mass; (2) all macromolecular components of the cell have to be synthesized within one cell cycle; and (3) active transport is limited by the area of the cell membrane, as the amount of enzymatic components in the membrane must be limited. These additional constraints improve the feasibility of model-based predictions and have recently been combined into one model [45]. The cell cycle theory of Cooper–Helmstetter [46] was used to take into account DNA replication as well as cell division times to calculate cell mass and volume, surface area, DNA amount, and cell content. These kinds of calculations can also lead to better predictions in the case of radical changes in the proteome, creating improved chassis organisms and lean proteome strains [47].

Constraints related to specific biomass compositions and cellular resources are complicated to calculate. They can be determined by building separate resource- and geometry-related models. In spite of the demanding calculations, the potential increase in design feasibility may be advantageous.

## Conclusion

There are many opportunities to improve model-based predictions using various types of constraints — from the universal constraints that are applicable independent of the organism to the constraints that are specific for a particular organism and environmental conditions.

While more general rules of nature, like mass and energy balance, serve as a basis for modelling approaches, some organism- and experiment-level constraints may be demanding in the calculations for practical implementation. Recent advances in the modelling of cellular resources along with application of well-established but frequently ignored constraints can contribute to better feasibility of model-based designs.

At the same time, there are well-known facts about particular organisms (cytotoxicity and unfeasible values of parameters) that can be applied and may contribute to the feasibility of designs.

## Author Contribution

E.S. made the structure of the article. E.S. and V.K. contributed to the analysis of constraints for kinetic modelling. A.S., K.P., and A.P. contributed to the analysis of constraints for stoichiometric modelling. All the authors contributed equally to the classification of constraints.

## Funding

This research is partly funded by ERASysAPP — ERA-Net for Systems Biology Applications project ‘Systems biology platform for the creation of lean-proteome *Escherichia coli* strains’ (LEANPROT).

## Competing Interests

The Authors declare that there are no competing interests associated with the manuscript.

**Abbreviations:** RBA, resource balance analysis; TOP, total optimization potential

© 2018 The Author(s)

This is an open access article published by Portland Press Limited on behalf of the Biochemical Society and distributed under the Creative Commons Attribution License 4.0 (CC BY).