What is SBML?

Fruex → Fru
GLCex → Glc
ATP + Glc → ADP + HexP
ATP + Fru → ADP + HexP
2 HexP → Suc6P + UDP
Suc6P → phos + Suc
Fru + HexP → Suc + UDP
Suc → Fru + Glc
HexP → glycolysis
Suc → Sucvac
Can you predict what a set of reactions like this will do when you start the system with different initial quantities?
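One way to explore that question is to cast the list into a computable form and run it. The sketch below is only an illustration under loud assumptions: the source gives no rate laws or rate constants, so every reaction is assumed to follow mass-action kinetics with a hypothetical rate constant of 1.0, and the integration is a plain fixed-step Euler scheme. The function names are invented for this example, and arrows are written as "->".

```python
REACTIONS = """
Fruex -> Fru
GLCex -> Glc
ATP + Glc -> ADP + HexP
ATP + Fru -> ADP + HexP
2 HexP -> Suc6P + UDP
Suc6P -> phos + Suc
Fru + HexP -> Suc + UDP
Suc -> Fru + Glc
HexP -> glycolysis
Suc -> Sucvac
"""


def parse_side(side):
    """Turn 'ATP + Glc' or '2 HexP' into {species: coefficient}."""
    terms = {}
    for term in side.split("+"):
        parts = term.split()
        if len(parts) == 2:
            coeff, name = int(parts[0]), parts[1]
        else:
            coeff, name = 1, parts[0]
        terms[name] = terms.get(name, 0) + coeff
    return terms


def parse_reactions(text):
    """Return a list of (reactants, products) pairs, one per line."""
    reactions = []
    for line in text.strip().splitlines():
        left, right = line.split("->")
        reactions.append((parse_side(left), parse_side(right)))
    return reactions


def simulate(reactions, initial, k=1.0, dt=0.01, steps=1000):
    """Fixed-step Euler integration with ASSUMED mass-action kinetics.

    k is a hypothetical rate constant shared by all reactions; it is
    not taken from the source.
    """
    species = set()
    for reactants, products in reactions:
        species.update(reactants)
        species.update(products)
    state = {name: float(initial.get(name, 0.0)) for name in sorted(species)}

    for _ in range(steps):
        delta = dict.fromkeys(state, 0.0)
        for reactants, products in reactions:
            # Mass action: rate = k * product of reactant amounts^coefficient.
            rate = k
            for name, coeff in reactants.items():
                rate *= state[name] ** coeff
            for name, coeff in reactants.items():
                delta[name] -= coeff * rate * dt
            for name, coeff in products.items():
                delta[name] += coeff * rate * dt
        for name in state:
            state[name] += delta[name]
    return state


if __name__ == "__main__":
    reactions = parse_reactions(REACTIONS)
    # Two hypothetical starting conditions to compare.
    for start in ({"Fruex": 1.0, "ATP": 1.0}, {"GLCex": 2.0, "ATP": 0.5}):
        final = simulate(reactions, start)
        print(start, "->", {n: round(v, 3) for n, v in final.items() if v > 1e-6})
```

Changing the starting dictionary and rerunning shows how the outcome depends on initial quantities, though any numbers it prints reflect the assumed kinetics rather than the real system.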

The starting point is an appreciation that computational modeling of biological systems is no longer a fringe activity; it’s a requirement for making sense of our vast and ever-expanding quantities of data. This reality is acknowledged and reinforced by a marked increase over the past two decades in the number of journals, books, and articles with computational and systems biology emphases.

At its most basic, computational modeling is no different from modeling as it’s practiced by all scientists, whether in biology or elsewhere. The extra but crucial step is casting the model into a formal, computable form that can be analyzed rigorously using simulation and other mathematical methods.

Different representations of models are useful for different purposes. Graphical diagrams of biological processes are useful for visual presentation to humans, but at the level of software, a different format is needed for quantifying a model to the point where it can be simulated and analyzed. That’s where the Systems Biology Markup Language (SBML) comes in.

Simply put, SBML is a machine-readable format for representing models. It’s oriented towards describing systems where biological entities are involved in, and modified by, processes that occur over time. An example of this is a network of biochemical reactions. SBML’s framework is suitable for representing models commonly found in research on a number of topics, including cell signaling pathways, metabolic pathways, biochemical reactions, gene regulation, and many others.

SBML is for software

SBML does not represent an attempt to define a universal language for representing quantitative models. It would be impossible to achieve a one-size-fits-all universal language. A more realistic alternative is to acknowledge the diversity of approaches and methods being explored in systems biology, and seek a common intermediate format—a lingua franca—enabling communication of the most essential aspects of the models.

SBML is neutral with respect to programming languages and software encoding; however, it’s oriented towards allowing models to be encoded using XML. By supporting SBML as a format for reading and writing models, different software tools (including programs for building and editing models, simulation programs, databases, and other systems) can directly communicate and store the same computable representation of those models. This removes an impediment to sharing results and permits other researchers to start with an unambiguous representation of the model, examine it carefully, propose precise corrections and extensions, and apply new techniques and approaches—in short, to do better science.
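To give a feel for what an XML encoding of a model looks like to software, here is a minimal sketch that reads a small hand-written fragment using only Python's standard library. The fragment encodes the single reaction Suc → Fru + Glc from the list at the top of this page. The model id, the compartment id "cell", and the reaction id "R8" are hypothetical, and the fragment follows the Level 3 Version 1 core layout as best understood here; it has not been checked with an SBML validator. Real tools would normally rely on a dedicated library such as libSBML rather than generic XML parsing.

```python
import xml.etree.ElementTree as ET

# Hand-written illustrative fragment; ids are hypothetical.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="sucrose_fragment">
    <listOfCompartments>
      <compartment id="cell" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="Suc" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="Fru" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="Glc" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R8" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="Suc" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="Fru" stoichiometry="1" constant="true"/>
          <speciesReference species="Glc" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

# Prefix-to-namespace map used in the XPath-style queries below.
NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}


def read_species(xml_text):
    """Return the species ids in document order."""
    root = ET.fromstring(xml_text.strip())
    return [s.get("id") for s in root.findall(".//sbml:species", NS)]


def read_reactions(xml_text):
    """Return {reaction id: (reactant ids, product ids)}."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for reaction in root.findall(".//sbml:reaction", NS):
        reactants = [
            ref.get("species")
            for ref in reaction.findall("sbml:listOfReactants/sbml:speciesReference", NS)
        ]
        products = [
            ref.get("species")
            for ref in reaction.findall("sbml:listOfProducts/sbml:speciesReference", NS)
        ]
        result[reaction.get("id")] = (reactants, products)
    return result


if __name__ == "__main__":
    print(read_species(SBML_TEXT))
    print(read_reactions(SBML_TEXT))
```

Because every tool reads the same elements and attributes, a model written by one program can be interpreted by another without manual re-entry.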

SBML is for people too

SBML enables research teams to use a single model description throughout a project’s life cycle even when projects involve heterogeneous software tools. An ecosystem of SBML-compatible software tools today allows researchers to use SBML in all aspects of a modeling project, including creation (manual or automated), annotation, comparison, merging, parametrization, simulation/analysis, results comparison, network motif discovery, system identification, omics data integration, visualization, and more. Such use of a standardized format, along with standard annotation schemes and training in reproducible methods, improves research workflows and is generally recognized as promoting research reproducibility.

The adoption of SBML offers many benefits, including: (1) enabling the use of multiple tools without rewriting models for each tool, (2) enabling models to be shared and published in a form other researchers can use even in a different software environment, and (3) ensuring the survival of models (and the intellectual effort put into them) beyond the lifetime of the software used to create them.

What can you do with it?

If you’re a biologist interested in doing computational modeling, this may be all you need to know about SBML. Modern software packages hide the details of SBML and provide you with interfaces that help you focus on your modeling and analysis tasks.

If you’re a software developer or an advanced modeler, you probably want to learn a little more about SBML. The SBML specification documents are the place to start.

Evolution and growth of SBML

SBML development has not stopped; it’s an active area of work today. The SBML development process defines the community-oriented development approach. We welcome you to get involved!

The development of SBML is stratified in order to organize architectural changes and versioning. Major editions of SBML are termed Levels and represent substantial changes to the composition and structure of the language. Models defined in lower Levels of SBML can always be represented in higher Levels, though some translation may be necessary. The converse (from higher Level to lower Level) is sometimes also possible, though not guaranteed. The Levels remain distinct; a valid SBML Level 1 document is not a valid SBML Level 2 document. Minor revisions of SBML are termed Versions and constitute changes within a Level to correct, adjust, and refine language features. Finally, specification documents inevitably require minor editorial changes as their users discover errors and ambiguities. Such problems are corrected in new Releases of a given SBML specification.
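In practice, software identifies the Level and Version of a document from two attributes on its root element. The sketch below reads them with Python's standard library and compares two documents. The two one-line documents and the function names are made up for illustration, and Releases are not considered because they apply to the specification text rather than to model files.

```python
import xml.etree.ElementTree as ET

# Two tiny illustrative documents; the model ids are hypothetical.
L2V4_DOC = (
    '<sbml xmlns="http://www.sbml.org/sbml/level2/version4" '
    'level="2" version="4"><model id="m"/></sbml>'
)
L3V1_DOC = (
    '<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" '
    'level="3" version="1"><model id="m"/></sbml>'
)


def level_and_version(xml_text):
    """Return (level, version) read from the root <sbml> element."""
    root = ET.fromstring(xml_text.strip())
    return int(root.get("level")), int(root.get("version"))


def is_later_edition(source, target):
    """True if target uses a later Level, or the same Level and a later Version.

    Tuples compare element by element, so the Level dominates and the
    Version only breaks ties within a Level.
    """
    return level_and_version(source) < level_and_version(target)


if __name__ == "__main__":
    print(level_and_version(L2V4_DOC), level_and_version(L3V1_DOC))
    print(is_later_edition(L2V4_DOC, L3V1_DOC))
```

A check like this only says which direction a conversion would go; whether a particular higher-to-lower translation is possible depends on the features the model uses.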

The latest generation of SBML, which is Level 3, is modular in the sense of having a defined core set of features and optional packages adding features on top of the core. This modular approach means that models can declare which feature-sets they use, and likewise, software tools can declare which packages they support. It also means that the development of SBML Level 3 can proceed in a modular fashion. The development process for Level 3 is designed around this concept. SBML Level 3 package development is an ongoing activity today, with packages being created to extend SBML in many areas that its core functionality does not directly support. Examples include models whose species have structure and/or state variables, models with spatially nonhomogeneous compartments and spatially dependent processes, and models in which species and processes refer to qualitative entities and processes rather than quantitative ones.
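As a rough illustration of how a model can declare the packages it uses, the sketch below inspects the root element of a Level 3 document. It assumes that each package is announced by an XML namespace of the form shown in the example, together with a "required" attribute in that namespace. The example document, the SUPPORTED set, and the function names are hypothetical, and the URI pattern is an assumption to verify against the relevant package specification.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative document declaring two packages (qual and fbc).
DOC = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:qual="http://www.sbml.org/sbml/level3/version1/qual/version1"
      xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2"
      level="3" version="1"
      qual:required="true" fbc:required="false">
  <model id="m"/>
</sbml>
"""

# Assumed shape of a package namespace URI: .../<package>/version<N>
PACKAGE_URI = re.compile(
    r"^http://www\.sbml\.org/sbml/level3/version\d+/(\w+)/version(\d+)$"
)

# Hypothetical set of packages that some tool claims to support.
SUPPORTED = {"fbc"}


def declared_packages(xml_text):
    """Return {package: {"version": int, "required": bool}} from the root."""
    root = ET.fromstring(xml_text.strip())
    packages = {}
    for key, value in root.attrib.items():
        # ElementTree spells a namespaced attribute as "{uri}localname".
        if key.startswith("{") and key.endswith("}required"):
            uri = key[1 : -len("}required")]
            match = PACKAGE_URI.match(uri)
            if match:
                packages[match.group(1)] = {
                    "version": int(match.group(2)),
                    "required": value == "true",
                }
    return packages


def unsupported_required(xml_text, supported):
    """List packages marked required that the tool does not support."""
    return sorted(
        name
        for name, info in declared_packages(xml_text).items()
        if info["required"] and name not in supported
    )


if __name__ == "__main__":
    print(declared_packages(DOC))
    print(unsupported_required(DOC, SUPPORTED))
```

A tool could use a check of this kind to warn the user before attempting to interpret a model that depends on a package it cannot handle.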