FAQ

Date of last content update: 2010-7-15

This Frequently Asked Questions (FAQ) document answers some frequent questions about the Systems Biology Markup Language (SBML). This is a non-normative document that does not define any aspect of SBML; rather, it is intended to provide additional information in an easily accessible and readable form.


FAQ overview

What is this FAQ?

This Frequently Asked Questions (FAQ) document is an attempt to answer common questions on the concepts, structure, and other matters of the Systems Biology Markup Language (SBML). It is in addition to the official SBML specifications. This is not a normative document—it does not define anything about SBML.

Who maintains this FAQ?

The SBML Editors do nearly all of the writing and maintaining of this FAQ.

Can I help?

Ohmygawd, yes. Please use the issue tracker (preferred) or email the editors directly (if you really don't like the tracker) if you have any suggestions. This includes:

  • corrections,
  • questions that you would like to see answered,
  • questions that you know the answer to and would like to see included,
  • suggestions for how to improve the FAQ,
  • praise.

We really do want your input.

General questions about SBML

What is SBML?

The short answer is this: the Systems Biology Markup Language (SBML) is a machine-readable exchange format for computational models of biological processes. Its strength is in representing phenomena at the scale of biochemical reactions, but it is not limited to that. By supporting SBML as an input and output format, different software tools can operate on the same representation of a model, removing chances for errors in translation and assuring a common starting point for analyses and simulations.

A slightly longer (but still relatively short) answer can be found in the separate Basic Introduction to SBML.

Is SBML free?

Yes. Totally.

Is it open?

Yes, in the sense that there are no restrictions on its use, anyone may contribute proposals, anyone may participate in discussions, there's no secret handshake to get into the meetings, etc.

What does SBML look like?

Ugly. Don't look at it, unless you're developing software, in which case, you have to look at it, and we feel for you. SBML is really not meant to be edited by hand or exposed to users.

But I really want to see an example of SBML

You're persistent, aren't you? Alright, then, have a look at the More Detailed Summary of SBML. Don't blame us if it hurts your eyes, though. We warned you, ok?

How can I find out about the model in an SBML file? It's too hard to read the XML

That's what we keep saying. As a first step, you can use the SBML2LaTeX web service to generate a report summarizing the content of an SBML file. SBML2LaTeX generates output in PDF, TeX and other formats, and provides a detailed, human-readable summary of every part of an SBML model (including the system of equations implied by the model). This nice system allows you to understand what an SBML model is about without having to look at the actual XML content. It's a good debugging tool, too. There is a link to that system from the sbml.org facilities page. SBML2LaTeX was developed by Andreas Dräger, Hannes Planatscher, Dieudonné M. Wouamba and Adrian Schröder, and the web service is kindly provided by the University of Tübingen, Germany, as a service to the SBML community.

What kind of models can you represent in SBML?

This question is difficult to answer directly. One way to get a sense for what can be represented in SBML is to look at the kinds of models that have been represented in SBML. A good starting place for that is BioModels Database.

However, a lot depends on how a modeler chooses to express a model. A common abstraction used when describing cellular phenomena is to describe the system as a set of chemical entities linked by processes (reactions) that can transform one entity into another or transport entities between compartments. A compartment in SBML is a location having a defined size or extent (which may be in terms of volume, area, length, or a point). Every chemical species in an SBML model must be located in a compartment. It is worth noting that compartments do not have to map one-to-one to biological structures; compartments can be conceptual too. But SBML is by no means limited to encoding biochemical reactions. One can encode any mathematical rule linking quantitative characteristics of the biological system, including, but not limited to, electrical behaviour, growth, etc. SBML can also describe discrete events that are triggered by state changes in the modeled system.
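To make the compartment/species/reaction abstraction concrete, here is a minimal sketch in Python using only the standard library. The model itself (identifiers such as toy_transport, glc_out and uptake) is hypothetical and deliberately incomplete (no kinetic law, no units); it only illustrates how species are located in compartments and linked by a reaction. In practice you would normally use a library such as libSBML rather than raw XML parsing.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 2 Version 4 documents.
SBML_NS = "http://www.sbml.org/sbml/level2/version4"

# Hypothetical toy model: glucose is transported from the medium into a cell.
MODEL = f"""<sbml xmlns="{SBML_NS}" level="2" version="4">
  <model id="toy_transport">
    <listOfCompartments>
      <compartment id="medium" size="1"/>
      <compartment id="cell" size="0.1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glc_out" compartment="medium" initialConcentration="5"/>
      <species id="glc_in" compartment="cell" initialConcentration="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="uptake" reversible="false">
        <listOfReactants><speciesReference species="glc_out"/></listOfReactants>
        <listOfProducts><speciesReference species="glc_in"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize(sbml_text):
    """Return ({species id: compartment id}, {reaction id: (reactants, products)})."""
    root = ET.fromstring(sbml_text)
    ns = {"s": SBML_NS}
    species = {
        sp.get("id"): sp.get("compartment")
        for sp in root.iterfind(".//s:species", ns)
    }
    reactions = {}
    for rx in root.iterfind(".//s:reaction", ns):
        reactants = [r.get("species")
                     for r in rx.iterfind("s:listOfReactants/s:speciesReference", ns)]
        products = [p.get("species")
                    for p in rx.iterfind("s:listOfProducts/s:speciesReference", ns)]
        reactions[rx.get("id")] = (reactants, products)
    return species, reactions


species, reactions = summarize(MODEL)
print(species)
print(reactions)
```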

For developers implementing SBML support, or modelers working directly with SBML, it is worth noting that while SBML's data structures are things called "species", "reaction", etc., and people often talk about SBML in those terms (even the specifications historically have done that), SBML is not limited to biochemical species and biochemical reactions. Ed Frank put it nicely in a discussion in 2005:

Most software systems do not have software entities that are one-to-one with the problem domain. Software doesn't work that way. It's almost like the problem domain and software domain are fourier transforms of each other. The problem domain has a bunch of things to be worked on or solved. The software domain has objects and object interrelations that focus on encapsulation, robustness, and extensibility. Not the same! Often systems start off with software entities that look a great deal like the problem domain, e.g., species, modifiers, etc. but in time you discover really they are a bit different and migration-cost pressures keep you from renaming them.

Does SBML have units?

Absolutely, yes. Every quantity in an SBML model can have units of measurement associated with it.

How is SBML different from BioPAX?

While BioPAX is meant to facilitate the exchange of biological pathways, SBML has been designed to facilitate exchange and reuse of quantitative models, not necessarily limited to biochemical events. SBML models contain information about sizes, amounts and kinetics that cannot be expressed with BioPAX. Conversely, because BioPAX is an ontology, one can define much more precisely the identity of the objects considered, whether physical entities or biochemical events. In SBML, this information may be encoded using annotations with terms from the Systems Biology Ontology. Although SBML and BioPAX do not fulfill the same purpose, it is nevertheless possible to convert one into the other. Examples of tools providing this service are BiNoM and BioModels Database.

How is SBML different from CellML?

CellML is another format for encoding quantitative models and, like SBML, it is based on XML. CellML is being developed by the Bioengineering Institute at the University of Auckland and collaborating groups. The chief differences between CellML and SBML can perhaps be described as follows. While a model encoded in SBML is based on successive, hierarchical declarations of model constituents, a CellML model is built as a network of components. A component can contain variables, mathematical expressions, metadata, etc. In CellML, the biological information is entirely stored in metadata rather than in the language elements. In SBML, the language elements were more directly influenced by present-day biochemical network simulation software, and the mathematical expressions are more constrained than what is permitted in CellML's subset of MathML.

Although SBML and CellML cannot be fully interconverted at the moment, conversion is nevertheless sometimes possible. Examples of tools providing this service can be found at http://www.ebi.ac.uk/compneur-srv/sbml/converters/.

Is SBML just an XML format?

Yes and no. The primary encoding of SBML is indeed XML, a popular text-based language for expressing structured data in a generic fashion. However, a design goal of SBML has always been to define it in terms of a language-independent formalism (specifically, using UML) and then map that to XML, so that mappings to other formats may be easier.

Isn't SBML too complicated to write?

Don't write SBML by hand. Instead, use software tools that provide higher-level interfaces to reading, writing, and manipulating SBML. Some provide graphical user interfaces, while others provide textual interfaces where you can write models in terms of chemical reactions. Take a look at our SBML Software Guide for help finding a tool that may be suitable for your needs.

Where is SBML defined?

The Systems Biology Markup Language is formally defined in the specification documents.

Where can I find some already-written, working SBML models?

BioModels Database provides a database of hundreds of published models in SBML format. The models in the database have been checked by humans to correspond to the publication and have been annotated with links to other data resources to make searching easier.

I've developed a model and I'd like to make it public; where can I submit it?

You can submit published models to BioModels Database. Please consult the FAQ for BioModels Database and then the submission page for BioModels Database.

What are the SBML Levels?

Levels in SBML are a way of managing complexity in the continued evolution and enhancement of the language. SBML is being developed in a series of levels, where each level adds new features and fixes problems with the previous level. The lowest-numbered levels provide fundamental features that are common to all biochemical network models. Higher-numbered levels add more features that are specific to particular classes of tools. Any level can be used as a standard for interchanging models.

What are the differences between Levels 1 and 2?

The changes in SBML Level 2 include: replacing SBML Level 1's text-string based format for mathematical expressions with a subset of MathML, introducing support for metadata, introducing support for named function definitions, introducing explicit modifier species such as catalysts in reactions, and introducing new constructs for discrete events and time delays. In Version 2 of Level 2, additional major changes include new constructs for types of species, types of compartments, initial assignments, constraints, and a standard approach for annotating model components with cross-references to terms from ontologies and controlled vocabularies. In Version 3 of Level 2, a number of small but important corrections were introduced, the consistency of the unit system was improved, and the UML notation in the specification document was much improved in clarity. In Version 4 of Level 2, the requirement for unit consistency was removed to comply with a community vote held in 2007, and in addition, a few other restrictions were removed on component ordering in a model, and finally, a number of small corrections and changes were made to the RDF and SBO aspects.

What are the SBML Levels/Versions/Releases/Revisions about?

As mentioned above, the Levels of SBML represent different stratifications of functionality and complexity in the SBML language. Major architectural changes are only made from level to level.

Real-world experiences with a language definition often lead to new realizations and the identification of problems. In SBML, we adopted a scheme of Versions within levels. Continued refinements and corrections to an SBML Level take place by issuing new Versions. This is why there is an SBML Level 2 Version 1, an SBML Level 2 Version 2, etc.

Within versions, we needed a scheme for handling editorial corrections that do not affect the intended syntax and semantics of an SBML Level+Version specification. This was initiated in SBML Level 2 Version 2 with the introduction of "Revisions". Unfortunately, the term "Revision" caused too much confusion, so we changed the terminology to use the term "Releases" in SBML Level 2 Version 3. The result is that specifications of SBML are now given in terms of a Level, Version, and Release.

SBML Levels are intended to coexist. For example, SBML Level 2 does not render Level 1 obsolete, and Level 1-compatible models and software tools still continue to be used. However, the matter is different within Versions. Changes between Versions within a Level represent important improvements (and in some cases, critical corrections). Consequently, we strongly encourage software developers and modelers to update software and models to conform to the latest Version within an SBML Level.
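As a rough illustration of how software can tell which Level and Version it is dealing with, the sketch below reads the level and version attributes from the root sbml element using Python's standard library. The function name is our own, and the one-line document is a hypothetical stub rather than a complete model.

```python
import xml.etree.ElementTree as ET


def sbml_level_and_version(sbml_text):
    """Return (level, version) as integers, read from the root <sbml> element."""
    root = ET.fromstring(sbml_text)
    # ElementTree reports namespaced tags as "{namespace}localname".
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "sbml":
        raise ValueError("root element is not <sbml>")
    level, version = root.get("level"), root.get("version")
    if level is None or version is None:
        raise ValueError("<sbml> element lacks a level or version attribute")
    return int(level), int(version)


# Hypothetical stub document declaring SBML Level 2 Version 4.
doc = ('<sbml xmlns="http://www.sbml.org/sbml/level2/version4" '
       'level="2" version="4"><model id="m"/></sbml>')

level, version = sbml_level_and_version(doc)
print(f"SBML Level {level} Version {version}")
```

A tool could use the returned pair to decide whether it can read the file directly or needs to convert it first.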

Why is Level 1 still being kept around if Level 2 exists?

There exist tools that either were developed before the creation of SBML Level 2 or for which Level 1 is more appropriate. SBML Level 1 therefore continues to have relevance even with the existence of Level 2.

Note that since all Level 1 models can be translated to SBML Level 2, tools that read SBML Level 2 can be made to support Level 1 reasonably easily. Moreover, the availability of libSBML makes it much easier for application developers to support different SBML levels in software applications. Among other features, libSBML has a built-in Level 1 to Level 2 translation facility.

What is SBML Level 3?

SBML Level 3 is the next step up in capability from SBML Level 2. It is being designed as a modular language, with a defined core set of features (to be based largely on SBML Level 2) and topic-specific packages layered on top of the core. This modular approach means that models can declare which feature-sets they use, and likewise, software tools can inform users which packages they support.

The Release 1 candidate of the SBML Level 3 Version 1 Core specification was issued on 31 December, 2009. The community wiki on this web site describes the ongoing SBML Level 3 work on packages.

Why has SBML Level 3 taken so long to define?

SBML Level 3 actually has a long history. It was originally intended to be SBML Level 2, and was discussed as such at workshops through 2002. At the 5th SBML workshop, a consensus emerged on the following points: (1) the planned changes between SBML Level 1 and the then-called Level 2 were too great, (2) a smaller step should be taken to introduce the use of MathML and other changes in SBML, and (3) the then-planned Level 2 features should be pushed to SBML Level 3.

Originally, the Caltech ERATO Team led the organization of SBML and wrote the final SBML specifications, but they were also involved in developing the Systems Biology Workbench (and indeed, this was their primary objective at the time), which meant that their time was not spent entirely on SBML work. In the late 2002 to 2003 time period, there were (in short order) personnel changes in the team, a need to write grant proposals to support further SBML work, and an explosion in interest in SBML (which led to time spent on more discussions and an increase in the ambitions for Level 3). In the 2003–2004 time frame, the software developer community discovered that the original formulation of SBML Level 2 Version 1 not only required substantial effort to support, but had various limitations and problems, and they requested a slow-down in the introduction of more SBML changes. In response to this, SBML development shifted to SBML Level 2 Version 2 and some associated projects, notably MIRIAM and SBO. It was not until 2006–2007 that several software systems with nearly-complete SBML Level 2 support were introduced by the SBML community, and a sense emerged that it was time to restart Level 3 efforts.

Are there tutorials about SBML?

The SBML Team occasionally puts on tutorials at conferences such as the International Conference on Systems Biology (ICSB), as well as topic-specific tutorials at SBML workshops such as the SBML Hackathons. Please check the Events page and the News page on SBML.org for information about possible upcoming events. Slides and other materials are available online on the SBML.org website.

What is the MIME type for SBML?

The MIME media type for SBML is application/sbml+xml and it is defined by RFC 3823. The goal of defining a MIME type for SBML is to enable applications to recognize files and data streams as being in SBML format by virtue of being tagged with the SBML MIME type.
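As a small sketch of how an application might use the MIME type, the Python snippet below registers it with the standard mimetypes module and builds an HTTP-style header. The ".sbml" file extension and the file name are assumptions made for illustration; many SBML files in the wild simply use ".xml".

```python
import mimetypes

SBML_MIME = "application/sbml+xml"

# Assumption for illustration: files ending in ".sbml" contain SBML.
mimetypes.add_type(SBML_MIME, ".sbml")

# Hypothetical file name; guess_type only looks at the extension.
guessed_type, _encoding = mimetypes.guess_type("my_model.sbml")
print(guessed_type)

# A server or client could tag an SBML payload like this.
headers = {"Content-Type": SBML_MIME}
print(headers)
```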

Is there an official logo for SBML?

Yes! Please see the SBML Logos and Policies for Their Use page. These logos are ideal for putting in web pages, software documentation, and presentations to show your and your software's support of SBML.

Why does the logo resemble the SBGN and SBO logos? Are the projects linked?

Back in the earliest days of SBML and SBW, Hiroaki Kitano anticipated a series of standards for systems biology. He outsourced the creation of an "SB" logo to a designer at the Sony Corporation Design Center in Tokyo, with the intention of using the invariant "SB" part for a variety of efforts. Initially, SBML and the SBI webpages used this logo, and when SBGN and SBO were started, these also ended up using the same logo beginning.

The different projects have some coordination (mainly by virtue of involving a lot of the same people), but there's no tight coupling or official mandate. SBO and SBGN are meant to be useful to everyone, not only the SBML community.

What papers should I cite if I use SBML?

The single best paper to cite at this time is the 2003 paper in Bioinformatics, even though it describes only Level 1 and not the latest Levels/Versions of SBML:

Hucka, M., Finney, A., Sauro, H. M., Bolouri, H., Doyle, J. C., Kitano, H., Arkin, A. P., Bornstein, B. J., Bray, D., Cornish-Bowden, A., Cuellar, A. A., Dronov, S., Gilles, E. D., Ginkel, M., Gor, V., Goryanin, I. I., Hedley, W. J., Hodgman, T. C., Hofmeyr, J.-H., Hunter, P. J., Juty, N. S., Kasberger, J. L., Kremling, A., Kummer, U., Le Novère, N., Loew, L. M., Lucio, D., Mendes, P., Minch, E., Mjolsness, E. D., Nakayama, Y., Nelson, M. R., Nielsen, P. F., Sakurada, T., Schaff, J. C., Shapiro, B. E., Shimizu, T. S., Spence, H. D., Stelling, J., Takahashi, K., Tomita, M., Wagner, J., Wang, J. (2003). The Systems Biology Markup Language (SBML): A medium for representation and exchange of biochemical network models. Bioinformatics, vol. 19, no. 4, pp. 524–531.

In addition, it would be appropriate to cite the following short paper, which discusses Level 2:

Finney, A., and Hucka, M. (2003). Systems Biology Markup Language: Level 2 and Beyond. Biochemical Society Transactions, vol. 31, part 6.

There are other papers that would be appropriate to cite in other contexts, such as the use of libSBML or the SBML Layout proposal. The relevant pages on this site or in other people's project sites provide information about the publications that describe the work. (These days, there are too many to keep a reliable up-to-date list here.)

Questions about software support of SBML

Is there a list of software packages that support SBML?

The list of tools supporting SBML has grown to over 180 at the time of this writing. It is no longer feasible to maintain a list in this FAQ. Instead, we refer readers to the SBML Software Guide, which provides links to known software packages supporting SBML. The guide provides both a compact feature matrix as well as a longer annotated overview.

Where can I find certified SBML software?

There is no certification process for software today. As of this writing (5 May 2008), the SBML Team is hard at work on a comprehensive SBML Test Suite. This Suite will make it possible to test SBML support objectively and will help assess the degree of SBML support in different software packages.

However, it is unlikely there will ever be a full "certification" mechanism for SBML. The development and support of SBML is funded primarily by government grants, and we simply do not have the resources it would take to run a true certification process of the sort common in industry.

Are software libraries available for programming with SBML?

Yes. The SBML Software Guide includes information about known libraries for programming SBML support. The SBML Team itself has developed 3 free and open-source packages that can be used to support SBML in different environments:

  • libSBML is a portable, embeddable API library providing language interfaces for C, C++, Java, Lisp, MATLAB, Octave, Perl, Python, and Ruby. It runs on Linux, MacOS and Windows. There is a recent paper describing libSBML too.

Which "Level" of SBML should I use in my software?

We recommend supporting the highest SBML Level that your software can support, because higher levels tend to fix design problems in lower levels of SBML. However, if your software cannot support some of the features of a higher Level of SBML, a lower SBML Level may be more suitable. Note that within Levels of SBML, you should always support the highest Version of the specification for that Level.

What if I can't encode some feature that my software has?

You can try storing the data in SBML's <annotation> elements. These are described in some detail in the SBML Level 1 and Level 2 specification documents. The <annotation> elements can be enclosed within any SBML element and can contain elements of any namespace. Note that data stored in annotations should not contain data that could be or is encoded already in SBML.

How should I structure annotations?

The annotation data enclosed in a specific SBML element is assumed by other applications to be directly associated with that specific element. Therefore, it is important to decompose and locate annotation data appropriately in an SBML document. Avoid encoding all your annotations in a single top-level attribute. The data associated with, for example, an individual species in a model should be encoded in the <annotation> element enclosed within the SBML <species> element representing that species in the SBML file.
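The following Python sketch shows one way to attach tool-specific data to the element it describes, here a single species. The namespace http://example.org/mytool, the mytool prefix, and the glyph element with its x/y attributes are all hypothetical; a real tool would use its own namespace and vocabulary.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version4"
TOOL_NS = "http://example.org/mytool"  # hypothetical tool-specific namespace

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("", SBML_NS)
ET.register_namespace("mytool", TOOL_NS)

# The species that the annotation is about.
species = ET.Element(f"{{{SBML_NS}}}species",
                     {"id": "glc_in", "compartment": "cell"})

# The annotation lives inside that species, not in one model-wide block.
annotation = ET.SubElement(species, f"{{{SBML_NS}}}annotation")
ET.SubElement(annotation, f"{{{TOOL_NS}}}glyph", {"x": "120", "y": "45"})

xml_text = ET.tostring(species, encoding="unicode")
print(xml_text)
```

Because the data sits inside the species element, another application that reorders or removes species can keep (or drop) the associated annotation along with it.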

The SBML Level 2 specification is the most complete source of information about the syntax of annotations in SBML, but it is a complex topic, and certain questions are not addressed directly in the specifications. We therefore provide a separate page with further explanations about annotations in SBML Level 2.

How should I include database identifiers such as ChEBI identifiers?

Annotations involving database identifiers can be created using the scheme described in Section 6 of the SBML Level 2 Version 4 specification. The approach involves using RDF annotations and specific BioModel elements and qualifiers detailed in the SBML specification. You can find examples of models using this approach in BioModels Database.
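For orientation only, here is a hedged Python sketch of the general shape of such an RDF annotation: an rdf:Description that points back at the annotated element through its metaid, a biology qualifier, and an rdf:Bag of resource URIs. The helper name, the metaid value, and the ChEBI URI shown are illustrative assumptions; consult the specification for the exact required structure and the accepted URI forms.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version4"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL_NS = "http://biomodels.net/biology-qualifiers/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("bqbiol", BQBIOL_NS)


def rdf_annotation(metaid, resource_uri, qualifier="is"):
    """Build an <annotation> linking the element with this metaid to a resource."""
    annotation = ET.Element(f"{{{SBML_NS}}}annotation")
    rdf = ET.SubElement(annotation, f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"#{metaid}"})
    relation = ET.SubElement(description, f"{{{BQBIOL_NS}}}{qualifier}")
    bag = ET.SubElement(relation, f"{{{RDF_NS}}}Bag")
    ET.SubElement(bag, f"{{{RDF_NS}}}li",
                  {f"{{{RDF_NS}}}resource": resource_uri})
    return annotation


# Illustrative values: the metaid and the URI form are assumptions.
glucose_annotation = rdf_annotation(
    "meta_glc_in", "urn:miriam:obo.chebi:CHEBI%3A17234")
print(ET.tostring(glucose_annotation, encoding="unicode"))
```

The species element carrying this annotation would need a matching metaid attribute (here meta_glc_in) for the rdf:about reference to resolve.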

What should my software do when it encounters incorrect SBML?

Although an application can't be expected to detect all possible errors in an SBML document, it should do as much as it can to detect errors of syntax and self-consistency. Such errors indicate that something is clearly wrong and that whatever (or whoever) wrote the model made an error. You may want to double-check the validity of the model by testing it with the online SBML Validator. If the SBML file fails, the model should be rejected because it cannot be used as-is. (Incidentally, if you encounter consistent differences between an SBML specification and a software package that claims to be compliant with that specification, please report this to the sbml-interoperability mailing list.)

Detecting and handling incorrect SBML is different from detecting and handling an invalid model encoded in SBML.
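As a minimal sketch of the syntax and self-consistency checks mentioned above, the Python function below rejects text that is not well-formed XML and flags species that refer to compartments that are not defined. It covers only these two checks and assumes the Level 2 Version 4 namespace; it is in no way a substitute for the online SBML Validator.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version4"


def check_sbml(sbml_text):
    """Return a list of detected problems; an empty list means none were found."""
    try:
        root = ET.fromstring(sbml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    ns = {"s": SBML_NS}
    problems = []
    compartments = {c.get("id") for c in root.iterfind(".//s:compartment", ns)}
    for sp in root.iterfind(".//s:species", ns):
        if sp.get("compartment") not in compartments:
            problems.append(
                f"species '{sp.get('id')}' refers to undefined "
                f"compartment '{sp.get('compartment')}'")
    return problems


# Hypothetical inconsistent document: "nucleus" is never declared.
bad_doc = f"""<sbml xmlns="{SBML_NS}" level="2" version="4">
  <model id="m">
    <listOfCompartments><compartment id="cell" size="1"/></listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell"/>
      <species id="B" compartment="nucleus"/>
    </listOfSpecies>
  </model>
</sbml>"""

for problem in check_sbml(bad_doc):
    print(problem)
```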

How can I test whether I've implemented SBML support as intended?

The SBML Test Suite will provide a large set of input files and corresponding results, and allow you to test your software's implementation of SBML handling. The SBML Test Suite is currently under development and we expect to release it publicly in August 2008.

Questions about SBML features and their use

Why are non-biochemical features such as explicit equations included in SBML?

The aim of SBML is to enable the construction of quantitative models that describe both the activity of biochemical networks and interaction of biochemical networks and other phenomena. SBML allows the declaration of variables (non-constant parameters) and associated ODEs and DAEs to describe these phenomena. Examples of these phenomena include the mechanical force generated by muscle cells and the electrical potential across a synapse.

Why use MathML? It's much more complicated than text strings

Here is a partial list of motivations for why the switch to MathML was made in SBML Level 2, in no particular order:

  • The list of operators available in the text-string formula notation of Level 1 was judged to be limited. People wanted to expand the mathematical vocabulary to include additional functions (both built-in and user-defined), mathematical constants, logical operators, relational operators and a special symbol to represent time. Rather than growing the simple C-like syntax of Level 1 into something more complicated and esoteric in order to support these features, and consequently having to manage two standards in two different formats (XML and text string formulas), we chose to leverage an existing standard for expressing mathematical formulas in Level 2: the content portion of MathML.
  • There is no standard text-string formula syntax to choose from. The notation in Level 1 was inspired by C, but as many people have pointed out repeatedly, there are differences, and these differences need to be dealt with by software tools parsing the infix notation. Thus, this particular problem exists no matter what notation/encoding you choose; the infix text-string notation didn't offer an advantage in this particular regard. Now imagine if we had to grow the syntax to accommodate more operators, user-defined functions, etc. Even more people would complain about differences due to a non-standard mathematical syntax.
  • Related to the above: using MathML means we can avoid having to define reserved words for various language features, such as the time symbol and the delay function. MathML has a mechanism for introducing special terms and operators without having to define new identifiers in the language. Without MathML, we would have had to choose arbitrarily an identifier for each of those quantities, and every new one that was deemed important in the future. Parsing and generating expressions using these identifiers would be problematic in tools that used different built-in symbol values (for example, if a tool uses 't' instead of 'time' for the time symbol).
  • Using MathML allows us to extend SBML without introducing new non-XML syntax. For example if we wanted to introduce some form of modularity we might want a '.' operator in expressions to reference components of submodel instances. We could agree on the introduction of a MathML operator to do this which would be tool neutral rather than again creating an arbitrary syntax, that tools would have to parse, which may or may not be similar to that used within the tools.
  • Whether you parse formulas written as text strings, or parse formulas written as MathML, your software still needs to build up expression trees. Once that's done, there is in principle not much difference between the two.
  • MathML is proper XML, which means that tools using XML parsers can work with it directly. Authors do not have to write a different kind of parser for the text-string infix syntax; they can use a generic XML parser if they wish. Further, libraries specialized for MathML could be used by software developers, possibly saving development time and effort. (Of course, the use of libSBML isolates software tools from all this even further.)
  • Making SBML all-XML means that SBML is more amenable to tools that can process, manipulate and store XML, such as (e.g.) XSLT, XQuery, XPath, and other XML technologies. To give an example of the power of this, it has made it possible to write XSLT transformations to take CellML 1.1 to SBML Level 2. It would have been difficult to construct text-string formulas from CellML reaction definitions using XSLT transformations.

All that said, there are some disadvantages to using MathML in SBML. One is that by introducing MathML part-way through the evolution of SBML, we have created a legacy support problem by having two formula representations with which to contend and interconvert. Another is that people perceive MathML to require greater effort to support, but whether this is true in practice depends on the underlying system. For some applications, it is actually easier to parse and handle MathML than a text-string representation of mathematical formulas, because the MathML expression structure is already made explicit and can be read using available XML software.
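To illustrate the point that a MathML expression already has its tree structure made explicit, here is a small Python sketch that walks a MathML fragment with a generic XML parser and prints an infix string. It handles only ci, cn and binary-or-longer applications of plus, minus, times and divide; the rate expression and the symbols Vmax, Km and S are hypothetical.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
OPERATORS = {"plus": "+", "minus": "-", "times": "*", "divide": "/"}


def local_name(tag):
    """Strip the "{namespace}" part that ElementTree adds to tags."""
    return tag.rsplit("}", 1)[-1]


def to_infix(node):
    """Convert a small subset of content MathML to a parenthesized infix string."""
    name = local_name(node.tag)
    if name == "math":
        return to_infix(node[0])
    if name in ("ci", "cn"):
        return node.text.strip()
    if name == "apply":
        operator = local_name(node[0].tag)
        operands = [to_infix(child) for child in node[1:]]
        if operator in OPERATORS and len(operands) >= 2:
            return "(" + f" {OPERATORS[operator]} ".join(operands) + ")"
        raise ValueError(f"unsupported application of <{operator}>")
    raise ValueError(f"unsupported element <{name}>")


# Hypothetical rate expression: Vmax * S / (Km + S).
mathml = f"""<math xmlns="{MATHML_NS}">
  <apply>
    <divide/>
    <apply><times/><ci> Vmax </ci><ci> S </ci></apply>
    <apply><plus/><ci> Km </ci><ci> S </ci></apply>
  </apply>
</math>"""

infix = to_infix(ET.fromstring(mathml))
print(infix)  # ((Vmax * S) / (Km + S))
```

Note that no tokenizer or precedence handling is needed: the nesting of apply elements is the expression tree.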

The SBML notion of a species seems peculiar, doesn't it?

Well, no, or yes, depending on your definition of "peculiar".

The SBML construct called species represents a pool, that is, a set of "things" that are treated as being indistinguishable from the standpoint of the processes (reactions) in which they participate. When the "same" species (a chemical or other thing) is present in different compartments, each must be treated as a different pool. The reason is that the concentrations or partial pressures differ between compartments, so the chemical activities differ as well. Also, because the pH of different compartments can differ, the electrochemical properties of a given chemical entity could be different (think about an enzyme in the cytosol and in a lysosome). Analytical software will therefore have to construct different state variables for the different pools, even if the pools contain the same kind of "thing". This is actually a common concept in biochemical simulation, dating back to some of the earliest simulation software.

If you need to express a link between species with different identifiers, you can use the species type construct available since SBML Level 2 Version 2.

Can I have two species with the same name attribute value?

Yes, this is perfectly legal SBML. Of course, you would only want to do that if the species are actually the same conceptual type of entity—you wouldn't want to give the same names to, say, glucose-6-phosphate and ATP in a model, because it wouldn't make any sense.

Species and compartment identifiers in SBML refer to "things" that can participate in dynamical behaviors, but each identifier does not have to refer to a single unique entity. It is possible that the same conceptual entity appears in multiple contexts in a model. Since a species must be given a unique identifier in each compartment in which it appears (see the answer to the previous question for an explanation of why), it is convenient to give the species definitions all the same names. It will usually make more sense to humans that way, and software can track the separate amounts of species in the different compartments by their identifiers.
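The distinction between unique identifiers and shared names can be sketched in a few lines of Python. The Species class and the example pools below are hypothetical stand-ins for whatever data structures a tool actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    id: str           # must be unique within the model
    name: str         # human-readable label; may be shared
    compartment: str


# Hypothetical pools: glucose appears in two compartments.
species = [
    Species("glc_out", "glucose", "medium"),
    Species("glc_in", "glucose", "cell"),
    Species("atp_in", "ATP", "cell"),
]


def pools_by_name(species_list):
    """Group species identifiers by their (possibly shared) name."""
    groups = defaultdict(list)
    for sp in species_list:
        groups[sp.name].append(sp.id)
    return dict(groups)


# Identifiers are what software tracks; they must not collide.
ids = [sp.id for sp in species]
assert len(ids) == len(set(ids))

print(pools_by_name(species))
```

Each identifier corresponds to its own state variable, while the shared name tells a human reader that glc_out and glc_in are the same kind of thing.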

Note that beginning with SBML Level 2 Version 2, there are explicit constructs for species types and compartment types. If you are using names to convey the idea that different entities are the same conceptual "thing" despite having different identifiers, you may want to indicate the relationship more strongly by defining common species types or compartment types, and then declaring the species/compartments to be of the appropriate types.

Why doesn't SBML Level 2 define a default compartment?

Software developers are sometimes bothered by the fact that SBML does not specify a default compartment; all compartments in SBML must be defined explicitly. There are several reasons for this:

  • A model that uses a single unit-volume compartment is making explicit an important underlying assumption about the model. Leaving it implicit would be more prone to errors.
  • SBML would have to define a reserved identifier to refer to the default compartment. This is a recipe for an eventual identifier collision when someone, somewhere, accidentally uses the same identifier.
  • A default compartment would only save effort in developing the SBML writing component of a software tool. The writing component is the easy part; reading and interpreting is the harder part. Defining a default compartment would not help readers much, if at all.
  • A default compartment would be a special case which all SBML parsing programs would have to handle specially.

How do you represent models that don't define a compartment?

It will be necessary to create a compartment in the SBML representation of the model. One approach is to locate all species in a single compartment with unit volume. The default units system of SBML will ensure that this unit volume representation is exactly equivalent to a model dealing with concentrations, including rate laws defined in substance/volume/time units.

When making changes like this to accommodate SBML requirements, it is a good idea to write a note (perhaps stored in a <notes> element inside the top-level <model> element) explaining what has been done. This will help future readers of the SBML file to understand why certain choices were made.

Why is there a distinction between "assignment" and "algebraic" rules? Aren't they equivalent?

Although it is typically easy to transform between assignment and algebraic rules, SBML provides separate constructs for them, for the following reasons:

  • Algebraic rules define the point in the model where there is a circular dependency between variables. For instance, the equations x = 2 * y and y = x + 1 have a circular dependency. It is not possible to form such a dependency in assignment rules (see the SBML Level 2 specification). At least one of the example equations would have to be encoded as an algebraic rule in SBML.
  • Many tools are not capable of supporting algebraic rules (DAEs).
  • Those tools that do support algebraic rules make the distinction between assignment rules and algebraic rules.
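
To make the circular-dependency example concrete, here is a minimal Python sketch (not taken from any SBML tool). It shows that evaluating x = 2 * y and y = x + 1 as a sequence of assignments never settles, whereas treating them as simultaneous equations gives a unique solution. The function names are invented for this illustration.

```python
def iterate_assignments(passes):
    """Naively evaluate the two equations as ordered assignments, starting from 0."""
    x, y = 0, 0
    for _ in range(passes):
        x = 2 * y
        y = x + 1
    return x, y


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 using Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system is singular")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y


# x = 2*y    is rewritten as   1*x - 2*y = 0
# y = x + 1  is rewritten as  -1*x + 1*y = 1
x, y = solve_2x2(1, -2, 0, -1, 1, 1)
print(x, y)

# The assignment-style evaluation keeps changing from pass to pass.
print(iterate_assignments(3), iterate_assignments(4))
```

A tool that supports only assignment rules has no way to evaluate such a pair; a tool that supports algebraic rules needs a simultaneous (DAE-style) solver like the one sketched here.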

Why can't user-defined functions be recursive in Level 2?

Function definitions in SBML Level 2 are designed so that the function body can be substituted in place of each function call; that is, they are deliberately defined so that software tools can treat them like macros rather than functions. This would not be possible if functions were allowed to be recursive.
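
The following Python sketch illustrates the macro idea under simplifying assumptions: expressions are nested tuples of the form (head, arg1, arg2, ...), variables are strings, and a function definition is a pair (parameter names, body). This representation and the names expand and substitute are invented for the example; real tools work on MathML trees.

```python
def substitute(expr, bindings):
    """Replace parameter names in expr with the bound argument expressions."""
    if isinstance(expr, str):
        return bindings.get(expr, expr)
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(substitute(e, bindings) for e in expr[1:])
    return expr  # numeric literal


def expand(expr, functions, active=()):
    """Inline every call to a user-defined function, macro style."""
    if not isinstance(expr, tuple):
        return expr
    head = expr[0]
    args = [expand(a, functions, active) for a in expr[1:]]
    if head not in functions:
        return (head, *args)  # built-in operator: keep as is
    if head in active:
        # Inlining a recursive definition would never terminate.
        raise ValueError(f"recursive function definition: {head}")
    params, body = functions[head]
    inlined = substitute(body, dict(zip(params, args)))
    return expand(inlined, functions, active + (head,))


functions = {"double": (("x",), ("times", 2, "x"))}
expanded = expand(("plus", ("double", "y"), 1), functions)
print(expanded)

# A hypothetical recursive definition cannot be expanded this way.
recursive = {"f": (("n",), ("times", "n", ("f", "n")))}
try:
    expand(("f", 3), recursive)
    recursion_detected = False
except ValueError:
    recursion_detected = True
```

After expansion no calls to user-defined functions remain, which is exactly what lets a simulator ignore function definitions once the model has been loaded.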

Why doesn't SBML provide a way to define constants?

It does. Use the SBML parameter construct and set the attribute constant to true. See the next question.

Are you saying that parameters may not be constant in SBML? That's crazy talk!

Yes, that's what we're saying, but it's not crazy talk. There are at least two reasons for doing it this way:

  • The object data structure defining a variable (other than species or compartment) and a constant would be nearly identical. The only difference is that one would be called constant and the other allowed to vary. SBML simply uses a more parsimonious representation involving the use of just one object, with a flag, constant, indicating whether the symbol value is constant during a simulation.
  • Some modelers and software systems actually do use the concept of time-varying parameters. See, for example, this FAQ item from SAAM II. In SAAM II, "any parameter could in fact be defined as time-varying".

And you probably thought we were just making this stuff up!

Why was the constant attribute on species and compartments introduced?

If a model does not contain algebraic rules, it is possible to infer which components (species, compartments, and parameters) are meant to change in value by examining the set of assignment rules, rate rules, and reactions. However, if a model contains algebraic rules, you need information about which symbols are meant to be variables and which constants, to solve the system of equations. The mere occurrence of a symbol in an algebraic rule doesn't imply that the symbol is a variable.

OK, you ask, but why have a constant flag on species? Why not define constant species as parameters instead? Well, if you define something as a parameter, you lose information about the nature of that symbol: a parameter has nothing on it that says whether it is meant to be the amount of a species, or a volume of a compartment, or a numeric constant of any other kind. However, such semantic information may be useful in a model—it may be useful for a model interpreter to be able to determine that a particular symbol is meant to be treated as a species amount or concentration. For example, graphical editors may be able to use this information. Defining something as a species says to an SBML reader, "this is a species, not some numerical or physical constant that defines a characteristic of the system". More detailed semantic information can of course be added using SBO labels or other annotations, but these annotations are not widely supported, so SBML provides these very primitive facilities to help software get by.

What is this "boundary condition" business?

The boundaryCondition qualifier codifies the following notion: In some systems of reactions, certain chemical species are unchanged by the reactions in the system. For example, this might happen if there is a vast overabundance of the species compared to the other ones, or the particular species is maintained (buffered) by some external means. In models of such systems, it is important to be able to indicate that the model interpreter should not generate an ODE for that species. In SBML, this is indicated by setting the boundaryCondition attribute of that species to "true".

A critical point in the previous paragraph is the part that says the model interpreter should not generate an ODE from the system of reactions for species that are labeled as boundary conditions. It does not mean that there cannot be an ODE or other constraint generated as a result of a rule in the model. You can have boundaryCondition="true" for a species and that species can still appear on the left-hand side of an ODE if there is an SBML Rule for it.

Summary: boundaryCondition="true" means do not generate an ODE for this species from the stoichiometry matrix of the system of reactions.
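
The summary above can be sketched in a few lines of Python. The data layout (plain dictionaries) and the species and reaction identifiers are assumptions made for this illustration; the point is only that boundary species are skipped when terms are collected from the reactions.

```python
def reaction_ode_terms(species, reactions):
    """Collect, per species, the (coefficient, reaction id) terms from reactions.

    Species flagged as boundary conditions are skipped: no reaction-derived
    ODE is generated for them, although a rule could still define one.
    """
    terms = {
        sid: []
        for sid, attrs in species.items()
        if not attrs.get("boundaryCondition", False)
    }
    for rid, rxn in reactions.items():
        for sid, stoich in rxn.get("reactants", {}).items():
            if sid in terms:
                terms[sid].append((-stoich, rid))
        for sid, stoich in rxn.get("products", {}).items():
            if sid in terms:
                terms[sid].append((stoich, rid))
    return terms


# Hypothetical model: S_ext is buffered by some external means.
species = {
    "S_ext": {"boundaryCondition": True},
    "S_in": {"boundaryCondition": False},
    "P": {"boundaryCondition": False},
}
reactions = {
    "uptake": {"reactants": {"S_ext": 1}, "products": {"S_in": 1}},
    "convert": {"reactants": {"S_in": 1}, "products": {"P": 2}},
}

terms = reaction_ode_terms(species, reactions)
print(terms)
```

S_ext still takes part in the uptake reaction (its value can appear in that reaction's rate law), but it receives no entry of its own.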

Why can't you assign different units of time to (e.g.) event delays?

SBML Level 2 Version 1 provided this capability. It defined unit attributes on various SBML components such as kinetic laws and event delays, letting a model redefine units for individual quantities. Unfortunately, this turned out to introduce serious practical problems. First, one could construct models in which it was impossible, without additional information, to convert quantities to the same consistent units throughout the model (a necessary prerequisite to constructing a system of equations from the model definition). Second, and in practice more important, the freedom to reassign units in so many different contexts may have been convenient for model writers, but it made it hugely more difficult for model readers to interpret a model—it placed a large burden on the software interpreting a model. And third, it was much more error prone, with modelers creating models where they did not realize they had made unintended errors in unit consistency.

SBML Level 2 Version 2 removed most of the places where units could be redefined on individual components, but left some (notably, the time units on event delays). SBML Level 2 Version 3 further removed these attributes. These actions were taken based on the experiences of SBML users and developers. (See, for example, this discussion thread from 2005.)

A parameter has no units declared; what units does it have?

SBML assumes that the parameter has the units appropriate for its use within a model. In some cases it may be possible to derive these units from a mathematical expression using the parameter, assuming that the units of all other parts of the expression are known.

However, if parameters with undeclared units are used, checking unit consistency becomes difficult, if not impossible. It is therefore advisable, where possible, to include units for parameters within a model.

I want to use fractional exponents on units, how can I do this?

The SBML unit construct restricts the attribute exponent to an integer value. Thus, it is not possible to explicitly declare a unit with a fractional exponent. There are also restrictions on the units of expressions to which power or root functions may be applied. These restrictions are required to ensure that parameters and mathematical expressions used within SBML are physically sensible.

It is possible to overcome the restrictions by declaring additional parameters, with appropriate units, that can be used to normalise values within expressions. For example, consider an expression such as [A]^{1/2} * [B]^{1/2}, where [] denotes a concentration. This would not be a valid expression within SBML since it produces intermediate units of concentration^{1/2}. To correctly encode this, declare a parameter p, with value 1 and units equal to the units of concentration. Using this parameter and rewriting the expression as ([A]/p)^{1/2} * ([B]/p)^{1/2} * p produces the same numeric result, whilst preserving physically sensible units at all stages of the calculation.
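
The rewriting can be checked with a small Python sketch that carries a unit exponent alongside each value. The representation, a pair (value, exponent of the concentration unit), and the sample values 4 and 9 are assumptions chosen for the illustration.

```python
import math
from fractions import Fraction


def mul(q1, q2):
    """Multiply two quantities: values multiply, unit exponents add."""
    return (q1[0] * q2[0], q1[1] + q2[1])


def div(q1, q2):
    """Divide two quantities: values divide, unit exponents subtract."""
    return (q1[0] / q2[0], q1[1] - q2[1])


def sqrt(q):
    """Square root: the unit exponent is halved."""
    return (math.sqrt(q[0]), q[1] / 2)


A = (4.0, Fraction(1))  # a concentration
B = (9.0, Fraction(1))  # a concentration
p = (1.0, Fraction(1))  # normalising parameter: value 1, concentration units

# Original form: the intermediate sqrt(A) has units of concentration^(1/2).
naive_intermediate = sqrt(A)
naive_result = mul(sqrt(A), sqrt(B))

# Rewritten form: every intermediate has an integer unit exponent.
steps = [div(A, p), sqrt(div(A, p)), div(B, p), sqrt(div(B, p))]
rewritten_result = mul(mul(steps[1], steps[3]), p)

print(naive_intermediate, naive_result, rewritten_result)
```

Both forms give the same value in concentration units; only the rewritten one avoids a fractional exponent along the way.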

Does the 'same units' in assignments mean dimensionally or actually equivalent?

It means they must actually be the same!

There are several constructs in SBML where a mathematical expression can be used to assign a value to a variable (species, compartment or parameter) within the model. The specification states that the units of both sides of the equation should be the same. This refers to the actual physical unit, not the dimensionality: a metre is not the same as a foot!
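
The difference between "same dimension" and "same units" can be sketched in Python as follows. The Unit class loosely mirrors the kind, exponent, scale and multiplier attributes of an SBML unit, but the class itself and the helper functions are assumptions for this example, not part of any SBML library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    kind: str
    exponent: int = 1
    scale: int = 0
    multiplier: float = 1.0

    def factor(self):
        """Conversion factor relative to the base unit of this kind."""
        return (self.multiplier * 10 ** self.scale) ** self.exponent


def same_dimension(u, v):
    """True when both units measure the same kind of quantity."""
    return (u.kind, u.exponent) == (v.kind, v.exponent)


def same_units(u, v):
    """True only when the units are also numerically identical."""
    return same_dimension(u, v) and math.isclose(u.factor(), v.factor())


metre = Unit("metre")
foot = Unit("metre", multiplier=0.3048)
millimetre = Unit("metre", scale=-3)
also_millimetre = Unit("metre", multiplier=0.001)

print(same_dimension(metre, foot), same_units(metre, foot))
print(same_units(millimetre, also_millimetre))
```

Metre and foot pass the dimensional check but fail the stricter one, which is the check that the "same units" wording refers to.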

Why does SBML Level 2 require an XML declaration?

Readers familiar with XML may note that XML version 1.0 does not require an XML declaration; the requirement was introduced in XML 1.1. Nonetheless, SBML Level 2 requires the declaration. The motivation comes from the practical experiences of SBML software developers, who have found that different XML parsers on different operating systems make different default assumptions if the XML declaration is omitted. Requiring the declaration of the XML version and encoding is an aid to greater compatibility between different systems exchanging SBML.

What is the hasOnlySubstanceUnits attribute for?

Broadly speaking, a value of true for hasOnlySubstanceUnits on a species declaration means that wherever the species' identifier appears in a mathematical formula, its units are to be interpreted as substance units only, and not substance/size (i.e., concentration or density) units. Note that this is regardless of how the species' initial quantity is defined: no matter whether the species is given a concentration or a substance value, if it has hasOnlySubstanceUnits=true, then the identifier of the species always stands for substance units.
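
A minimal Python sketch of this interpretation follows. The function name and the sample numbers are invented for the illustration, and it assumes a compartment with a non-zero size.

```python
def species_symbol_value(amount, compartment_size, has_only_substance_units):
    """Value that a species identifier stands for inside a formula.

    amount is the quantity of the species in substance units, and
    compartment_size is the size of the compartment that contains it.
    """
    if has_only_substance_units:
        return amount                   # substance units
    return amount / compartment_size    # substance/size units


# A species declared with an initial concentration of 5 in a compartment
# of size 2 holds an amount of 10 in substance units.
initial_concentration = 5.0
compartment_size = 2.0
amount = initial_concentration * compartment_size

as_amount = species_symbol_value(amount, compartment_size, True)
as_concentration = species_symbol_value(amount, compartment_size, False)
print(as_amount, as_concentration)
```

The initial value was given as a concentration in both cases; only the flag decides what the identifier means in formulas.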

This is an admittedly badly-named attribute. A better name might have been "symbolMeansAmount" or "hasSubstanceValue". Despite the poor name, the consensus among SBML people is that changing the name is not worth the cost and hassle of the backward incompatibilities a name change would create.

What is the symbol for time in SBML?

The way to access time (i.e., the current time in "simulation time") is using the MathML <csymbol> construct. This is probably easiest to explain using an example:

<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <plus/>
    <ci> x </ci>
    <csymbol encoding="text" definitionURL="http://www.sbml.org/sbml/symbols/time">
      t
    </csymbol>
  </apply>
</math>

The expression above encodes the formula x + time, where time signifies the current point in time during a simulation.

Important: there is frequent confusion around the purpose of the content of the <csymbol> element (i.e., the t in the example above). It is meaningless and inaccessible to simulations. The t has no relationship (except perhaps accidental) to whatever symbol might represent time in a given software environment or model. According to the MathML specification, software tools may display this content (again, the t) to users, so very often, the content of the <csymbol> is chosen to be something evocative. But the actual entity representing time in an SBML model is the <csymbol> element itself, not the content of this element. In summary, don't pay attention to t.
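
As a rough sketch of how a reader might detect the time symbol, the Python code below looks only at the definitionURL attribute and ignores the element content entirely. It uses the standard-library xml.etree.ElementTree module; the function name uses_time and the sample expressions are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
TIME_URL = "http://www.sbml.org/sbml/symbols/time"


def uses_time(mathml_text):
    """Return True if the expression refers to simulation time."""
    root = ET.fromstring(mathml_text)
    return any(
        cs.get("definitionURL") == TIME_URL
        for cs in root.iter(f"{{{MATHML_NS}}}csymbol")
    )


def make_expr(second_operand):
    """Build 'x + <second operand>' as a MathML string."""
    return (
        f'<math xmlns="{MATHML_NS}"><apply><plus/><ci> x </ci>'
        f"{second_operand}</apply></math>"
    )


with_t = make_expr(f'<csymbol encoding="text" definitionURL="{TIME_URL}"> t </csymbol>')
with_other_label = make_expr(
    f'<csymbol encoding="text" definitionURL="{TIME_URL}"> banana </csymbol>'
)
plain_t = make_expr("<ci> t </ci>")

print(uses_time(with_t), uses_time(with_other_label), uses_time(plain_t))
```

The label inside the csymbol makes no difference to the result, and an ordinary identifier that happens to be called t is not time.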

Why doesn't SBML require consistent units?

SBML Level 2 Version 4 and SBML Level 3 do not require models to have units declared or to have consistent units—correctness and consistency of units is not a condition for a valid SBML encoding of a model. This may seem strange, so some words of explanation are warranted. The decision to relax requirements of unit consistency was made via a community vote in 2007 and it represents a change from Level 2 Version 3. The realization that this position had to be taken resulted from many people's long experience with encoding models. There are multiple reasons for the decision, but probably the most convincing argument is the following. There exist models in the published literature that have inconsistent units. Regardless of what one thinks about such models, if we want to allow SBML to encode them as published, SBML cannot require consistency of units as a precondition of a valid SBML encoding. If an inconsistency were treated as an error of SBML encoding, then it would be impossible for SBML to encode such models.

Why can't I use the <rem> operator in SBML MathML?

When MathML was adopted in place of infix strings, it was decided to keep the MathML subset allowed in SBML documents as small as possible. The allowed subset closely mirrors what was allowed in the infix format. This was done to ensure rapid adoption of MathML.

For these reasons, some relatively straightforward MathML operators such as <rem> (remainder) were omitted. It should be noted, however, that it is relatively easy to implement the missing functions using user-defined functions. Taking the remainder as an example, we could rewrite the rem operator as:

 a rem b = a - b*floor(a/b)

which in SBML would look like this:

<functionDefinition id="rem" name="remainder">
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <lambda>
      <bvar>
        <ci> a </ci>
      </bvar>
      <bvar>
        <ci> b </ci>
      </bvar>
      <apply>
        <minus/>
        <ci> a </ci>
        <apply>
          <times/>
          <ci> b </ci>
          <apply>
            <floor/>
            <apply>
              <divide/>
              <ci> a </ci>
              <ci> b </ci>
            </apply>
          </apply>
        </apply>
      </apply>
    </lambda>
  </math>
</functionDefinition>

To use this remainder in another MathML expression, one would call it like any other user-defined function:

<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <ci> rem </ci>
    <ci> a </ci>
    <ci> b </ci>
  </apply>
</math>

Note: Strictly speaking the remainder operator is:

 a rem b = a - b*truncate(a/b) 

where truncate always rounds toward zero. If the difference matters for a model, the <floor/> expression in the definition above can be replaced by a piecewise expression:

 piecewise(floor(a/b), gt(a/b, 0), ceil(a/b)) 

to read:

 
<functionDefinition id="rem" name="remainder">
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <lambda>
      <bvar>
        <ci> a </ci>
      </bvar>
      <bvar>
        <ci> b </ci>
      </bvar>
      <apply>
        <minus/>
        <ci> a </ci>
        <apply>
          <times/>
          <ci> b </ci>
          <piecewise>
            <piece>
              <apply>
                <floor/>
                <apply>
                  <divide/>
                  <ci> a </ci>
                  <ci> b </ci>
                </apply>
              </apply>
              <apply>
                <gt/>
                <apply>
                  <divide/>
                  <ci> a </ci>
                  <ci> b </ci>
                </apply>
                <cn type="integer"> 0 </cn>
              </apply>
            </piece>
            <otherwise>
              <apply>
                <ceiling/>
                <apply>
                  <divide/>
                  <ci> a </ci>
                  <ci> b </ci>
                </apply>
              </apply>
            </otherwise>
          </piecewise>
        </apply>
      </apply>
    </lambda>
  </math>
</functionDefinition>
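
To see when the two definitions differ, here is a small Python sketch of both formulas. The function names rem_floor and rem_trunc are invented for this illustration; math.fmod from the standard library is used only as an independent check of the truncating variant.

```python
import math


def rem_floor(a, b):
    """a - b*floor(a/b), as in the first function definition."""
    return a - b * math.floor(a / b)


def rem_trunc(a, b):
    """a - b*truncate(a/b), using the same piecewise test as the second definition."""
    q = a / b
    n = math.floor(q) if q > 0 else math.ceil(q)
    return a - b * n


# For operands of the same sign the two agree.
print(rem_floor(7, 2), rem_trunc(7, 2))

# For operands of opposite sign they differ.
print(rem_floor(-7, 2), rem_trunc(-7, 2))
```

The floor-based form always returns a result with the sign of b, while the truncating form returns a result with the sign of a, so the choice only matters for models in which the operands can have opposite signs.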

Questions about the SBML development process

Where did the name "SBML" come from?

When SBML was first conceived, around the year 2000, Hiroaki Kitano suggested the name Systems Biology Markup Language. The name stuck.

What is the overall SBML development process?

SBML development has been and continues to be motivated and directed by the systems biology community. The process is managed by the SBML Editors (see next question), but they do so under the control of the community. The editors collect proposals for changes to SBML from the SBML Working Groups and from other groups and individuals, and then seek to establish a consensus in the community about how to proceed with the proposals. With this information, the editors assemble some of the proposals into a draft specification for a new edition of SBML. After this draft has been reviewed by the community, it becomes a final specification for the new edition of SBML. (Edition in this context can be either a new SBML Level, or a new version of an existing level, or a new release of an existing version.)

Who are the SBML Editors?

The SBML Editors are listed on a separate page on the sbml.org website.

What are the "SBML Forum" Meetings?

These are biannual face-to-face meetings of the SBML community. The formal title of the meetings is the Workshops on Software Platforms for Systems Biology. They are held as satellite workshops of the annual International Conference on Systems Biology (ICSB), usually in the fall or early winter of every year. SBML Forum meetings allow for significant discussion of new SBML proposals and interoperability issues. Presentations and other materials of every meeting are archived in the Events area of the SBML.org website.

Why isn't SBML part of a standards process like OMG?

Some time ago, the SBML Editors at the time considered submitting SBML as a proposal to the Object Management Group (OMG) in response to a request for proposals (RFP) for pathways representations. However, the SBML community decided at the 7th Forum meeting that while it would be useful to have the endorsement of a standards body like the OMG, people's time and resources would be better spent working on SBML development rather than conforming to all the standards requirements of the OMG process.

This does not rule out the possibility of seeking standards-body recognition sometime in the future.

How do I report errors and issues in the SBML specifications?

Please use the issue tracking system in the SourceForge project area for SBML. To view reported issues in the tracker, use the Category pull-down menu to select the desired SBML Level+Version (e.g., "SBML Level 2 Version 3"), and then click "Browse".

How do I propose changes or additions for SBML Level X?

There are several ways, with the first one below being preferred because it's the quickest and easiest:

  1. Start a discussion on the sbml-discuss list/forum. This is sure to provoke a response. Doing so also helps find out whether the capability is already in SBML in some other form, because someone will point out if it is.
  2. You can also attend an SBML event, in particular the annual SBML Forum meetings, where proposed changes to SBML are a major discussion topic.
  3. Finally, if you are shy or just want to pose a question in advance of making public statements, you can send email to the SBML Editors.

SBML development is too slow—can't it be faster?

This is a can't-win situation. The archives of the sbml-discuss mailing list as well as anecdotes from SBML workshops show that for every person who complains about SBML development being too slow, there is another who complains SBML is changing too rapidly. It seems impossible to please everyone.

Where does the funding come from to keep SBML development going?

The initial development of SBML from its inception through the year 2003 was principally funded by the Japan Science and Technology Agency under the ERATO Kitano Symbiotic Systems Project headed by Hiroaki Kitano. Many agencies and commercial organizations supported smaller parts of overall SBML development as well as workshops and travel expenses. Many more academic organizations supported people who spent considerable time working on SBML and related projects even though it was not an official aspect of their research. Since 2003, the primary source of stable funding has been the National Institute of General Medical Sciences under grant GM070923 to Michael Hucka (Chair of the SBML Editors). A more detailed list of funding acknowledgments is available on a separate page.

Miscellaneous questions

I have some SBML that hasn't been formatted nicely. Is there a way to clean it up?

LibSBML includes a demo program that simply echoes whatever SBML is given to it, and in the process of writing the output, it does a pretty reasonable job of pretty-printing the XML. The reformatting facility is actually built into libSBML (it's what libSBML does automatically), not the demo program, which means any application you build with libSBML will do it too.
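
If libSBML is not at hand, a generic XML pretty-printer can give a similar result. The sketch below uses Python's standard-library xml.dom.minidom module; it is a general-purpose alternative rather than the libSBML facility described above, and the one-line model is made up for the example.

```python
from xml.dom import minidom

# Hypothetical, unformatted SBML-like input on a single line.
raw = '<sbml><model id="m"><listOfSpecies/></model></sbml>'


def pretty_print(xml_text, indent="  "):
    """Return xml_text re-indented, one element per line."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


pretty = pretty_print(raw)
print(pretty)
```

Note that this approach knows nothing about SBML; it only re-indents the XML and works best on input that has no existing line breaks or indentation.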

If you want a more general solution with more control over formatting, you may want to look at HTML Tidy, a free, open-source, general-purpose pretty-printer which (despite its name) will work with XML too. It can be embedded into applications.

What is the difference between sbml.org and sbml.info?

There is no difference. They are alternative names for the same website, provided as a convenience to SBML users and web searchers. We tend to refer to "the SBML site" or "the SBML portal" as being http://sbml.org, but the other address should work just as well.

Who runs sbml.org?

The SBML Team maintains the server at the BNMC, located at the California Institute of Technology, in Pasadena, California, USA.

What's the difference between all the SBML mailing lists?

There are currently six mailing lists. The first five listed below all have web interfaces for people who prefer to interact with them that way; the last one is an automatic notification list hosted by SourceForge.net.

  • sbml-announce: Ultra low-volume, broadcast-only list for announcements of high importance to the SBML community, such as new releases of SBML specifications and upcoming community events.
  • sbml-discuss: Main list for SBML development and community interaction. Announcements of new or updated SBML-compatible software are accepted, but no other advertisements are permitted.
  • sbml-interoperability: Discussions of use and interoperability of all software that supports SBML. LibSBML questions and other topics are perfectly acceptable here.
  • libsbml-development: Technical discussions specifically about libSBML and its development, including requests for new features and questions about its operation.
  • jsbml-development: Technical discussions specifically about jsbml (a new Java SBML library) and its development.
  • sbml-svn: Automatic broadcast-only list receiving mail whenever a change is committed to the SVN repository. (There is only a single list for all changes to the SBML project. SourceForge.net does not provide facilities for limiting the notifications to a subproject. You may wish to set up your own mail filters to do that.) This list has no web forum or RSS feed.


Under no circumstances will we willingly divulge the memberships of the mailing lists to third parties.

Does SBML have a Twitter feed?

Yes! It's @sbmlnews.

Retrieved from "http://sbml.org/Documents/FAQ"

This page was last modified 22:17, 15 July 2010.