SBML Level 3 Arrays and Sets
Last update to this page: 2008-9-19.
The arrays and sets proposals for SBML are concerned with supporting mathematical arrays and sets of components such as compartments, species, submodels, and other SBML entities. The primary motivation for proposing this capability is that many types of models use large numbers of more-or-less identical components, and it is convenient (if not practically necessary in very large models) to be able to use an indexing scheme to reference these components. (Or as Bruce Shapiro put it succinctly in 2006, "Arrays allow us to describe a bunch of stuff without listing every item explicitly every time".) Another motivation is the desire to support models having elements and structures whose spatial geometries are not important; these elements might more conveniently be referenced as indexed entities rather than individually-named entities. A final motivation is to support abstract mathematical models.
Use-case motivating the need for this package
Here is an example model created by Bruce Shapiro. The model is based on the "Activator Model" in Fig. 4 of the 2005 paper by Jönsson et al. The model consists of 253 cells, each of which behaves according to a small set of equations. The cellular template is based on real data: a laser scanning confocal horizontal cross-sectional image of an Arabidopsis meristem with GFP-labeled nuclei. Cell-cell connectivity was derived using a Delaunay triangulation procedure, which analysis has shown to be approximately 97% correct in the meristem. The template looks like this (colors do not have significance here):
The model is generated (in xCellerator, a package written in Mathematica) using a relatively simple set of rules that establish the internal and external networks of interactions. Here is the core of the loop; this is in xCellerator notation, but one can get the basic idea even without knowledge of xCellerator:
The overall model that results from running the code to generate 253 cells in the configuration above contains 2025 internal reactions and 2119 intercellular reactions. The SBML model (without using arrays) can be viewed [here]. Using an array notation, the cells and reactions do not have to be named individually; they are simply referred to using integer indexes. This greatly simplifies the code and allows loops and other constructs to be used to manipulate the cells and reactions in the model.
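To make the indexing idea concrete, here is a minimal sketch in Python. It is not xCellerator code and not the syntax of any SBML arrays proposal; the function name `build_indexed_model`, the reaction template names, and the four-cell ring of neighbors are all hypothetical, chosen only to illustrate how integer indexes replace individually named components.

```python
def build_indexed_model(n_cells, neighbors, internal_templates):
    """Expand reaction templates over integer cell indexes.

    All names here are illustrative assumptions, not part of any
    SBML proposal or of xCellerator.
    """
    # One copy of every internal reaction template per cell index.
    internal = [
        f"{name}[{i}]"
        for i in range(n_cells)
        for name in internal_templates
    ]
    # One transport reaction per directed neighbor pair (i -> j).
    intercellular = [
        f"transport[{i},{j}]"
        for i in range(n_cells)
        for j in neighbors[i]
    ]
    return internal, intercellular


# Hypothetical connectivity: four cells arranged in a ring.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
internal, intercellular = build_indexed_model(
    4, neighbors, ["production", "degradation"]
)
```

With this kind of scheme, the model description stays the same size regardless of the number of cells: only `n_cells` and the neighbor table change, whereas a model without arrays must list every reaction in every cell by name.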
Current status of proposal
As of this writing (18 Sep. 2008), the last known publicly-issued proposals in this area are
These proposals largely overlap, but differ in their support for dynamic creation/destruction of components. Both of these proposals need significant updating to fit into the syntax and organization emerging for SBML Level 3. These proposals also need to be combined into one consensus proposal.
History of proposals
Proposals for supporting arrays and set notations have a long history in SBML. Here is a reconstruction in chronological order:
- Andrew Finney was probably the first to formulate a proposed SBML extension to support arrays in mid-to-late 2000, as part of a collection of proposals for new SBML development. The extension was discussed publicly on the occasion of the 3rd Workshop on Software Platforms for Systems Biology in 2001.
- Also at the 3rd Workshop on Software Platforms for Systems Biology in 2001, Eric Mjolsness and Bruce Shapiro discussed the need for indexed elements in the context of models where the numbers of structures and their interconnections change dynamically. This was not specifically a proposal for SBML, but Eric and Bruce later proposed (and Bruce implemented in software) an SBML extension, described below.
- Andrew Finney, Victoria Gor, Ben Bornstein, Eric Mjolsness and Hamid Bolouri continued to work on arrays in SBML, and Andrew presented a summary of the April 2002 proposal at the 5th workshop in 2002. Note that this work was done at a time when Level 2 was envisioned as essentially what is Level 3 today, and so the proposals and discussions refer to Level 2. (However, the 5th workshop was also where the SBML community decided that Level 2 would be an incremental update to Level 1 with fewer changes, so the landscape of proposals changed after this event. Subsequent proposals were made in relation to Level 3.)
- Eric Mjolsness made a renewed plea for why arrays are important and should be supported in SBML, at the 7th Workshop on Software Platforms for Systems Biology in 2003. At the same meeting, Andrew Finney presented a summary of an updated version of the Finney/Gor/Bornstein/Mjolsness/Bolouri proposal (this one dated March 6, 2003) for SBML array support, for SBML Level 3.
- Discussions in 2003 led to the question of whether sets should be supported along with arrays. Andrew Finney outlined a proposal for sets and collections on the sbml-discuss mailing list in September, 2003, and at the same time released an updated version of the arrays proposal by himself, Gor, Bornstein, and Mjolsness.
- Andrew went on to develop and then present an initial reference implementation of the sets/collections proposal at the 8th workshop in November, 2003. This reference implementation has not been maintained but the code is still available as the wildfire project in the SBML SVN repository.
- Bruce Shapiro went on to implement support for a subset of the Finney et al. arrays proposal in MathSBML for the Computable Plant project. Bruce presented this work at the 9th workshop in October of 2004.
- Bruce Shapiro, Victoria Gor and Eric Mjolsness developed a formal proposal for SBML Level 3 and issued it in December, 2004. This proposal differs from the previous arrays proposals by including support for dynamic array sizes and connection rules.
- At the urging of Mike Hucka, Bruce and Eric renewed their efforts to build interest in arrays for SBML at the 11th workshop held in October, 2006. They made two tag-team presentations, one by Eric and another by Bruce.
Additional reference material
The following is in no particular order.
- In the old SBML wiki pages for the SBML Arrays working group, there was a page discussing Indices or not indices.
- MathML 2.0 includes a notation for vectors, matrices, sets, and lists. (However, as noted by Bruce Shapiro in his presentation of October, 2006, the vectors and matrices provided in MathML 2.0 are not sufficient.)