|
Hello,
I very much support this proposal.
I am working on bridging SBML with BioPAX, where there is no
explicit species, but a species is always implied by a combination of
a location and an entity. Having a species type or entity in SBML
makes bridging easier.
As people begin reusing models, we face issues such as one species
type or compartment in one model corresponding to two species types or
compartments in another. Suppose "MFA" in one model is "MFA1" and
"MFA2" in another. Suppose you want to make a model of two cells
reusing a model of one cell. Think of modelling inter-cell
communication or cell division or cell fusion. If the software knows
that each species is an entity in a compartment, much can be automated
and verified.
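
As a rough illustration of the kind of check that becomes possible, here
is a minimal Python sketch. It assumes the "entity" attribute proposed in
the quoted message below, which is not part of any released SBML
specification. The XML fragment, the species ids and the helper names
(species_pairs, duplicate_pairs, compartments_of) are invented for this
example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fragment: the `entity` attribute follows the proposal quoted
# below and is NOT part of any released SBML specification.
SBML_FRAGMENT = """
<listOfSpecies>
  <species id="Arf1_Golgi" compartment="Golgi" entity="Arf1"/>
  <species id="Arf1_Cytosol" compartment="Cytosol" entity="Arf1"/>
  <species id="BFA_Cytosol" compartment="Cytosol" entity="BFA"/>
</listOfSpecies>
"""

def species_pairs(xml_text):
    """Map each species id to its (entity, compartment) pair."""
    root = ET.fromstring(xml_text)
    return {
        s.get("id"): (s.get("entity"), s.get("compartment"))
        for s in root.iter("species")
    }

def duplicate_pairs(pairs):
    """Return the (entity, compartment) pairs claimed by more than one species."""
    seen = defaultdict(list)
    for species_id, pair in pairs.items():
        seen[pair].append(species_id)
    return {pair: ids for pair, ids in seen.items() if len(ids) > 1}

def compartments_of(pairs, entity):
    """List the compartments in which an entity occurs, sorted by name."""
    return sorted(c for e, c in pairs.values() if e == entity)

pairs = species_pairs(SBML_FRAGMENT)
print(pairs["Arf1_Golgi"])
print(duplicate_pairs(pairs))
print(compartments_of(pairs, "Arf1"))
```

With the pair available directly, software does not have to guess the
entity from an id such as "Arf1_Golgi", and it can flag two species that
claim the same entity in the same compartment.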
Think of rule-based modelling. Maybe in the near future, SBML will
be able to express "This reaction happens both in the cytosol and in
the mitochondrial lumen"? Or how about a model that describes the
creation of an unspecified number of new compartments (think creating
vesicles or viruses)?
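
To make the first question concrete, here is a small Python sketch of
how a tool might expand one entity-level rule into one reaction per
compartment. SBML has no such construct today. The index, the species
ids and the function expand_rule are all hypothetical, and the sketch
assumes every species can be looked up by its (entity, compartment)
pair.

```python
# Hypothetical index from (entity, compartment) to species id.
SPECIES_INDEX = {
    ("ATP", "cytosol"): "ATP_cytosol",
    ("ADP", "cytosol"): "ADP_cytosol",
    ("ATP", "mito_lumen"): "ATP_mito_lumen",
    ("ADP", "mito_lumen"): "ADP_mito_lumen",
    ("ATP", "nucleus"): "ATP_nucleus",  # no ADP here, so the rule is skipped
}

def expand_rule(rule_id, reactants, products, compartments, species_index):
    """Instantiate an entity-level rule in every compartment that holds
    a species for each participating entity."""
    reactions = []
    for comp in compartments:
        try:
            lhs = [species_index[(e, comp)] for e in reactants]
            rhs = [species_index[(e, comp)] for e in products]
        except KeyError:
            continue  # some participant has no species in this compartment
        reactions.append((f"{rule_id}_{comp}", lhs, rhs))
    return reactions

reactions = expand_rule(
    "ATP_hydrolysis",
    reactants=["ATP"],
    products=["ADP"],
    compartments=["cytosol", "mito_lumen", "nucleus"],
    species_index=SPECIES_INDEX,
)
for reaction in reactions:
    print(reaction)
```

The lookup by (entity, compartment) is exactly what cannot be done
reliably when the entity is only implied by the species id or name.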
Annotations, at this stage, are not sufficient to identify a
species. Say we have a URI of a UniProt record. The record lists
lots of variants and modifications. How do we know which of these
variations and modifications are included? Some modifications may be
crucial for a model, others may be unknown or have no impact.
Take care
Oliver
On Mon, Sep 7, 2009 at 8:20 AM, Robert
Phair<rphair@integrativebioinformatics.com> wrote:
>
> Philosophical and ontological differences are always highlighted when a community of scholars attempts to codify a standard. One such difference was highlighted briefly at the recent SBML Forum at Stanford when it was announced that SpeciesType had been removed from the L3 core Public Review Draft.
>
> While we had invested significant time and resources coding to the L2 spec and making use of SpeciesType, we acknowledge that our use of SpeciesType was basically a hack. Nevertheless, I believe the purpose for which we adopted SpeciesType is important and universal and I would like to propose a simple addition to the L3 core specification that will meet this important and universal need.
>
> From my perspective, having not been present at the birth of SBML, but having built mathematical models of biological systems for more than 40 years, the SBML object Species is named and used in a way that hides its true ontological character and furthermore makes it impossible to write an algorithm that extracts from a ListOfSpecies the fundamental entities that define each Species.
>
> Here are some of the Species definitions from Alicia Smith's justly famous model of Ran transport. These were copied from BioModels.net Model 164.
>
> <listOfSpecies>
> + <species metaid="metaid_0000034" id="Carrier_Cytosol" name="Carrier_Cytosol" compartment="Cytosol" initialConcentration="11.8952664327711" spatialSizeUnits="litre">
> + <species metaid="metaid_0000035" id="Carrier_RanGTP_Cytosol" name="Carrier_RanGTP_Cytosol" compartment="Cytosol" initialConcentration="0.00182967434742422" spatialSizeUnits="litre">
> + <species metaid="metaid_0000036" id="RanGAP_Cytosol" name="RanGAP_Cytosol" compartment="Cytosol" initialConcentration="0.5" spatialSizeUnits="litre">
> + <species metaid="metaid_0000037" id="RanBP1_Cytosol" name="RanBP1_Cytosol" compartment="Cytosol" initialConcentration="2.91577340630959" spatialSizeUnits="litre">
> + <species metaid="metaid_0000038" id="RanBP1_Carrier_RanGTP_Cytosol" name="RanBP1_Carrier_RanGTP_Cytosol" compartment="Cytosol" initialConcentration="0.0842265936904004" spatialSizeUnits="litre">
> + <species metaid="metaid_0000039" id="NTF2_Nucleus" name="NTF2_Nucleus" compartment="Nucleus" initialConcentration="0.560888580955963" spatialSizeUnits="litre">
> + <species metaid="metaid_0000040" id="RanGDP_Nucleus" name="RanGDP_Nucleus" compartment="Nucleus" initialConcentration="0.0466849733424111" spatialSizeUnits="litre">
> + <species metaid="metaid_0000041" id="RCC1_Nucleus" name="RCC1_Nucleus" compartment="Nucleus" initialConcentration="0.4" spatialSizeUnits="litre">
> + <species metaid="metaid_0000042" id="RanGTP_Nucleus" name="RanGTP_Nucleus" compartment="Nucleus" initialConcentration="0.0118032373274648" spatialSizeUnits="litre">
> + <species metaid="metaid_0000043" id="NTF2_RanGDP_Nucleus" name="NTF2_RanGDP_Nucleus" compartment="Nucleus" initialConcentration="0.939111419044037" spatialSizeUnits="litre">
>
> A species is defined by an entity-Compartment pair. Most commonly, the entity is an SBO material entity such as a macromolecule or a simple chemical. The problem that we want to solve is simple:
>
> Given a Species definition, write an algorithm to extract the entity and the Compartment that define the Species.
>
> I assert that this is currently impossible without annotations, and that it is so fundamental a concept that it should be possible in the L3 Core. You can easily extract the Compartment. Why not the entity?
>
> Please understand that neither the species id nor the species Name fulfills this need. There are no rules on the syntax of either that would, in the general case, permit a program to extract the identity of the entity (e.g. molecule or molecular complex) from the species id or the species name.
>
> The encoders of BioModel 164 did as well as they could, and a human might well succeed in picking out the molecular complex Ran:GTP from the species name RanGTP_Nucleus. But given the L3 core spec alone, no general algorithmic solution to the problem of identifying the material entity is possible.
>
> The frequently heard proposal that entity information should be encoded in annotations suffers from multiple weaknesses, most of which were cited at the Forum but to no avail in rescuing SpeciesType from a premature demise.
>
> 1. application-specific annotations will not, in general, be interpretable by a program that imports an SBML file.
> 2. MIRIAM annotations rely on the presence, in some web resource, of exactly the molecule(s) you want to reference, including its post-translational modifications or its "active" or "inactive" state. Most such annotations terminate at isVersionOf and simply fail to distinguish among related molecules.
> 3. Research work frequently involves molecules that are not yet in any database. SBML should support work on the cutting edge as well as it supports work on textbook pathways.
>
> A more philosophical objection to the current state of affairs is that while Entity is one of the six primary controlled vocabularies in SBO, an SBML user has no way to map to the molecule/molecular complex branch of the SBO Entity tree. Table 5 on page 84 of the L3 Spec is quick to point out that SBML Compartments map to the SBO material entity branch, but it appears ontologically questionable to assert, as Table 5 does, that an SBML Species also maps to the SBO material entity branch.
>
> A Species is "Entity in Compartment," not just Entity. By saying that Species maps to material entity we are perpetuating the misconception that Species really does identify a molecule or a molecular complex. Indeed, I suspect that many people read Species and think chemical species. If you are fond of logical disputation, you could argue that an entity in an entity IS an entity, but this is not the point. The point is that our current definition of Species is incomplete.
>
> If you work with models that have only one compartment, you never encounter this difficulty because your Species name can be the name of the molecule/entity with no ambiguity. But as soon as you want to put the same molecule in multiple Compartments, the uniqueness requirement of Species id forces you to add some version of _Nucleus to your Species id. This, of course, will make sense to a human reader, but that is just not the point of XML.
>
> Thus I propose (based on conversations with Stefan Hoops and Jim Schaff, who bear no responsibility for this post, but whose support I would welcome) that we add to the L3 core a required
>
> <ListOfEntities>
> <entity id="Arf1" name="ARF1_HUMAN" other attributes? />
> <entity id="BFA" name="Brefeldin A" />
> </ListOfEntities>
>
> and insert a new required attribute in the Species element:
>
> <species id="Arf1_Golgi" name="ARF1_HUMAN in Golgi" compartment="Golgi" entity="Arf1" initialAmount="1e5" boundaryCondition="false" constant="false" />
>
> In summary, the L3 Public Review Draft spec defines a species (p.43) as "a pool of entities that (a) are considered indistinguishable from each other for the purposes of the model, (b) participate in reactions, and (c) are located in a specific compartment." This proposal aims to make it possible to know immediately what entity is referred to by any given Species element.
>
> A program reading an SBML file should not be left to guess the identity of the entity referred to in a Species element.
>
> A Species is (most often) a pool of molecules in a place. The Species element carefully defines the place with a required Compartment attribute; it should define the molecule equally carefully and the proposed "entity" attribute would do so in a simple, straightforward and useful way.
>
> ____________________________________________________________
> To manage your sbml-discuss list subscription, visit
> https://utils.its.caltech.edu/mailman/listinfo/sbml-discuss
>
> For a web interface to the sbml-discuss mailing list, visit
> http://sbml.org/Forums/
>
> For questions or feedback about the sbml-discuss list,
> contact sbml-team@caltech.edu
>
--
Oliver Ruebenacker, Computational Cell Biologist
BioPAX Integration at Virtual Cell (http://vcell.org/biopax)
Center for Cell Analysis and Modeling
http://www.oliver.curiousworld.org
|