Annotation package workshop 2010
19–21 May 2010
The purpose of this meeting was to work on defining a package for SBML Level 3 that supports more elaborate (full RDF) annotations, as well as to define any other types of annotation not mandatory in the Level 3 core.
The main page for the annot package L3 proposal is available from 
Neil Swainston, Dagmar Waltemath and Allyson Lister handled organization, while Nicolas Le Novère hosted and provided the location and infrastructure for the physical meeting. Using videoconferencing, attendees were distributed across two sites (EBI/Hinxton and University of Rostock). The actual number of groups involved was larger, but some group representatives traveled to one of the main sites to reduce the number of connections and increase local interactions and discussions. See the table below for a list of the attendees and where they joined from.
The organizers of each hub were responsible for organizing the travel and accommodation arrangements for that hub.
Audio-visual connection information
The lead sites were the EBI near Cambridge, UK, and Rostock, Germany (contact: Dagmar Waltemath). We used a combination of systems for remote connections: a Tandberg videoconferencing system link between the EBI and Rostock (at IP address 22.214.171.124) and EVO. See the separate page of instructions for information about EVO.
Detailed discussion topics
On the morning of the first day, we began with a more detailed list of possible topics constructed from everyone's input prior to the meeting, then developed the following condensed list of specific topics. We also assigned "owners" to indicate the lead groups or persons to take the first pass at discussing each topic.
There is currently also a page of general notes that is being modified dynamically and will probably be emptied by the end of the workshop.
Topics discussed in addition to those below included the following:
- Format: How should the annotation package be integrated with the SBML Level 3 core annotation style and the "free annotation" style with arbitrary RDF?
|#||Owner||Importance||Question||Notes page||See Also|
|1||EBI||High||In general, how does the annotation package relate to all other packages (current and future, core and non-core)? It could be that the packages decide what their annotation means.||Notes||Topics for Discussion #1, #10 (#3)|
|2||Rostock||Regular||How do we structure annotations about annotations? This includes evidence codes and which software or person added the annotation. How does this affect the History annotations? What would RDF reification look like compared with the current SBML "model history" RDF annotation? Would we need to implement rdf:id for things like rdf:Bag, so that one bag, for instance, could refer to another bag? In the history, how do we associate a particular modification with a particular person?||Notes||Topics for Discussion #2, #4|
|3||All||High||Can we / should we find a way to add more fine-grained annotations, e.g. to attributes of XML elements? E.g. we want to say that a particular rate is the result of a particular bit of data, with a way to specify a particular piece within a linked data set.||Notes||Topics for Discussion #4, #9|
|4||EBI||High||In order to solve the problem of precedence, would the nesting of qualifiers help? We might need several layers of qualifications. One example is conflicting annotations with hasPart (it's more important to know it's a complex than that it's a version of a protein) and isVersionOf (e.g. hasPart cdk2 and isVersionOf cdk2 for the MPF complex). One answer might be using rdf:alt, or perhaps collections rather than containers. How can we describe boolean operators? Does this cover AND/OR/NOT? (NOT 1: This protein is not P12345, or this protein is not phosphorylated (is this a useful example?); NOT 2: a value was tested for, and not found. Should we include this information? Someone could choose to ignore the annotation, so should annotation be used to negate a value?)||Notes||Topics for Discussion #6, #7, #15|
|5||Rostock||Regular||Is there some way to use annotations to provide unique identifiers that would be valid among multiple models and not just within a model? Perhaps by linking a metaid of a model to a unique identifier of your own within the Annotation section. SBRML etc. to link model to data. Therefore, is #9 in scope for SBML or should the linking be the other way around?||Notes||Topics for Discussion #8, #9|
|6||All||Regular||Cross-element annotation. How can we have something in the RDF that allows reference to another metaid from another element? Could say that parameter X is a property of species Y.|
From the originally proposed list of topics for discussion we put up the following programme. A detailed description of all addressed discussion points follows further down.
Wednesday, 19 May 2010
|Time||Lead person(s)||Topic||Files||Audio only|| EVO|
|13:00–15:15||Allyson||Setting the detailed agenda & workshop work plan||Starting list; Final subset|
Thursday, 20 May 2010
|Time||Lead person(s)||Topic||Files||Audio only|| EVO|
|09:00–12:15||Everyone||Drafting and local discussion at each site||see table||MP3 1||zip 1|
|13:00–14:15||Allyson||Group discussion||see table||MP3||zip|
|14:45–18:00||Allyson||Group discussion||see table||MP3||zip|
Friday, 21 May 2010
|Time||Lead person(s)||Topic||Files||Audio only|| EVO|
|10:30–13:30||Allyson||Fitting in the changes with the existing specification: extension/replacement/parallel/combination of the above.||see table|| MP3 1|
Pre-meeting Background Information
The following comprised background reading material for the meeting.
- The discussion topics list for this workshop.
- SBML Level 3 Version 1 Specification, especially Sections 5 and 6, which concern various types of annotation.
- Participants needed to be familiar with the W3C RDF Standard.
- RDFa was relevant to the discussion, and information was available in the W3C RDFa Primer.
Participants marked in bold were organisers of the workshop.
|Allyson Lister||CISBAN, University of Newcastle, Newcastle, UK||EBI|
|Neil Swainston||MCISB, University of Manchester, Manchester, UK||EBI|
|Dagmar Waltemath||University of Rostock, Rostock, Germany||Rostock|
|Wolfram Liebermeister||HU Berlin, Germany||Rostock|
|Falko Krause||HU Berlin, Germany||Berlin|
|Marvin Schulz||HU Berlin, Germany||Berlin|
|Christian Knuepfer||Dept. of Math. and Comp. Sci., Friedrich-Schiller-University Jena, Germany||Rostock|
|Ron Henkel||University of Rostock, Rostock, Germany||Rostock|
|Stefan Hoops||Virginia Bioinformatics Institute, Virginia, USA||VBI|
|Frank Bergmann||University of Washington, USA||UW|
|Michael Hucka||Caltech, California, USA||EBI|
|Nicolas Le Novère||EMBL-EBI, Hinxton, UK||EBI|
|Camille Laibe||EMBL-EBI, Hinxton, UK||EBI|
|Morgan Taschuk||CISBAN, University of Newcastle, Newcastle, UK||EBI|
|Goksel Misirli||University of Newcastle, Newcastle, UK||EBI|
|Sarah Keating||EMBL-EBI, Hinxton, UK||EBI|
|Catherine Lloyd||Auckland Bioengineering Institute, Auckland, NZ||EBI|
Attendance was by invitation from the organisers only.
Nested annotations and their meaning
Approaches to nesting from the discussion during the annot package meeting.
There are a number of different approaches to statement negations. At the time of the meeting, the favored method was to negate the existing biology and model qualifiers.
One approach to expressing exclusion of knowledge combines rdf:List containers with negated biology and model qualifiers. By defining a closed set of statements, we can implicitly exclude all other statements.
This solution only allows the typical positive statements to be made, but by including the rdf:List (see above in relation between statements), it becomes possible to exclude everything that is not explicitly defined as a positive statement inside the list.
This, however, solves only part of the problem we had with negations, as we often want to explicitly exclude something because we know that it does not hold true, e.g. "Protein A is_not phosphorylated". Therefore, we propose to extend the list of biomodels.net qualifiers.
A statement in itself is always positive. If we want to express not (A is B), we need to create a predicate isNot and say A isNot B. This approach avoids extending RDF, and specifically RDF/XML, with proprietary elements, which would defeat the use of generic tools to process and analyze the annotations later on.
<species id="A" metaid="meta_A">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#meta_A">
        <bqbiol:isNot>
          <rdf:List>
            <rdf:li rdf:resource="B"/>
          </rdf:List>
        </bqbiol:isNot>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
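As a rough illustration of how a tool might read such negated statements, the following Python sketch collects the resources listed under bqbiol:isNot for each annotated element. It assumes the usual RDF and biomodels.net biology-qualifier namespace URIs; isNot is the qualifier proposed at the meeting, not a published one, and the function name negated_resources is made up for this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# Assumption: bqbiol:isNot is the qualifier proposed at the meeting;
# it is not part of the published biomodels.net qualifier set.
SPECIES = f"""
<species xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}" id="A" metaid="meta_A">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#meta_A">
        <bqbiol:isNot>
          <rdf:List>
            <rdf:li rdf:resource="B"/>
          </rdf:List>
        </bqbiol:isNot>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""

def negated_resources(species_xml):
    """Map each rdf:about target to the resources it is declared NOT to be."""
    root = ET.fromstring(species_xml)
    result = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        about = desc.get(f"{{{RDF}}}about")
        path = f"{{{BQBIOL}}}isNot/{{{RDF}}}List/{{{RDF}}}li"
        for item in desc.findall(path):
            result.setdefault(about, []).append(item.get(f"{{{RDF}}}resource"))
    return result

print(negated_resources(SPECIES))
```

Because the negation is carried by an ordinary predicate, the sketch needs nothing beyond a generic XML parser, which is the point made above about avoiding proprietary extensions.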
- using the Boolean operator NOT, together with the OR and AND operators, as part of the solution for nested annotations (see further down)
The disadvantage of this approach is that we have so far not found a way to integrate the operators that leads to valid RDF statements.
- using the closed set (rdf:List) and the predefined RDF resource (rdf:nil in the example below)
The following example specifies that the protein has an empty, closed set of modifications. Effectively, the protein is unmodified.
<species id="x" metaid="meta_x" name="Protein X">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#meta_x">
        <bqbiol:modification>
          <rdf:List>
            <rdf:li rdf:resource="rdf:nil"/>
          </rdf:List>
        </bqbiol:modification>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
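A reader of such annotations would need to recognise the "empty, closed set" case explicitly. The sketch below is one possible check, assuming the namespace URIs shown and treating both the literal string rdf:nil (as written in the example above) and the expanded RDF nil URI as markers of emptiness. The qualifier bqbiol:modification and the helper name has_empty_closed_set are illustrative only.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"
# Accept the literal form used in the example and the expanded URI.
NIL = {"rdf:nil", RDF + "nil"}

# Assumption: bqbiol:modification is a proposed qualifier, not a published one.
SPECIES = f"""
<species xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}"
         id="x" metaid="meta_x" name="Protein X">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#meta_x">
        <bqbiol:modification>
          <rdf:List>
            <rdf:li rdf:resource="rdf:nil"/>
          </rdf:List>
        </bqbiol:modification>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""

def has_empty_closed_set(species_xml, qualifier):
    """Return True if the qualifier holds a closed list containing only nil."""
    root = ET.fromstring(species_xml)
    path = f".//{{{BQBIOL}}}{qualifier}/{{{RDF}}}List"
    for closed_list in root.findall(path):
        members = [
            item.get(f"{{{RDF}}}resource")
            for item in closed_list.findall(f"{{{RDF}}}li")
        ]
        if members and all(member in NIL for member in members):
            return True
    return False

print(has_empty_closed_set(SPECIES, "modification"))
```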
- using RDF Reification to make a statement, and then provide a second statement that negates the first one. (Christian's suggestion).
<species id="x" metaid="meta_x" name="Protein X">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#meta_x">
        <bqbiol:modification rdf:ID="statement1" rdf:resource="urn:miriam:obo.psi-mod:MOD%3A00042"/>
      </rdf:Description>
      <rdf:Description rdf:about="#statement1">
        <bqbiol:verity rdf:datatype="&xsd;boolean">false</bqbiol:verity>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
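To make the reification idea concrete, the following sketch resolves which statements have been negated by a second description. It is only an approximation of what a consumer might do: bqbiol:verity and bqbiol:modification are the proposed (not published) qualifiers, the resource URN is illustrative, and the datatype is written out as a full XML Schema URI because the &xsd; entity would need a DTD declaration to parse.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# Assumptions: proposed qualifiers, illustrative URN, expanded xsd datatype.
SPECIES = f"""
<species xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}"
         id="x" metaid="meta_x" name="Protein X">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#meta_x">
        <bqbiol:modification rdf:ID="statement1"
            rdf:resource="urn:miriam:obo.psi-mod:MOD%3A00042"/>
      </rdf:Description>
      <rdf:Description rdf:about="#statement1">
        <bqbiol:verity
            rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</bqbiol:verity>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""

def negated_statements(species_xml):
    """Return (subject, predicate, resource) triples whose verity is false."""
    root = ET.fromstring(species_xml)
    descriptions = list(root.iter(f"{{{RDF}}}Description"))

    # First pass: collect the IDs of statements declared false.
    false_ids = set()
    for desc in descriptions:
        verity = desc.find(f"{{{BQBIOL}}}verity")
        if verity is not None and (verity.text or "").strip() == "false":
            false_ids.add(desc.get(f"{{{RDF}}}about", "").lstrip("#"))

    # Second pass: find the statements carrying one of those IDs.
    negated = []
    for desc in descriptions:
        subject = desc.get(f"{{{RDF}}}about")
        for statement in desc:
            if statement.get(f"{{{RDF}}}ID") in false_ids:
                predicate = statement.tag.split("}")[1]
                resource = statement.get(f"{{{RDF}}}resource")
                negated.append((subject, predicate, resource))
    return negated

print(negated_statements(SPECIES))
```

Note that a tool which ignores the second description would read the first statement as positive, which is the risk raised in topic 4 above about annotations being ignored.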
Post Meeting discussion points
Annotating non-existence of entities
The annotation package provides a means to express that a particular entity (be it a species, a reaction, a parameter, a compartment etc) is consciously not encoded in a model because we know that it must not be part of the model.
This knowledge differs from the unintentional absence of an entity, for example because it was simply not considered in the context of a model.
To express the absence of a particular entity we introduce the listOfAbsentEntities element. It contains a number of entity definitions (annot:omittedEntity elements). Each of them is assigned a metaid. Through that metaid, each omitted entity can carry a number of annotations, following the rules of the annotation package.
The absent entities are encoded as follows:
<annotation>
  <annot:listOfAbsentEntities>
    <annot:omittedEntity metaid="metaidX"/>
    <annot:omittedEntity metaid="metaidY"/>
  </annot:listOfAbsentEntities>
</annotation>
The name (metaid) of an omitted entity alone carries no reliable semantics about that entity. The <rdf:RDF> description block therefore annotates the omitted entities; only from those annotations can we infer which biological entity must not occur inside the model:
<annotation>
  <rdf:RDF>
    <rdf:Description rdf:about="#metaidX">
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A11111111"/>
        </rdf:Bag>
      </bqbiol:is>
    </rdf:Description>
    <rdf:Description rdf:about="#metaidY">
      [..]
    </rdf:Description>
  </rdf:RDF>
</annotation>
An example for the definition of two absent species is given in the following code snippet:
<listOfSpecies>
  <annotation>
    <annot:listOfAbsentEntities>
      <annot:omittedEntity metaid="foo"/>
      <annot:omittedEntity metaid="bar"/>
    </annot:listOfAbsentEntities>
    <rdf:RDF>
      <rdf:Description rdf:about="#foo">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A11111111"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
      <rdf:Description rdf:about="#bar">
        [..]
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</listOfSpecies>
In the above example, two omitted entities, foo and bar, are defined inside the listOfAbsentEntities. As the list is nested inside the SBML listOfSpecies, we can infer that foo and bar both represent absent species. The RDF description links the absent species foo to the MIRIAM URI urn:miriam:obo.go:GO%3A11111111 using the biology qualifier is. The meaning of that annotation is: the model must not contain a species that is urn:miriam:obo.go:GO%3A11111111.
Meeting discussions resulted in a draft proposal for Version 0 of the annot package.
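The lookup described above, from omitted entity to the resources it stands for, could be sketched in Python as follows. This is an illustration under assumptions rather than part of the proposal: the annot namespace URI is a placeholder (the source does not give one), the function name absent_entities is hypothetical, and the description for bar is left empty because the original example elides it.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"
ANNOT = "http://example.org/sbml/annot"  # placeholder, not an official URI

LIST_OF_SPECIES = f"""
<listOfSpecies xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}" xmlns:annot="{ANNOT}">
  <annotation>
    <annot:listOfAbsentEntities>
      <annot:omittedEntity metaid="foo"/>
      <annot:omittedEntity metaid="bar"/>
    </annot:listOfAbsentEntities>
    <rdf:RDF>
      <rdf:Description rdf:about="#foo">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A11111111"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
      <rdf:Description rdf:about="#bar"/>
    </rdf:RDF>
  </annotation>
</listOfSpecies>
"""

def absent_entities(list_xml):
    """Map each omitted entity's metaid to the resources annotating it."""
    root = ET.fromstring(list_xml)
    omitted = [
        entity.get("metaid")
        for entity in root.iter(f"{{{ANNOT}}}omittedEntity")
    ]
    resources = {metaid: [] for metaid in omitted}
    for desc in root.iter(f"{{{RDF}}}Description"):
        # Accept both "#foo" and "foo" as references to a metaid.
        target = desc.get(f"{{{RDF}}}about", "").lstrip("#")
        if target in resources:
            for item in desc.iter(f"{{{RDF}}}li"):
                resources[target].append(item.get(f"{{{RDF}}}resource"))
    return resources

print(absent_entities(LIST_OF_SPECIES))
```

An omitted entity that maps to an empty list, like bar here, has a name but no usable semantics, which matches the earlier remark that the metaid alone says nothing reliable about the entity.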