Discussion about the future of SBRML and SED-ML on the morning of 2008-08-28
Sven Sahle: Maybe we can take a couple of minutes to discuss whether we should continue these two projects separately, or whether we should make an effort to unify the overlapping parts, if we can identify them?
Pedro Mendes: We are certainly willing. That's why we labeled this as a draft. Things like URIs vs URLs are easy changes. I think there are two approaches. We can either combine the two and have one unified thing. That's one possibility. We can also just make sure they are compatible, make sure that one can refer to the other.
Nicolas Le Novère: As I mentioned already, I think this should be different. I think what you proposed [SBRML] potentially has vastly more outreach than just the simulation community. We really need a format to encode numerical results in systems biology, including experimental data. If we start there, that will involve a larger community, with different constraints. I would rather see a situation where SBRML calls SED-ML, and make SED-ML be compatible with SBRML. Then SBRML can tackle much more efficiently the results part. People should develop this in parallel.
Michael Hucka: Sorry, I think I missed the conclusion of that.
Pedro Mendes: The conclusion is that Nicolas is arguing we should keep them separate. SBRML can be broader in context. He doesn't want KiSAO to describe all the experiments, for example.
Nicolas Le Novère: Yes.
Nicolas Le Novère: So how do you encode the results of the experiment? Say the parametrization of a model based on the results. How do you encode the results?
Pedro Mendes: We don't do that. All that has to be in the model. We start with the SBML model...
Nicolas Le Novère: No, no, I'm talking about comparing a simulation result with experimental data. That's certainly outside. Like you can do in COPASI, where you can import external measurement data.
Pedro Mendes: If we'd have this, we'd import the data file here. The problem is saying what each part of the model is in relationship to the measurement data. That's the same problem [addressed in SBRML]. That's why we ended up doing it this way.
Henning Schmidt: It's really important to have the experimental descriptions. It's really easy to say [for the case of SBML] "just change the parameter by a parameter rule". If instead you describe it properly, you can send it along with the model into another modeling tool to do for example parameter estimation. That's what they need. They want to produce models.
Nicolas Le Novère: So really we're all saying the same thing. What we also need is for people who publish models to publish the descriptions in a standard format.
Pedro Mendes: There are already too many markup languages for that.
Nicolas Le Novère: We want to talk to them. By the way, I'm not aware of any markup languages for describing the time courses taken by Western blots, for example.
Pedro Mendes: No, but there are markup languages to describe microarray data. For time-courses involving microarray data, we can just use that stuff and put them in here [SBRML].
Darren Wilkinson: There is a completely different way to approach this, which is actually start with the protocols for encoding experimental data, like FuGE. You could start with the FuGE data model. There's no reason why one of the experiments encoded by the FuGE data model couldn't be used here. That would in some ways be more natural.
Pedro Mendes: I agree to some extent. By the way, Norman Paton is one of the authors of FuGE as well, so we're quite aware of it. Some of the structures here were inspired by FuGE. But you're talking about the description of the operations. We kind of skimmed over that in the presentation. We basically say we don't want to be describing the operation ourselves; we want to point to the description. We could have FuGE as one of the ontologies used in SBRML.
Michael Hucka: So is SED-ML going to be a separate effort?
(Several people): Yes.
Sven Sahle: If I understand it correctly, we are dealing with something a little different from storing the experimental data. What we store here is the connection between the model and the data. It would be raw numbers or computed numbers and how they connect to a model.
Pedro Mendes: Yes. How they connect to the biology is then the domain of FuGE and these other things.
Sven Sahle: That means there will be software to do that step, to read the file that describes the experimental techniques used or whatever, and convert that (maybe by querying a database) into a file that contains numbers linked to model entities. This can then be used for parameter fitting and so on.
Pedro Mendes: I'm curious about what other people using experimental data think about this. Chris Myers, in your system you have this thing, this task.
Chris Myers: I agree with you it would be really hard to go to this community and import what they've done. I was trying to think about the proteomics community and how to import their data, gel data. Trying to find links out from that I think is what would be problematic. I think also if you try to develop those terms yourself, you have issues. For example in microarray data, you're actually measuring the primary transcript, not a protein, which is what you have in your model. So your ontology has to be informed enough to distinguish what's from the experiment and what's inferred from the experiment.
Pedro Mendes: Or you have to use the right model. You might have a model that has RNA as species. But I agree with you; you have to be careful. The modeler has to know what they're doing.
Chris Myers: It's a whole field of study.
Pedro Mendes: So, your interpretation is what we have in mind. SBRML is really linking the numerical results with the context of the model.
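[Editorial illustration, not part of the discussion: the idea of "numbers linked to model entities" that can then feed parameter fitting can be sketched as a small data structure. This is only a sketch under stated assumptions. It is not the SBRML schema, and every name in it (ResultSet, LinkedSeries, model_ref, the entity id "mRNA_X", the file name "model.xml") is hypothetical and chosen for illustration.]

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LinkedSeries:
    """One column of numbers tied to a model entity (hypothetical structure)."""
    model_entity_id: str   # assumed: an identifier of a species or parameter in the model
    source: str            # assumed labels: "measured" or "simulated"
    values: List[float]


@dataclass
class ResultSet:
    """Raw or computed numbers plus their link to a model."""
    model_ref: str         # hypothetical reference to the model file
    time: List[float]
    series: List[LinkedSeries] = field(default_factory=list)

    def by_entity(self) -> Dict[str, Dict[str, List[float]]]:
        """Group the series by model entity, then by source label."""
        grouped: Dict[str, Dict[str, List[float]]] = {}
        for s in self.series:
            grouped.setdefault(s.model_entity_id, {})[s.source] = s.values
        return grouped


def sum_squared_residuals(rs: ResultSet, entity_id: str) -> float:
    """Compare measured and simulated values for one model entity."""
    pair = rs.by_entity()[entity_id]
    return sum((m - s) ** 2 for m, s in zip(pair["measured"], pair["simulated"]))


if __name__ == "__main__":
    rs = ResultSet(model_ref="model.xml", time=[0.0, 1.0, 2.0])
    rs.series.append(LinkedSeries("mRNA_X", "measured", [1.0, 2.0, 3.0]))
    rs.series.append(LinkedSeries("mRNA_X", "simulated", [1.0, 2.5, 2.0]))
    print(sum_squared_residuals(rs, "mRNA_X"))
```

[Because each series carries the id of the model entity it refers to, a fitting tool can pair measured and simulated values without knowing how the measurement was made. The description of the experiment itself would live elsewhere, as the speakers suggest for FuGE.]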