This is exactly the answer I feared was coming!
This means that there are going to be programs that rely on the sboTerm for simulation instead of reading the formula.
I can understand that it is a lot less work to parse a string than it is to parse a formula. So in the scenario where the sboTerm is not consistent with the formula, a program that parses the formula will produce different simulation results than a program that only reads the sboTerm, and I think this is bad for the users.
I think we should either state that the formula, if present, is to be read, or we should make the formula and the sboTerm mutually exclusive for the kineticLaw, which is not a nice solution.
I think it should only be allowed to use the sboTerm to simulate the model if there is no formula present. (Can there be a kineticLaw without a formula? Since the math field is not optional, can it be empty?)
I think that it is very important to state which is the correct behavior in case of potentially contradicting information, because only one behavior can be correct here. And since it is not trivial to make a consistency check in this case, I think we should state that it is wrong to use the sboTerm to simulate the model if there is an equation present.
So, that said, if there is no way to have a kineticLaw element without a formula, the sboTerm is useless, at least for deterministic simulation programs?
OK, I guess after this sentence Nicolas is going to come up with the Michaelis-Menten vs. Briggs-Haldane example where you need the sboTerm to distinguish those two, but are there any other examples where this would be needed?
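To make the proposed rule concrete, here is a minimal sketch of the behavior argued for above: the formula wins whenever it is present, the sboTerm is only a fallback, and a mismatch produces a warning. The function name `choose_rate_law`, the representation of rate laws as Python callables, and the single probe point are all assumptions for illustration; nothing here comes from the SBML specification, and agreement at one probe point does not prove that two laws are equivalent.

```python
import math
import warnings
from typing import Callable, Dict, Optional

# Hypothetical representation: a rate law maps variable values to a rate.
RateLaw = Callable[[Dict[str, float]], float]


def choose_rate_law(
    formula: Optional[RateLaw],
    sbo_law: Optional[RateLaw],
    probe: Dict[str, float],
    rel_tol: float = 1e-9,
) -> RateLaw:
    """Return the law to simulate with: the formula if present, else the sboTerm law."""
    if formula is None:
        if sbo_law is None:
            raise ValueError("kineticLaw has neither a formula nor a known sboTerm")
        return sbo_law
    if sbo_law is not None:
        # Cheap spot check at one point; it can miss real differences.
        if not math.isclose(formula(probe), sbo_law(probe), rel_tol=rel_tol):
            warnings.warn("sboTerm does not match the formula; using the formula")
    return formula
```

A tool following this rule never simulates from the sboTerm when an equation is present, so two programs reading the same file cannot disagree on the rate law because of it.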
From: email@example.com on behalf of Howard Salis
Sent: Fri 12/16/2005 8:43 PM
To: SBML Discussion List
Subject: Re: [sbml-discuss] SBML L2v2 specification vote #4: References to controlled vocabularies
The advantages of the sboTerm (as I understood it, but someone
please correct me if I'm wrong) are:
1) A compiled program can read in a unique identifier classifying a
reaction / species / kinetic parameter and match it to a list of
hard-coded reaction rate laws.
Without the unique identifier, the compiled program must parse the
MathML expression in the kineticLaw and repeatedly evaluate the
expression using a relatively expensive operation.
2) Two MathML expressions may be (string-wise) different, but are in
fact algebraically identical. How do you (a) simplify the algebraic
expression to the least expensive form for evaluation, and (b) test whether
two MathML expressions are algebraically identical? (a) is important if
you are parsing the MathML expression and evaluating it. (b) is
important if you want to compare two models, analyzing the differences
in reaction rate laws/etc. It is also easier to label a species as
'Inhibitor' and the reaction 'Michaelis Menten Inhibition' than to parse
the MathML reaction rate law expression and identify which variable in
the MathML expression is the inhibiting species.
3) It is much easier to make a mistake in the reaction rate law than in
the sboTerm, in my opinion. There are many (in fact, infinitely many) ways to
write down the standard Michaelis Menten expression, but there will only
be one unique identifier for that expression.
4) Not everyone programs in an interpreted or object-oriented language.
You have to understand that most supercomputers do not use Java and new
architectures usually do not have the Python/Perl/etc interpreters
ported in a timely fashion. How often do new & useful architectures
arrive? Consider the Sony/Toshiba/IBM Cell processor and its potential
use in scientific computing.
5) Also, new (and old) simulation techniques for studying biological
systems do require a lot of computing time. Any additional costs for
evaluating algebraic expressions are big obstacles. The sboTerm removes
that obstacle. Consider most Molecular Dynamics software packages. Even
the presence of a square root in a calculation is usually eliminated via
mathematical manipulation of the expression or an approximation. Special
routines are written solely for evaluating inverse square root
expressions. These optimizations are responsible for 10-20% savings in
time which saves days/weeks of computing time on modern supercomputers.
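Point 1 above, matching a unique identifier to a list of hard-coded rate laws, might look roughly like the sketch below. The identifiers `SBO:PLACEHOLDER_MM` and `SBO:PLACEHOLDER_MA` are made up for illustration and are not real SBO terms, and the function and parameter names are likewise assumptions. It is written in Python only for brevity; the same lookup is a switch statement or a function-pointer table in a compiled language.

```python
from typing import Callable, Dict

Params = Dict[str, float]


def michaelis_menten(p: Params) -> float:
    # Standard irreversible Michaelis-Menten form: Vmax * S / (Km + S).
    return p["Vmax"] * p["S"] / (p["Km"] + p["S"])


def mass_action(p: Params) -> float:
    # First-order mass action: k * S.
    return p["k"] * p["S"]


# Hypothetical identifiers; a real tool would use terms from the SBO database.
HARD_CODED_LAWS: Dict[str, Callable[[Params], float]] = {
    "SBO:PLACEHOLDER_MM": michaelis_menten,
    "SBO:PLACEHOLDER_MA": mass_action,
}


def rate_from_sbo_term(sbo_term: str, params: Params) -> float:
    """Evaluate the hard-coded law registered for sbo_term."""
    try:
        law = HARD_CODED_LAWS[sbo_term]
    except KeyError:
        # Unknown term: the caller has to fall back to parsing the formula.
        raise LookupError(f"no hard-coded rate law for {sbo_term}") from None
    return law(params)
```

The lookup happens once per reaction, after which the simulator calls the selected function directly with no expression parsing in the inner loop.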
So there are some good reasons why using the sboTerm is advantageous.
But, like you said, adding new rate laws so the whole community may use
them does require communication with the SBO database. Nicolas can
probably say more about how adding rate laws and other sboTerms to the
SBO database will be handled.
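As a rough illustration of point 2(b), testing whether two differently written rate laws agree, one inexpensive approach is to compare them numerically at random points instead of simplifying them symbolically. This is only a sketch under stated assumptions: the helper name `probably_equivalent`, the sampling range, and the tolerance are all invented here, and agreement on samples is evidence of equivalence, not a proof.

```python
import math
import random
from typing import Callable, Dict, Sequence

Expr = Callable[[Dict[str, float]], float]


def probably_equivalent(
    f: Expr,
    g: Expr,
    names: Sequence[str],
    trials: int = 200,
    seed: int = 0,
    rel_tol: float = 1e-9,
) -> bool:
    """Compare f and g at random positive points; False on the first mismatch."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(trials):
        point = {name: rng.uniform(0.1, 10.0) for name in names}
        if not math.isclose(f(point), g(point), rel_tol=rel_tol):
            return False
    return True


# Two string-wise different ways of writing the same Michaelis-Menten law.
form_a: Expr = lambda p: p["Vmax"] * p["S"] / (p["Km"] + p["S"])
form_b: Expr = lambda p: p["Vmax"] / (1.0 + p["Km"] / p["S"])

print(probably_equivalent(form_a, form_b, ["Vmax", "Km", "S"]))
```

A shared identifier avoids this check entirely, which is the advantage being claimed for the sboTerm in model comparison.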
> My own view is that the formula should be the correct one and that
> programs should not rely on the sboTerm (which is not backwards
> compatible). But, if a program does check that they are equivalent, it
> should give a warning that they don't match. The specification does
> not have to state which is correct because everyone who uses SBML will
> make sure they have got the equation type right, won't they?
> I can't find it in the spec at the moment, but I think the sboTerm may
> be optional.
> 1) Otherwise how can you create a new type of rate law for a
> particular model before changing the SBO to add the law?
> 2) For backward compatibility.
> The advantages of having the sboTerm in the kinetic law have been
> given as speed of compilation, but I can't see why one can't compile a
> model once and then merely alter parameters in (a copy of) the
> compiled model before running, rather than automatically recompile
> thousands of times. Knowing that I could do it that way in an
> object-oriented language, I failed to follow the argument for doing
> the longer process in a loop as being more efficient.
> Hugh Spence
> GSK Scientific Computing and Mathematical Modelling
> Medicines Research Centre
> Gunnels Wood Road
> SG1 2NY
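The compile-once approach described in the quoted message can be sketched as follows: the rate expression is turned into a callable a single time, and only the parameter values change between runs. The helper `compile_rate_law` is hypothetical, and it assumes the formula is available as a plain infix string in Python syntax, not as MathML. Clearing `__builtins__` is not a security sandbox, so this is for illustration with trusted input only.

```python
from typing import Callable


def compile_rate_law(formula: str) -> Callable[..., float]:
    """Parse the formula once and return a function of its variables."""
    code = compile(formula, "<kineticLaw>", "eval")  # parsing cost paid once

    def evaluate(**values: float) -> float:
        # Only the supplied variable values are visible to the expression.
        return eval(code, {"__builtins__": {}}, values)

    return evaluate


rate = compile_rate_law("Vmax * S / (Km + S)")

# Re-run with different parameter values without re-parsing the formula.
results = [rate(Vmax=vmax, Km=0.5, S=1.0) for vmax in (1.0, 2.0, 3.0)]
```

This addresses repeated parsing, although each call still evaluates a generic expression and not a hand-optimized routine, which is the cost the earlier message is concerned with.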