Re: SBML L2v2 specification vote #4: References to controlled vocabularies
16 Dec '05 11:43
The advantages of the sboTerm (as I understand them, but someone
please correct me if I'm wrong) are:
1) A compiled program can read in a unique identifier classifying a
reaction / species / kinetic parameter and match it to a list of
hard-coded reaction rate laws.
Without the unique identifier, the compiled program must parse the
MathML expression in the kineticLaw and then interpret the parsed
expression every time the rate is evaluated, which is relatively
expensive.
2) Two MathML expressions may differ as strings yet be algebraically
identical. How do you (a) simplify the algebraic expression to the
least expensive form for evaluation, and (b) test whether two
MathML expressions are algebraically identical? (a) is important if
you are parsing the MathML expression and evaluating it. (b) is
important if you want to compare two models, analyzing the differences
in reaction rate laws/etc. It is also easier to label a species as
'Inhibitor' and the reaction 'Michaelis Menten Inhibition' than to parse
the MathML reaction rate law expression and identify which variable in
the MathML expression is the inhibiting species.
3) It is much easier to make a mistake in the reaction rate law than in
the sboTerm, in my opinion. There are infinitely many ways to write
down the standard Michaelis-Menten expression, but there will be only
one unique identifier for that expression.
4) Not everyone programs in an interpreted or object-oriented language.
Most supercomputers do not run Java, and new architectures usually do
not get Python/Perl/etc. interpreters ported in a timely fashion. How
often do new & useful architectures
arrive? Consider the Sony/Toshiba/IBM Cell processor and its potential
use in scientific computing.
5) Also, new (and old) simulation techniques for studying biological
systems do require a lot of computing time. Any additional costs for
evaluating algebraic expressions are big obstacles. The sboTerm removes
that obstacle. Consider most Molecular Dynamics software packages. Even
a square root in a calculation is usually eliminated, either by
mathematical manipulation of the expression or by an approximation.
Special routines are written solely for evaluating inverse square root
expressions. These optimizations yield 10-20% savings in run time,
which amounts to days or weeks of computing time on modern
supercomputers.
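To make points 1 and 5 concrete, here is a minimal sketch of looking up
a hard-coded rate law by identifier. It is written in Python only for
brevity (the argument is really about compiled codes), and everything in
it is an assumption made for illustration: the identifiers SBO:9999001
and SBO:9999002 are placeholders and not real SBO entries, and the
function and parameter names are made up.

```python
# Hypothetical sketch: dispatch on an sboTerm to a hard-coded rate law.
# The identifiers below are placeholders, NOT real SBO entries.

def michaelis_menten(s, vmax, km):
    """Hard-coded Michaelis-Menten rate: vmax * s / (km + s)."""
    return vmax * s / (km + s)

def mass_action(s, k):
    """Hard-coded first-order mass action rate: k * s."""
    return k * s

# Lookup table from identifier to precompiled routine.
RATE_LAWS = {
    "SBO:9999001": michaelis_menten,
    "SBO:9999002": mass_action,
}

def rate_for(sbo_term, *args):
    """Evaluate the hard-coded law for sbo_term.

    Returns None when the term is unknown; a real tool would then have
    to fall back to parsing and interpreting the MathML expression.
    """
    law = RATE_LAWS.get(sbo_term)
    if law is None:
        return None
    return law(*args)
```

With a table like this, the per-step cost is one lookup plus a direct
function call, and the MathML only has to be interpreted for rate laws
the program does not recognise.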
So there are some good reasons why using the sboTerm is advantageous.
But, like you said, adding new rate laws so the whole community may use
them does require communication with the SBO database. Nicolas can
probably say more about how adding rate laws and other sboTerms to the
SBO database will be handled.
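On point 2(b) above: without an identifier, one cheap way to compare two
rate laws is to evaluate both at random points. The sketch below is only
an assumption about how a tool might do this, not something taken from
the SBML spec. It is a probabilistic numeric check rather than a
symbolic proof of equivalence, it samples only positive values to avoid
division by zero, and the names probably_equal, form_a, form_b and other
are invented for the example.

```python
import math
import random

def probably_equal(f, g, n_vars, trials=200, seed=0):
    """Numerically compare f and g at random positive points.

    Returns False as soon as the two values differ beyond a relative
    tolerance; returns True if all trials agree. Agreement suggests,
    but does not prove, that the expressions are algebraically equal.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        point = [rng.uniform(0.1, 10.0) for _ in range(n_vars)]
        if not math.isclose(f(*point), g(*point), rel_tol=1e-9):
            return False
    return True

# Two string-wise different ways of writing Michaelis-Menten.
form_a = lambda s, vmax, km: vmax * s / (km + s)
form_b = lambda s, vmax, km: vmax / (1.0 + km / s)

# A genuinely different law, for contrast.
other = lambda s, vmax, km: vmax * s / km

same = probably_equal(form_a, form_b, 3)
different = probably_equal(form_a, other, 3)
```

Here `same` comes out True and `different` comes out False. Note that
even when the check passes, it says nothing about which variable plays
the role of substrate or inhibitor, which is exactly what the sboTerm
labels directly.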
> My own view is that the formula should be the correct one and that
> programs should not rely on the sboTerm (which is not backwards
> compatible). But, if a program does check that they are equivalent, it
> should give a warning that they don't match. The specification does
> not have to state which is correct because everyone who uses SBML will
> make sure they have got the equation type right, won't they?
> I can't find it in the spec at the moment, but I think the sboTerm may
> be optional.
> 1) Otherwise how can you create a new type of rate law for a
> particular model before changing the SBO to add the law?
> 2) For backward compatibility.
> The advantages of having the sboTerm in the kinetic law have been
> given as speed of compilation, but I can't see why one can't compile a
> model once and then merely alter parameters in (a copy of) the
> compiled model before running, rather than automatically recompile
> thousands of times. Knowing that I could do it that way in an
> object-oriented language, I failed to follow the argument for doing
> the longer process in a loop as being more efficient.
> Hugh Spence
> GSK Scientific Computing and Mathematical Modelling
> Medicines Research Centre
> Gunnels Wood Road
> SG1 2NY