Michael Hucka wrote:
> salis> It would be much easier to evaluate the rate laws
> salis> of each reaction if they all had standard forms. A
> salis> MathML expression is not unique. The same algebraic
> salis> expression can be manipulated innumerable
> salis> ways. Given two models with the same reactions, but
> salis> different rate laws, it is still possible that
> salis> those models are exactly equivalent. If each rate
> salis> law had a standard form, not only would the
> salis> expression and meaning of the rate law be clear and
> salis> concise, but it would also make it unique. The
> salis> unique identifier (it doesn't have to be an
> salis> integer, it could be a string) could be stored
> salis> alongside the MathML expression so long as the
> salis> program which created the SBML file can match the
> salis> expression with the standardized form of the rate
> salis> law.
>
>You are proposing that every rate law that can possibly be
>placed in a model (SBML or otherwise) be assigned a unique
>identifier, correct?
>
>
>
Yes.
>1. How will you construct the list of all possible rate laws?
>
>
>
It'd be a list of rate law forms used by the community.
In the database:
Each rate law form has a
1) unique identifier (an integer, a string, etc),
2) an algebraic expression as a function of j kinetic parameters and n
state variables so that r = f(k1,...,kj, X1, ... , Xn),
3) a metadata description of what each kinetic parameter means and what
are its units,
4) and a metadata description of what each state variable is and what
are its units.
5) (Optional) Restrictions on the values of kinetic parameters, such as
positive, real, or integer only.
The ordering of the kinetic parameters and state variables must be
fixed. The kinetic parameters and state variables in the rate law form
must have a one-to-one mapping to the value of a kinetic parameter or
state variable in the SBML model. So for the standard Michaelis Menten,
the rate law in the database would be something like
Identifier: 'xxx'
r = k1 * X1 * X2 / (k2 + X2),
k1 --> 'kcat',
k2 --> 'Km',
X1 --> Enzyme,
X2 --> Substrate.
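A minimal Python sketch of such a database record, for illustration only. The class name RateLawForm and its field names are assumptions, not part of SBML or of any existing database, and the restrictions shown are examples. X2 is taken to be the species that appears in the denominator, i.e. the substrate in r = kcat * E * S / (Km + S).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence


@dataclass
class RateLawForm:
    """One database entry (hypothetical layout, for illustration only)."""
    identifier: str                      # 1) unique identifier
    expression: Callable[[Sequence[float], Sequence[float]], float]  # 2) r = f(k, X)
    parameters: Dict[str, str]           # 3) what each kinetic parameter means
    variables: Dict[str, str]            # 4) what each state variable is
    restrictions: Dict[str, str] = field(default_factory=dict)  # 5) optional


# k and x are 0-based here, so k[0] is k1 and x[1] is X2.
michaelis_menten = RateLawForm(
    identifier="xxx",
    expression=lambda k, x: k[0] * x[0] * x[1] / (k[1] + x[1]),
    parameters={"k1": "kcat", "k2": "Km"},
    variables={"X1": "enzyme", "X2": "substrate (appears in the denominator)"},
    restrictions={"k1": "positive", "k2": "positive"},  # example restrictions
)

# Evaluate with kcat = 2, Km = 3, X1 = 1, X2 = 3.
r = michaelis_menten.expression([2.0, 3.0], [1.0, 3.0])
```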
In the SBML model, the kinetic parameters of each reaction should either
be stored in a specific order and mapped to k1, k2, or the string
identifiers of the kinetic parameters can be matched to the metadata to
discern the ordering. Also in the SBML model, the identities of the state
variables X1, X2, ..., Xn are stored in a specific order, so that if the
integer identifiers of the species are N = {3, 6}, then Species #3 --> X1
and Species #6 --> X2.
In the simulation program, the rate law can be evaluated using the rate
law identifier, the list N, and the values of the kinetic parameters
(labeled K).
If the concentration of the chemical species (the state variables) is
stored in an array C, then the rate of the Michaelis Menten reaction is:
r = K(1) * C(N(1)) * C(N(2)) / ( K(2) + C(N(2)) )
The program can be compiled to explicitly know that whenever a reaction
follows a Michaelis Menten rate, its reaction rate can be computed using
the above expression.
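The same evaluation can be sketched in Python. This is only an illustration of the lookup-by-identifier idea, not the code referred to in this message: the names RATE_LAWS and reaction_rate are hypothetical, indexing is 0-based (so K[0] is K(1)), and the species identifiers in N are assumed to be usable directly as indices into C.

```python
def michaelis_menten(K, N, C):
    # r = K(1) * C(N(1)) * C(N(2)) / (K(2) + C(N(2))), written with 0-based indices
    return K[0] * C[N[0]] * C[N[1]] / (K[1] + C[N[1]])


# Table of compiled-in rate law forms, keyed by identifier (hypothetical).
RATE_LAWS = {"xxx": michaelis_menten}


def reaction_rate(identifier, K, N, C):
    """Evaluate a reaction rate from its rate law identifier, K, N and C."""
    return RATE_LAWS[identifier](K, N, C)


# Example values, chosen only for illustration.
C = [0.0] * 7          # concentrations, indexed by species identifier
C[3] = 0.5             # species #3 --> X1
C[6] = 2.0             # species #6 --> X2
K = [10.0, 2.0]        # values of k1 and k2
N = [3, 6]

rate = reaction_rate("xxx", K, N, C)
```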
Each reaction in the SBML model also has stoichiometric coefficients
that describe what happens to each state variable when a reaction occurs.
The list of species directly affected by the reaction _does not need to
be_ the same as the list of species whose values are substituted in the
rate law. The two lists are separate.
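A small sketch of keeping the two lists separate, assuming an enzyme-catalyzed conversion of a substrate into a product. The Reaction class, the fire function, and the species numbering (0 = enzyme, 1 = substrate, 2 = product) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reaction:
    rate_law_id: str
    K: List[float]                 # kinetic parameter values
    N: List[int]                   # species substituted into the rate law
    stoichiometry: Dict[int, int]  # species changed when the reaction occurs


def fire(reaction, counts):
    """Apply the stoichiometric changes of one reaction event in place."""
    for species, change in reaction.stoichiometry.items():
        counts[species] += change


conversion = Reaction(
    rate_law_id="xxx",
    K=[10.0, 2.0],
    N=[0, 1],                      # the rate depends on enzyme and substrate
    stoichiometry={1: -1, 2: +1},  # but only substrate and product change
)

counts = [5, 100, 0]
fire(conversion, counts)
```

Here the enzyme appears in N but not in the stoichiometry, and the product appears in the stoichiometry but not in N.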
This is the easy question. I currently use this mechanism in my own code
and it works both fast and well.
>2. Who will have authority to decide the correctness of a
> particular rate law statement? (Errors are believed to
> exist in long-established software packages as well as
> printed publications, but who arbitrates?)
>
>
>
Well, the authors of SBML should have authority, since the list will be
used primarily with SBML.
As for errors: As long as the rate law form produces real numbers and is
not a repeat of an already existing form then it should be tentatively
accepted and given an identifier.
Internet-based peer review (comments, forum, Wiki-style, etc) can be
used to notify you guys if there's an apparent error. It's better to
have the expression open and viewable by everyone instead of buried in
an SBML file. It also saves time and effort by grouping reactions with
rate laws with equivalent forms into a single unique identifier. If the
rate law form is deemed correct, then it also applies to all reactions
using that rate law form in all SBML files. Otherwise, a buried typo in
the MathML expression for the 345th Michaelis Menten reaction in an SBML
file may be missed.
>3. What happens when an existing definition of a rate law in
> this master list is discovered to have errors? In
> particular, what is the implication for all the software
> tools that conformed to a specification of the list at
> one time, but no longer do because definitions have been
> updated?
>
>
Here is where the answers get trickier, but here are some possible
solutions:
Each correction of a rate law form with a known error will be added to
the database with a different identifier. That might lead to supreme
bloat if numerous corrections are made. Programmers would need to add
the new identifiers and new rate law forms (and recompile) in order to
include the corrections.
Each rate law form has a version number. Corrections have the same
identifier, but with an increased version number. Programmers would need
to include the corrections in the rate law forms in their existing code
and recompile. They would also have to report which version of the rate
law form they are using so the user knows exactly what the algebraic
expression is.
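A minimal sketch of the version-number option, assuming forms are keyed by (identifier, version). The erroneous version 1 entry below is invented purely for illustration, and the names FORMS, lookup, and versions_in_use are hypothetical.

```python
# Hypothetical table: version 1 has a made-up typo in the denominator,
# version 2 is the corrected form under the same identifier.
FORMS = {
    ("xxx", 1): lambda k, x: k[0] * x[0] * x[1] / (k[1] + x[0]),
    ("xxx", 2): lambda k, x: k[0] * x[0] * x[1] / (k[1] + x[1]),
}


def lookup(identifier, version):
    """Return the compiled-in form, or fail loudly if this version is unknown."""
    try:
        return FORMS[(identifier, version)]
    except KeyError:
        raise KeyError(
            f"rate law {identifier!r} version {version} is not compiled in"
        ) from None


def versions_in_use(reactions):
    """Report which (identifier, version) pairs a model relies on."""
    return sorted({(r["rate_law"], r["version"]) for r in reactions})


model = [{"rate_law": "xxx", "version": 2}, {"rate_law": "xxx", "version": 2}]
report = versions_in_use(model)
corrected = lookup("xxx", 2)([2.0, 3.0], [1.0, 3.0])
```

Reporting the pairs in use is what lets the user know exactly which algebraic expression was evaluated.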
However, compare this to the current way of doing it for compiled
languages like C or Fortran: a MathML expression is converted to source
code, the program is recompiled, and the simulation is run. Every time
the MathML expression changes (from model to model) the program is
recompiled.
Instead, the database allows C/Fortran programmers to recompile only
when new rate law forms are added or modified.
>4. What happens when you discover a rate law is missing from
> the master list?
>
>
If the program solely uses the identifiers, then it can only accept SBML
models with reactions all having identifiers.
If a user is using a program that lacks a desired rate law, then they
add the rate law to the database, and bug the programmer to add it to
the software.
A possible bottleneck in the process, but as the list grows, there will
be fewer missing rates. ;)
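A sketch of the check a program could run before accepting a model, assuming it only knows a fixed set of identifiers. The identifier "hill" and the function name are hypothetical.

```python
def missing_rate_laws(model_identifiers, supported):
    """Return the identifiers used by the model that the program lacks."""
    return sorted(set(model_identifiers) - set(supported))


SUPPORTED = {"xxx"}                  # identifiers compiled into the program
model_ids = ["xxx", "hill", "xxx"]   # one identifier per reaction

missing = missing_rate_laws(model_ids, SUPPORTED)
if missing:
    print("cannot simulate; unsupported rate laws:", ", ".join(missing))
```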
>5. Obviously it is not enough to simply provide a list of
> all the rate law names; the definitions of the rate law
> formulas must be available too. How will software tools
> obtain these definitions in a machine-readable form?
>
>
>
Well, there are two ways this can be done:
The programmer reads a document containing the list of rate law forms
and their algebraic expressions. He then implements the list in source code.
Or, the algebraic expressions can be encoded in MathML expressions and
the source code for evaluating each rate law form for <X> language is
automatically generated with a tool.
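A rough sketch of the second route, assuming the MathML has already been translated to an infix string per rate law form (that translation step is not shown). The template, the function naming scheme rate_<identifier>, and the choice of C as the target language are assumptions.

```python
# One C function per rate law form; identifiers must be valid C names here.
TEMPLATE = """double rate_{name}(const double *K, const int *N, const double *C) {{
    return {expr};
}}
"""


def emit_c(forms):
    """Generate C source text for a table of identifier -> infix expression."""
    return "\n".join(
        TEMPLATE.format(name=name, expr=expr)
        for name, expr in sorted(forms.items())
    )


FORMS = {"xxx": "K[0] * C[N[0]] * C[N[1]] / (K[1] + C[N[1]])"}
source = emit_c(FORMS)
print(source)
```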
Either way is fine.
>I'm not trying to be flip here.
>
I'd rather have questions about the suggestion than flat out rejection. :)
> What you propose was in
> essence tried with the collection of predefined rate laws in
> SBML Level 1. The problems above are very difficult to
> solve when a language specification includes an explicit
> list of function definitions. These problems are part of
> the reasons why the list disappeared in SBML Level 2.
> By providing user-defined functions, SBML Level 2 obviates
> some of the problems above, but (unexpectedly) introduces a
> different one: people want to be able to have names
> associated with the rate laws, but there is no defined way
> of doing this in SBML Level 2. The simple solution of
> providing a label attribute (like rateLawName="hill", etc.)
> seems like it ought to work, except that as soon as people
> attempt to share their models between software tools, they
> run into the problem that different tools (or rather, the
> tool authors and users) use different names, different
> definitions, etc.
>
> What's the alternative?
>
> Instead of putting the definitions into the language,
> several people have been working on a proposal to define a
> scheme using extensible controlled vocabularies. The scheme
> attempts to address each of the problems listed above.
>
As long as the controlled vocabulary uses a controlled mathematical
expression, I'd be happy.
r = kcat * S * E / (Km + S) and r = kcat * E / (Km / S + 1) are both
algebraically equivalent Michaelis Menten, but have different MathML
expressions.
It's difficult to generalize a way to identify the rate law form from
the MathML expression.
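To make the point concrete, here is a small Python sketch in which the two expressions above have different expression trees yet agree on every sampled input. Python's own parser stands in for MathML here, which is an assumption; the function names are hypothetical. Agreement on samples is evidence of equivalence, not a proof.

```python
import ast
import random
from fractions import Fraction

FORM_A = "kcat * S * E / (Km + S)"
FORM_B = "kcat * E / (Km / S + 1)"


def same_structure(a, b):
    """Compare the parsed expression trees, as a stand-in for comparing MathML."""
    return ast.dump(ast.parse(a, mode="eval")) == ast.dump(ast.parse(b, mode="eval"))


def agree_on_samples(a, b, names, trials=100, seed=0):
    """Evaluate both forms exactly on random positive rational inputs."""
    rng = random.Random(seed)
    code_a = compile(a, "<form a>", "eval")
    code_b = compile(b, "<form b>", "eval")
    for _ in range(trials):
        env = {n: Fraction(rng.randint(1, 1000), rng.randint(1, 1000)) for n in names}
        if eval(code_a, {"__builtins__": {}}, env) != eval(code_b, {"__builtins__": {}}, env):
            return False
    return True


structurally_equal = same_structure(FORM_A, FORM_B)
numerically_equal = agree_on_samples(FORM_A, FORM_B, ["kcat", "Km", "S", "E"])
```

Exact rational arithmetic is used so that floating-point rounding cannot make equivalent forms look different.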
-Howard Salis
> Watch this space ... :-)
>
> MH
>
>