libSBML C++ API  5.20.2
L3FormulaFormatter.h File Reference

Formats an L3 AST formula tree as an SBML formula string.


Functions

char * SBML_formulaToL3String (const ASTNode_t *tree)
 Converts an AST to a string representation of a formula using a syntax derived from SBML Level 1, but extended to include elements from SBML Level 2 and SBML Level 3.
 
char * SBML_formulaToL3StringWithSettings (const ASTNode_t *tree, const L3ParserSettings_t *settings)
 Converts an AST to a text string representation of a formula, using specific formatter settings.
 

Detailed Description

Formats an L3 AST formula tree as an SBML formula string.

Author
Lucian Smith

Function Documentation

◆ SBML_formulaToL3String()

char* SBML_formulaToL3String ( const ASTNode_t * tree)

Converts an AST to a string representation of a formula using a syntax derived from SBML Level 1, but extended to include elements from SBML Level 2 and SBML Level 3.

The text-string form of mathematical formulas read by the function SBML_parseL3Formula() and written by the function SBML_formulaToL3String() uses an expanded version of the syntax read and written by SBML_parseFormula() and SBML_formulaToString(), respectively. The latter two libSBML functions were originally developed to support conversion between SBML Levels 1 and 2, and were focused on the syntax of mathematical formulas used in SBML Level 1. With time, and the use of MathML in SBML Levels 2 and 3, it became clear that supporting Level 2 and 3's expanded mathematical syntax would be useful for software developers. To maintain backwards compatibility for libSBML users, the original SBML_formulaToString() and SBML_parseFormula() have been left untouched, and instead, the new functionality is provided in the form of SBML_parseL3Formula() and SBML_formulaToL3String().

The following lists the main differences in the formula syntax supported by the Level 3 ("L3") versions of the formula parsers and formatters, compared to what is supported by the Level 1-oriented SBML_parseFormula() and SBML_formulaToString():

  • Units may be associated with bare numbers, using the following syntax:
    number unit
    The number may be in any form (an integer, real, or rational number), and the unit must conform to the syntax of an SBML identifier (technically, the type defined as SId in the SBML specifications). The whitespace between number and unit is optional.
  • The Boolean function symbols && (and), || (or), ! (not), and != (not equals) may be used.
  • The modulo operation is allowed as the symbol % and will produce a <piecewise> function in the corresponding MathML output by default, or can produce the MathML function rem, depending on the L3ParserSettings object (see L3ParserSettings_setParseModuloL3v2() ).
  • All inverse trigonometric functions may be defined in the infix either using arc as a prefix or simply a; in other words, both arccsc and acsc are interpreted as the operator arccosecant as defined in MathML 2.0. (Many functions in the simpler SBML Level 1 oriented parser implemented by SBML_parseFormula() are defined this way as well, but not all.)
  • The following expression is parsed as a rational number instead of as a numerical division:
       (integer/integer)
    Spaces are not allowed in this construct; in other words, "(3 / 4)" (with whitespace between the numbers and the operator) will be parsed into the MathML <divide> construct rather than a rational number. You can, however, assign units to a rational number as a whole; here is an example: "(3/4) ml". (In the case of division rather than a rational number, units are not interpreted in this way.)
  • Various parser and formatter behaviors may be altered through the use of an L3ParserSettings object in conjunction with the functions SBML_parseL3FormulaWithSettings() and SBML_formulaToL3StringWithSettings(). The settings available include the following:
    • The function log with a single argument ("log(x)") can be parsed as log10(x), ln(x), or treated as an error, as desired.

    • Unary minus signs can be collapsed or preserved; that is, sequential pairs of unary minuses (e.g., "- -3") can be removed from the input entirely and single unary minuses can be incorporated into the number node, or all minuses can be preserved in the AST node structure.

    • Parsing of units embedded in the input string can be turned on and off.

    • The string avogadro can be parsed as a MathML csymbol or as an identifier.

    • The string % can be parsed either as a piecewise function or as the 'rem' function: a % b will either become

      piecewise(a - b*ceil(a/b), xor((a < 0), (b < 0)), a - b*floor(a/b))

      or

      rem(a, b).

      The latter is simpler, but the rem MathML is only allowed as of SBML Level 3 Version 2.

    • A Model object may optionally be provided to the parser using the variant function call SBML_parseL3FormulaWithModel() or stored in an L3ParserSettings object passed to the variant function SBML_parseL3FormulaWithSettings(). When a Model object is provided, identifiers (values of type SId) from that model are used in preference to pre-defined MathML definitions for both symbols and functions. More precisely:

      • In the case of symbols: the Model entities whose identifiers will shadow identical symbols in the mathematical formula are: Species, Compartment, Parameter, Reaction, and SpeciesReference. For instance, if the parser is given a Model containing a Species with the identifier "pi", and the formula to be parsed is "3*pi", the MathML produced will contain the construct <ci> pi </ci> instead of the construct <pi/>.

      • In the case of user-defined functions: when a Model object is provided, SId values of user-defined functions present in the model will be used preferentially over pre-defined MathML functions. For example, if the passed-in Model contains a FunctionDefinition object with the identifier "sin", that function will be used instead of the predefined MathML function <sin/>.

    • An SBMLNamespaces object may optionally be provided to identify SBML Level 3 packages that extend the syntax understood by the formula parser. When the namespaces are provided, the parser will interpret possible additional syntax defined by the libSBML plug-ins implementing the SBML Level 3 packages; for example, it may understand vector/array extensions introduced by the SBML Level 3 Arrays package.

These configuration settings cannot be changed directly using the basic parser and formatter functions, but can be changed on a per-call basis by using the alternative functions SBML_parseL3FormulaWithSettings() and SBML_formulaToL3StringWithSettings().
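The two translations of a % b described in the settings list can be compared numerically. The following is a minimal sketch, not libSBML code: the function names are hypothetical, and std::fmod is used as a stand-in for the MathML rem function, which is an assumption about how rem would be evaluated.

```cpp
#include <cassert>
#include <cmath>

// Evaluates a % b the way the default piecewise expansion reads:
//   piecewise(a - b*ceil(a/b), xor((a < 0), (b < 0)), a - b*floor(a/b))
// (hypothetical helper, written only to illustrate the formula).
double moduloViaPiecewise(double a, double b) {
  assert(b != 0.0);  // the expansion divides by b
  const bool signsDiffer = (a < 0.0) != (b < 0.0);  // xor of the two tests
  return signsDiffer ? a - b * std::ceil(a / b)
                     : a - b * std::floor(a / b);
}

// Stand-in for rem(a, b). Using std::fmod here is an assumption,
// not something taken from libSBML.
double moduloViaRem(double a, double b) {
  assert(b != 0.0);
  return std::fmod(a, b);
}
```

In both branches the quotient is rounded toward zero, so for small integer-valued inputs the two functions return the same value, and the sign of the result follows the sign of a.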

Neither SBML nor the MathML standard defines a "string-form" equivalent to MathML expressions. The approach taken by libSBML is to start with the formula syntax defined by SBML Level 1 (which in fact used a custom text-string representation of formulas, and not MathML), and expand it to include the functionality described above. This formula syntax is based mostly on C programming syntax, and may contain operators, function calls, symbols, and white space characters. The following table provides the precedence rules for the different entities that may appear in formula strings.

Token                | Operation                        | Class   | Precedence | Associates
name                 | symbol reference                 | operand | 8          | n/a
(expression)         | expression grouping              | operand | 8          | n/a
f(...)               | function call                    | prefix  | 8          | left
^                    | power                            | binary  | 7          | left
-, !                 | negation, Boolean 'not'          | unary   | 6          | right
*, /, %              | multiplication, division, modulo | binary  | 5          | left
+, -                 | addition and subtraction         | binary  | 4          | left
==, <, >, <=, >=, != | Boolean comparisons              | binary  | 3          | left
&&, ||               | Boolean 'and' and 'or'           | binary  | 2          | left
,                    | argument delimiter               | binary  | 1          | left
Expression operators and their precedence in the "Level 3" text-string format for mathematical expressions.

In the table above, operand implies the construct is an operand, prefix implies the operation is applied to the following arguments, unary implies there is one argument, and binary implies there are two arguments. The values in the Precedence column show how the order of different types of operation is determined. For example, the expression a + b * c is evaluated as a + (b * c) because the * operator has higher precedence. The Associates column shows how the order of operations with the same precedence is determined; for example, a && b || c is evaluated as (a && b) || c because the && and || operators are left-associative and have the same precedence.
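To make the grouping rules concrete, the following sketch applies the precedence table with a small precedence-climbing routine and returns the formula with explicit parentheses. It is an illustration only, not the libSBML parser: the names Parenthesizer and parenthesize are hypothetical, and function calls, argument lists, units, and rational numbers are deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Precedence of the binary operators, copied from the table above.
// Returns 0 for anything that is not a binary operator.
static int binaryPrecedence(const std::string& op) {
  if (op == "^") return 7;
  if (op == "*" || op == "/" || op == "%") return 5;
  if (op == "+" || op == "-") return 4;
  if (op == "==" || op == "<" || op == ">" ||
      op == "<=" || op == ">=" || op == "!=") return 3;
  if (op == "&&" || op == "||") return 2;
  return 0;
}

// Hypothetical helper: rewrites a formula with explicit parentheses.
class Parenthesizer {
 public:
  explicit Parenthesizer(const std::string& text) : text_(text), pos_(0) {}
  std::string run() { return parseBinary(2); }

 private:
  void skipSpace() {
    while (pos_ < text_.size() &&
           std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
  }

  // Returns the binary operator at the current position, or "" if none.
  std::string peekOperator() {
    skipSpace();
    const std::string two = text_.substr(pos_, 2);
    if (binaryPrecedence(two) > 0) return two;
    const std::string one = text_.substr(pos_, 1);
    return binaryPrecedence(one) > 0 ? one : "";
  }

  // Names, numbers, and parenthesized groups (precedence 8).
  std::string parsePrimary() {
    skipSpace();
    if (pos_ < text_.size() && text_[pos_] == '(') {
      ++pos_;
      const std::string inner = parseBinary(2);
      skipSpace();
      assert(pos_ < text_.size() && text_[pos_] == ')');
      ++pos_;
      return inner;
    }
    const std::size_t start = pos_;
    while (pos_ < text_.size() &&
           (std::isalnum(static_cast<unsigned char>(text_[pos_])) ||
            text_[pos_] == '_' || text_[pos_] == '.')) ++pos_;
    assert(pos_ > start);  // an operand is required here
    return text_.substr(start, pos_ - start);
  }

  // Unary '-' and '!' (precedence 6) bind less tightly than '^' (7).
  std::string parseUnary() {
    skipSpace();
    if (pos_ < text_.size() && (text_[pos_] == '-' || text_[pos_] == '!')) {
      const char op = text_[pos_++];
      return std::string("(") + op + parseBinary(7) + ")";
    }
    return parsePrimary();
  }

  // Precedence climbing; every binary operator is left-associative.
  std::string parseBinary(int minPrecedence) {
    std::string lhs = parseUnary();
    for (;;) {
      const std::string op = peekOperator();
      const int precedence = binaryPrecedence(op);
      if (precedence == 0 || precedence < minPrecedence) break;
      pos_ += op.size();
      const std::string rhs = parseBinary(precedence + 1);
      lhs = "(" + lhs + op + rhs + ")";
    }
    return lhs;
  }

  std::string text_;
  std::size_t pos_;
};

std::string parenthesize(const std::string& formula) {
  return Parenthesizer(formula).run();
}
```

Read literally, the table gives two groupings that differ from C and from common mathematical convention: parenthesize("a||b&&c") yields "((a||b)&&c)" because && and || share one precedence level, and parenthesize("a^b^c") yields "((a^b)^c)" because ^ is listed as left-associative. Adding explicit parentheses to formulas avoids relying on either rule.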

The function call syntax consists of a function name, followed by optional white space, followed by an opening parenthesis token, followed by a sequence of zero or more arguments separated by commas (with each comma optionally preceded and/or followed by zero or more white space characters), followed by a closing parenthesis token. The function name must be chosen from one of the pre-defined functions in SBML or a user-defined function in the model. The following table lists the names of certain common mathematical functions; this table corresponds to Table 6 in the SBML Level 1 Version 2 specification with additions based on the functions added in SBML Level 2 and Level 3:

Name | Argument(s) | Formula or meaning | Argument constraints | Result constraints
abs | x | Absolute value of x. | |
acos, arccos | x | Arccosine of x in radians. | –1.0 ≤ x ≤ 1.0 | 0 ≤ acos(x) ≤ π
acosh, arccosh | x | Hyperbolic arccosine of x in radians. | |
acot, arccot | x | Arccotangent of x in radians. | |
acoth, arccoth | x | Hyperbolic arccotangent of x in radians. | |
acsc, arccsc | x | Arccosecant of x in radians. | |
acsch, arccsch | x | Hyperbolic arccosecant of x in radians. | |
asec, arcsec | x | Arcsecant of x in radians. | |
asech, arcsech | x | Hyperbolic arcsecant of x in radians. | |
asin, arcsin | x | Arcsine of x in radians. | –1.0 ≤ x ≤ 1.0 | –π/2 ≤ asin(x) ≤ π/2
atan, arctan | x | Arctangent of x in radians. | | –π/2 < atan(x) < π/2
atanh, arctanh | x | Hyperbolic arctangent of x in radians. | |
ceil, ceiling | x | Smallest number not less than x whose value is an exact integer. | |
cos | x | Cosine of x. | |
cosh | x | Hyperbolic cosine of x. | |
cot | x | Cotangent of x. | |
coth | x | Hyperbolic cotangent of x. | |
csc | x | Cosecant of x. | |
csch | x | Hyperbolic cosecant of x. | |
delay | x, y | The value of x at y time units in the past. | |
factorial | n | The factorial of n. Factorials are defined by n! = n*(n–1)* ... * 1. | n must be an integer. |
exp | x | e^x, where e is the base of the natural logarithm. | |
floor | x | The largest number not greater than x whose value is an exact integer. | |
ln | x | Natural logarithm of x. | x > 0 |
log | x | By default, the base 10 logarithm of x, but can be set to be the natural logarithm of x, or to be an illegal construct. | x > 0 |
log | x, y | The base x logarithm of y. | y > 0 |
log10 | x | Base 10 logarithm of x. | x > 0 |
piecewise | x1, y1, [x2, y2,] [...] [z] | A piecewise function: if (y1), x1. Otherwise, if (y2), x2, etc. Otherwise, z. | y1, y2, y3 [etc] must be Boolean |
pow, power | x, y | x^y. | |
root | b, x | The root base b of x. | |
sec | x | Secant of x. | |
sech | x | Hyperbolic secant of x. | |
sqr | x | x^2. | |
sqrt | x | √x. | x > 0 | sqrt(x) ≥ 0
sin | x | Sine of x. | |
sinh | x | Hyperbolic sine of x. | |
tan | x | Tangent of x. | x ≠ n*π/2, for odd integer n |
tanh | x | Hyperbolic tangent of x. | |
and | x, y, z... | Boolean and(x, y, z...): returns true if all of its arguments are true. Note that and is an n-ary function, taking 0 or more arguments, and that and() returns true. | All arguments must be Boolean |
not | x | Boolean not(x) | x must be Boolean |
or | x, y, z... | Boolean or(x, y, z...): returns true if at least one of its arguments is true. Note that or is an n-ary function, taking 0 or more arguments, and that or() returns false. | All arguments must be Boolean |
xor | x, y, z... | Boolean xor(x, y, z...): returns true if an odd number of its arguments is true. Note that xor is an n-ary function, taking 0 or more arguments, and that xor() returns false. | All arguments must be Boolean |
eq | x, y, z... | Boolean eq(x, y, z...): returns true if all arguments are equal. Note that eq is an n-ary function, but must take 2 or more arguments. | |
geq | x, y, z... | Boolean geq(x, y, z...): returns true if each argument is greater than or equal to the argument following it. Note that geq is an n-ary function, but must take 2 or more arguments. | |
gt | x, y, z... | Boolean gt(x, y, z...): returns true if each argument is greater than the argument following it. Note that gt is an n-ary function, but must take 2 or more arguments. | |
leq | x, y, z... | Boolean leq(x, y, z...): returns true if each argument is less than or equal to the argument following it. Note that leq is an n-ary function, but must take 2 or more arguments. | |
lt | x, y, z... | Boolean lt(x, y, z...): returns true if each argument is less than the argument following it. Note that lt is an n-ary function, but must take 2 or more arguments. | |
neq | x, y | Boolean x != y: returns true unless x and y are equal. | |
plus | x, y, z... | x + y + z + ...: The sum of the arguments of the function. Note that plus is an n-ary function taking 0 or more arguments, and that plus() returns 0. | |
times | x, y, z... | x * y * z * ...: The product of the arguments of the function. Note that times is an n-ary function taking 0 or more arguments, and that times() returns 1. | |
minus | x, y | x - y. | |
divide | x, y | x / y. | |
Mathematical functions defined in the "Level 3" text-string formula syntax.
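The n-ary entries in the table have specific behavior for empty and chained argument lists. The sketch below restates a few of them as plain C++ functions so the edge cases can be checked directly. All function names are hypothetical and none of this is libSBML code; it only mirrors the meanings given in the table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// and(x, y, z...): true if all arguments are true; and() is true.
bool naryAnd(const std::vector<bool>& args) {
  return std::all_of(args.begin(), args.end(), [](bool v) { return v; });
}

// or(x, y, z...): true if at least one argument is true; or() is false.
bool naryOr(const std::vector<bool>& args) {
  return std::any_of(args.begin(), args.end(), [](bool v) { return v; });
}

// xor(x, y, z...): true if an odd number of arguments is true; xor() is false.
bool naryXor(const std::vector<bool>& args) {
  return std::count(args.begin(), args.end(), true) % 2 == 1;
}

// plus(x, y, z...): sum of the arguments; plus() is 0.
double naryPlus(const std::vector<double>& args) {
  return std::accumulate(args.begin(), args.end(), 0.0);
}

// times(x, y, z...): product of the arguments; times() is 1.
double naryTimes(const std::vector<double>& args) {
  return std::accumulate(args.begin(), args.end(), 1.0,
                         std::multiplies<double>());
}

// geq(x, y, z...): true if each argument is >= the one following it.
bool naryGeq(const std::vector<double>& args) {
  assert(args.size() >= 2);  // relational functions need 2 or more arguments
  for (std::size_t i = 0; i + 1 < args.size(); ++i) {
    if (!(args[i] >= args[i + 1])) return false;
  }
  return true;
}
```

For example, naryGeq({3.0, 2.0, 2.0}) is true while naryGeq({3.0, 2.0, 2.5}) is false, because the comparison is applied to each adjacent pair rather than only to the first and last arguments.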

Parsing of the various MathML functions and constants is case-insensitive by default: function names such as cos, Cos and COS are all parsed as the MathML cosine operator, <cos/>. However, when a Model object is used in conjunction with either SBML_parseL3FormulaWithModel() or SBML_parseL3FormulaWithSettings(), any identifiers found in that model will be parsed in a case-sensitive way. For example, if a model contains a Species having the identifier Pi, the parser will parse "Pi" in the input as "<ci> Pi </ci>" but will continue to parse the symbols "pi" and "PI" as "<pi/>".

As mentioned above, the manner in which the "L3" version of the formula parser interprets the function "log" can be changed. To do so, callers should use the function SBML_parseL3FormulaWithSettings() and pass it an appropriate L3ParserSettings object. By default, unlike the SBML Level 1 parser implemented by SBML_parseFormula(), the string "log" is interpreted as the base 10 logarithm, and not as the natural logarithm. However, you can change the interpretation to be the base 10 logarithm, the natural logarithm, or an error. Since the name "log" by itself is ambiguous, the error setting lets you require that input use log10 or ln instead, which are unambiguous. Please refer to SBML_parseL3FormulaWithSettings().

In addition, the following symbols will be translated to their MathML equivalents, if no symbol with the same SId identifier string exists in the Model object provided:

Name | Meaning | MathML
true | Boolean value true | <true/>
false | Boolean value false | <false/>
pi | Mathematical constant pi | <pi/>
avogadro | Value of Avogadro's constant stipulated by SBML | <csymbol encoding="text" definitionURL="http://www.sbml.org/sbml/symbols/avogadro"> avogadro </csymbol>
time | Simulation time as defined in SBML | <csymbol encoding="text" definitionURL="http://www.sbml.org/sbml/symbols/time"> time </csymbol>
inf, infinity | Mathematical constant "infinity" | <infinity/>
nan, notanumber | Mathematical concept "not a number" | <notanumber/>
Mathematical symbols defined in the "Level 3" text-string formula syntax.
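The lookup order described above (model identifiers first and case-sensitive, predefined symbols second and case-insensitive) can be sketched as follows. This is an illustration under stated assumptions, not the libSBML implementation: resolveSymbol is a hypothetical name, the model is reduced to a set of SId strings, and the avogadro and time csymbols are omitted for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: returns the MathML a symbol name would map to.
// `modelIds` stands in for the SId values found in a Model object.
std::string resolveSymbol(const std::string& name,
                          const std::set<std::string>& modelIds) {
  assert(!name.empty());

  // 1. Identifiers from the model win, and are matched case-sensitively.
  if (modelIds.count(name) > 0) return "<ci> " + name + " </ci>";

  // 2. Otherwise, predefined symbols are matched case-insensitively.
  std::string lower = name;
  std::transform(lower.begin(), lower.end(), lower.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  static const std::map<std::string, std::string> predefined = {
      {"true", "<true/>"},        {"false", "<false/>"},
      {"pi", "<pi/>"},            {"inf", "<infinity/>"},
      {"infinity", "<infinity/>"}, {"nan", "<notanumber/>"},
      {"notanumber", "<notanumber/>"}};
  const auto it = predefined.find(lower);

  // 3. Anything else is an ordinary identifier.
  return it != predefined.end() ? it->second : "<ci> " + name + " </ci>";
}
```

With a model containing only the identifier Pi, resolveSymbol("Pi", {"Pi"}) gives "<ci> Pi </ci>", while "pi" and "PI" still resolve to "<pi/>", matching the example in the text.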

Again, as mentioned above, whether the string "avogadro" is parsed as an AST node of type AST_NAME_AVOGADRO or AST_NAME is configurable; use the version of the parser function called SBML_parseL3FormulaWithSettings(). This Avogadro-related functionality is provided because SBML Level 2 models may not use AST_NAME_AVOGADRO AST nodes.

Parameters
tree      the AST to be converted.
Returns
the formula from the given AST as a text string, with a syntax oriented towards the capabilities defined in SBML Level 3. The caller owns the returned string and is responsible for freeing it when it is no longer needed. If tree is a null pointer, then a null pointer is returned.
See also
SBML_formulaToL3StringWithSettings()
SBML_formulaToString()
SBML_parseL3FormulaWithSettings()
SBML_parseL3FormulaWithModel()
SBML_parseFormula()
L3ParserSettings
SBML_getDefaultL3ParserSettings()
SBML_getLastParseL3Error()

◆ SBML_formulaToL3StringWithSettings()

char* SBML_formulaToL3StringWithSettings ( const ASTNode_t * tree,
const L3ParserSettings_t * settings
)

Converts an AST to a text string representation of a formula, using specific formatter settings.

This function behaves identically to SBML_formulaToL3String(), except that its output is controlled by two fields in the settings object, namely:

  • parseunits ("parse units"): If this field in the settings object is set to true (the default), the function will write out the units of any numerical ASTNodes that have them, producing (for example) "3 mL", "(3/4) m", or "5.5e-10 M". If this is set to false, this function will only write out the number itself ("3", "(3/4)", and "5.5e-10", in the previous examples).
  • collapseminus ("collapse minus"): If this field in the settings object is set to false (the default), the function will write out explicitly any doubly-nested unary minus ASTNodes, producing (for example) "- -x" or even "- - - - -3.1". If this is set to true, the function will collapse the nodes before producing the infix form, producing "x" and "-3.1" in the previous examples.

All the other settings of the L3ParserSettings object passed in as settings will be ignored for the purposes of this function: the parselog ("parse log") setting is ignored so that "log10(x)", "ln(x)", and "log(x, y)" are always produced; the avocsymbol ("Avogadro csymbol") is irrelevant to the behavior of this function; and nothing in the Model object set via the model setting is used.
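The effect of the two honored settings on the output text can be mimicked with a small stand-alone sketch. Nothing here calls libSBML: FormatterSettings, formatNegation, and formatNumber are hypothetical names, and the AST is reduced to an operand string plus a count of nested unary minuses. The defaults and the expected strings are taken from the description above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two settings the formatter honors.
struct FormatterSettings {
  bool parseUnits = true;      // default stated in the text above
  bool collapseMinus = false;  // default stated in the text above
};

// Writes `minusCount` nested unary minuses in front of `operand`.
std::string formatNegation(const std::string& operand, int minusCount,
                           const FormatterSettings& settings) {
  assert(minusCount >= 0);
  if (settings.collapseMinus) {
    // Pairs of minuses cancel; at most one survives.
    return (minusCount % 2 == 1 ? "-" : "") + operand;
  }
  // Every minus is written out; all but the last are followed by a space.
  std::string out;
  for (int i = 0; i < minusCount; ++i) {
    out += (i + 1 < minusCount) ? "- " : "-";
  }
  return out + operand;
}

// Writes a number, followed by its units only when parseUnits is on.
std::string formatNumber(const std::string& number, const std::string& units,
                         const FormatterSettings& settings) {
  if (settings.parseUnits && !units.empty()) return number + " " + units;
  return number;
}
```

With default settings, formatNegation("3.1", 5, FormatterSettings()) gives "- - - - -3.1" and formatNumber("3", "mL", FormatterSettings()) gives "3 mL". Setting collapseMinus to true reduces the first to "-3.1", and setting parseUnits to false reduces the second to "3".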

Parameters
tree      the AST to be converted.
settings  the L3ParserSettings object used to modify the behavior of this function.
Returns
the formula from the given AST as a text string, with a syntax oriented towards the capabilities defined in SBML Level 3. The caller owns the returned string and is responsible for freeing it when it is no longer needed. If tree is a null pointer, then a null pointer is returned.
See also
SBML_formulaToL3String()
SBML_formulaToString()
SBML_parseL3FormulaWithSettings()
SBML_parseL3FormulaWithModel()
SBML_parseFormula()
L3ParserSettings
SBML_getDefaultL3ParserSettings()
SBML_getLastParseL3Error()