Basic concepts
Converting between ASTs and text strings
The text-string formula syntax, and differences with MathML
- Simpler scheme based on SBML Level 1's syntax
- Advanced, SBML Level 3-oriented formula scheme
Methods for working directly with libSBML's Abstract Syntax Trees
Reading and Writing MathML directly

This section describes libSBML's facilities for working with SBML representations of mathematical expressions.

Internally, libSBML uses Abstract Syntax Trees (ASTs) to provide a canonical, in-memory representation for all mathematical formulas regardless of their original format (i.e., C-like infix text strings or the XML-based MathML 2.0 format). LibSBML provides an extensive API for working with ASTs; it also provides facilities for translating between ASTs and mathematical formulas written in a text-string notation, as well as translating between ASTs and MathML.

Basic concepts

An AST in libSBML is a recursive tree structure; each node has a type, a pointer to a value, and a list of child nodes. Each ASTNode may have zero, one, two, or more children depending on its type. There are node types to represent numbers (with subtypes to distinguish integer, real, and rational numbers), names (e.g., constants or variables), simple mathematical operators, logical or relational operators, and functions. The following diagram illustrates how the mathematical expression "1 + 2" is represented as an AST with one plus node having two integer child nodes for the numbers 1 and 2. The figure also shows the corresponding MathML representation:

Example AST representation of a mathematical expression.
Infix	AST	MathML
`1 + 2`		`<math xmlns="http://www.w3.org/1998/Math/MathML">` `<apply>` `<plus/>` `<cn type="integer"> 1 </cn>` `<cn type="integer"> 2 </cn>` `</apply>` `</math>`

The following are other noteworthy points about the AST representation in libSBML:

A numerical value represented in MathML as a real number with an exponent is preserved as such in the AST node representation, even if the number could be stored in a double data type. This is done so that when an SBML model is read in and then written out again, the amount of change introduced by libSBML to the SBML during the round-trip activity is minimized.

Rational numbers are represented in an AST node using separate numerator and denominator values. These can be retrieved using the methods ASTNode::getNumerator() and ASTNode::getDenominator().

The children of an ASTNode are other ASTNode objects. The list of children is empty for nodes that are leaf elements, such as numbers. For nodes that are actually roots of expression subtrees, the list of children points to the parsed objects that make up the rest of the expression.

For many applications, the details of ASTs are irrelevant because libSBML provides text-string based translation functions such as SBML_formulaToL3String() and SBML_parseL3Formula(). If you find the complexity of using the AST representation of expressions too high for your purposes, perhaps the string-based functions will be more suitable.

Converting between ASTs and text strings

SBML Levels 2 and 3 represent mathematical expressions using MathML 2.0 (more specifically, a subset of the content portion of MathML 2.0), but most applications using libSBML do not use MathML directly. Instead, applications generally interact with mathematics using either the API for Abstract Syntax Trees (described below), or using libSBML's facilities for encoding and decoding mathematical formulas to/from text strings. The latter is simpler to use directly, so we describe it first.

The libSBML formula parser has been carefully engineered so that transformations from MathML to the libSBML infix text notation and back are possible with a minimum of disruption to the structure of the mathematical expression. The example below shows a simple program that takes a MathML string compiled into the program, converts it to an AST, converts that to an infix representation of the formula, compares it to the expected form of that formula, and finally translates that formula back to MathML and displays it. The output displayed on the terminal should have the same structure as the MathML it started with. The program is a simple example of using libSBML's basic MathML and AST reading and writing methods, and shows that libSBML preserves the ordering and structure of the mathematical expressions.

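#include <cstdlib>
#include <cstring>
#include <iostream>
#include <sbml/SBMLTypes.h>
LIBSBML_CPP_NAMESPACE_USE  // expands to nothing unless libSBML uses C++ namespaces
int
main (int argc, char *argv[])
{
  const char* expected = "1 + f(x)";
  const char* s = "<?xml version='1.0' encoding='UTF-8'?>"
    "<math xmlns='http://www.w3.org/1998/Math/MathML'>"
    "  <apply> <plus/> <cn> 1 </cn>"
    "                  <apply> <ci> f </ci> <ci> x </ci> </apply>"
    "  </apply>"
    "</math>";
  // Read the MathML into an AST, then render the AST as an infix string.
  ASTNode* ast    = readMathMLFromString(s);
  char*    result = SBML_formulaToL3String(ast);
  if ( std::strcmp(result, expected) == 0 )
    std::cout << "Got expected result" << std::endl;
  else
    std::cout << "Mismatch after readMathMLFromString()" << std::endl;
  // Parse the infix string back into an AST and write it out as MathML.
  ASTNode* new_ast = SBML_parseL3Formula(result);
  char*    new_s   = writeMathMLToString(new_ast);
  std::cout << "Result of writing AST:" << std::endl << new_s << std::endl;
  // The caller owns the returned strings and AST nodes.
  std::free(result);
  std::free(new_s);
  delete ast;
  delete new_ast;
  return 0;
}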

The text-string form of mathematical formulas written by SBML_formulaToString() and SBML_formulaToL3String(), and read by SBML_parseFormula() and SBML_parseL3Formula(), uses a simple C-inspired infix notation. It is summarized in the next section. A formula in this text-string form can therefore be handed to a program that understands SBML mathematical expressions, or used as part of a translation system.

The text-string formula syntax, and differences with MathML

There are actually two text-based formula parsing/writing systems in libSBML: one that uses a more limited syntax and was originally designed for translation between SBML Level 1 (which used a text-string format for representing mathematics) and higher levels of SBML, and a more recent, more powerful version that offers features to support SBML Level 3. We describe both below, beginning with the simpler but more limited system.

Simpler scheme based on SBML Level 1's syntax

The simpler, more limited translation system is read by SBML_parseFormula() and written by SBML_formulaToString(). It uses an infix notation essentially derived from the syntax of the C programming language and was originally used in SBML Level 1. We summarize the syntax here, but for more complete details, readers should consult the documentation for SBML_parseFormula().

Formula strings in this infix notation may contain operators, function calls, symbols, and white space characters. The allowable white space characters are tab and space. The following are illustrative examples of formulas expressed in the syntax:

0.10 * k4^2

(vm * s1)/(km + s1)

The following table shows the precedence rules in this syntax. In the Class column, operand implies the construct is an operand, prefix implies the operation is applied to the following arguments, unary implies there is one argument, and binary implies there are two arguments. The values in the Precedence column show how the order of different types of operation is determined. For example, the expression a + b * c is evaluated as a + (b * c) because the * operator has higher precedence. The Associates column shows how the order of operations with the same precedence is determined; for example, a - b + c is evaluated as (a - b) + c because the + and - operators are left-associative. The precedence and associativity rules are taken from the C programming language, except for the symbol ^, which is used in C for a different purpose. (Exponentiation can be invoked using either ^ or the function power.)

A table of the expression operators and their precedence in the text-string format for mathematical expressions used by SBML_parseFormula().
Token	Operation	Class	Precedence	Associates
name	symbol reference	operand	6	n/a
`(`expression`)`	expression grouping	operand	6	n/a
`f(`...`)`	function call	prefix	6	left
`-`	negation	unary	5	right
`^`	power	binary	4	left
`*`	multiplication	binary	3	left
`/`	division	binary	3	left
`+`	addition	binary	2	left
`-`	subtraction	binary	2	left
`,`	argument delimiter	binary	1	left

A program parsing a formula in an SBML model should assume that names appearing in the formula are the identifiers of Species, Parameter, Compartment, FunctionDefinition, (in Level 2) Reaction, or (in Level 3) SpeciesReference objects defined in a model. When a function call is involved, the syntax consists of a function identifier, followed by optional white space, followed by an opening parenthesis, followed by a sequence of zero or more arguments separated by commas (with each comma optionally preceded and/or followed by zero or more white space characters), followed by a closing parenthesis. There is an almost one-to-one mapping between the list of predefined functions available, and those defined in MathML. All of the MathML functions are recognized; this set is larger than the functions defined in SBML Level 1. In the subset of functions that overlap between MathML and SBML Level 1, there exist a few differences. The following table summarizes the differences between the predefined functions in SBML Level 1 and the MathML equivalents in SBML Levels 2 and 3:

Table comparing the names of certain functions in the SBML text-string formula syntax and MathML. The left column shows the names of functions recognized by SBML_parseFormula(); the right column shows their equivalent function names in MathML 2.0, used in SBML Levels 2 and 3.
Text string formula functions	MathML equivalents in SBML Levels 2 and 3
`acos`	`arccos`
`asin`	`arcsin`
`atan`	`arctan`
`ceil`	`ceiling`
`log`	`ln`
`log10(x)`	`log(x)` or `log(10, x)`
`pow(x, y)`	`power(x, y)`
`sqr(x)`	`power(x, 2)`
`sqrt(x)`	`root(x)` or `root(2, x)`

Note that there are differences between the symbols used to represent the common mathematical functions and the corresponding MathML token names. This is a potential source of incompatibilities. Note in particular that in this text-string syntax, log(x) always represents the natural logarithm, whereas in MathML, the natural logarithm is <ln/>. Application writers are urged to be careful when translating between text forms and MathML forms, especially if they provide a direct text-string input facility to users of their software systems. The more advanced mathematical formula system, described below, offers the ability to control how log is interpreted as well as other parsing behaviors.

Advanced, SBML Level 3-oriented formula scheme

The text-string form of mathematical formulas read by the function SBML_parseL3Formula() and written by the function SBML_formulaToL3String() uses an expanded version of the syntax read and written by SBML_parseFormula() and SBML_formulaToString(), respectively. The latter two libSBML functions were originally developed to support conversion between SBML Levels 1 and 2, and were focused on the syntax of mathematical formulas used in SBML Level 1. With time, and the use of MathML in SBML Levels 2 and 3, it became clear that supporting Level 2 and 3's expanded mathematical syntax would be useful for software developers. To maintain backwards compatibility for libSBML users, the original SBML_formulaToString() and SBML_parseFormula() have been left untouched, and instead, the new functionality is provided in the form of SBML_parseL3Formula() and SBML_formulaToL3String().

The following lists the main differences in the formula syntax supported by the Level 3 ("L3") versions of the formula parsers and formatters, compared to what is supported by the Level 1-oriented SBML_parseFormula() and SBML_formulaToString():

Units may be associated with bare numbers, using the following syntax:
number unit
The number may be in any form (an integer, real, or rational number), and the unit must conform to the syntax of an SBML identifier (technically, the type defined as SId in the SBML specifications). The whitespace between number and unit is optional.

The Boolean function symbols && (and), || (or), ! (not), and != (not equals) may be used.

The modulo operation is allowed as the symbol % and will produce a <piecewise> function in the corresponding MathML output by default, or can produce the MathML function rem, depending on the L3ParserSettings object (see L3ParserSettings_setParseModuloL3v2() ).

All inverse trigonometric functions may be defined in the infix either using arc as a prefix or simply a; in other words, both arccsc and acsc are interpreted as the operator arccosecant as defined in MathML 2.0. (Many functions in the simpler SBML Level 1 oriented parser implemented by SBML_parseFormula() are defined this way as well, but not all.)

The following expression is parsed as a rational number instead of as a numerical division:
```
   (integer/integer)
```
Spaces are not allowed in this construct; in other words, "(3 / 4)" (with whitespace between the numbers and the operator) will be parsed into the MathML <divide> construct rather than a rational number. You can, however, assign units to a rational number as a whole; here is an example: "(3/4) ml". (In the case of division rather than a rational number, units are not interpreted in this way.)

Various parser and formatter behaviors may be altered through the use of a L3ParserSettings object in conjunction with the functions SBML_parseL3FormulaWithSettings() and SBML_formulaToL3StringWithSettings(). The settings available include the following:
- The function log with a single argument ("log(x)") can be parsed as log10(x), ln(x), or treated as an error, as desired.
- Unary minus signs can be collapsed or preserved; that is, sequential pairs of unary minuses (e.g., "- -3") can be removed from the input entirely and single unary minuses can be incorporated into the number node, or all minuses can be preserved in the AST node structure.
- Parsing of units embedded in the input string can be turned on and off.
- The string avogadro can be parsed as a MathML csymbol or as an identifier.
- The string % can be parsed either as a piecewise function or as the 'rem' function: a % b will either become
  
  piecewise(a - b*ceil(a/b), xor((a < 0), (b < 0)), a - b*floor(a/b))
  
  or
  
  rem(a, b).
  
  The latter is simpler, but the rem MathML is only allowed as of SBML Level 3 Version 2.
- A Model object may optionally be provided to the parser using the variant function call SBML_parseL3FormulaWithModel() or stored in a L3ParserSettings object passed to the variant function SBML_parseL3FormulaWithSettings(). When a Model object is provided, identifiers (values of type SId ) from that model are used in preference to pre-defined MathML definitions for both symbols and functions. More precisely:
  - In the case of symbols: the Model entities whose identifiers will shadow identical symbols in the mathematical formula are: Species, Compartment, Parameter, Reaction, and SpeciesReference. For instance, if the parser is given a Model containing a Species with the identifier "pi", and the formula to be parsed is "3*pi", the MathML produced will contain the construct <ci> pi </ci> instead of the construct <pi/>.
  - In the case of user-defined functions: when a Model object is provided, SId values of user-defined functions present in the model will be used preferentially over pre-defined MathML functions. For example, if the passed-in Model contains a FunctionDefinition object with the identifier "sin", that function will be used instead of the predefined MathML function <sin/>.
- An SBMLNamespaces object may optionally be provided to identify SBML Level 3 packages that extend the syntax understood by the formula parser. When the namespaces are provided, the parser will interpret possible additional syntax defined by the libSBML plug-ins implementing the SBML Level 3 packages; for example, it may understand vector/array extensions introduced by the SBML Level 3 Arrays package.

These configuration settings cannot be changed directly using the basic parser and formatter functions, but can be changed on a per-call basis by using the alternative functions SBML_parseL3FormulaWithSettings() and SBML_formulaToL3StringWithSettings().

Neither SBML nor the MathML standard define a "string-form" equivalent to MathML expressions. The approach taken by libSBML is to start with the formula syntax defined by SBML Level 1 (which in fact used a custom text-string representation of formulas, and not MathML), and expand it to include the functionality described above. This formula syntax is based mostly on C programming syntax, and may contain operators, function calls, symbols, and white space characters. The following table provides the precedence rules for the different entities that may appear in formula strings.

Expression operators and their precedence in the "Level 3" text-string format for mathematical expressions.
Token	Operation	Class	Preced.	Assoc.
name	symbol reference	operand	8	n/a
`(`expression`)`	expression grouping	operand	8	n/a
`f(`...`)`	function call	prefix	8	left
`^`	power	binary	7	left
`-, !`	negation, Boolean 'not'	unary	6	right
`*, /, %`	multip., div., modulo	binary	5	left
`+, -`	addition and subtraction	binary	4	left
`==, <, >, <=, >=, !=`	Boolean comparisons	binary	3	left
`&&, \|\|`	Boolean 'and' and 'or'	binary	2	left
`,`	argument delimiter	binary	1	left

In the table above, operand implies the construct is an operand, prefix implies the operation is applied to the following arguments, unary implies there is one argument, and binary implies there are two arguments. The values in the Precedence column show how the order of different types of operation is determined. For example, the expression a + b * c is evaluated as a + (b * c) because the * operator has higher precedence. The Associates column shows how the order of operations with the same precedence is determined; for example, a && b || c is evaluated as (a && b) || c because the && and || operators are left-associative and have the same precedence.

The function call syntax consists of a function name, followed by optional white space, followed by an opening parenthesis token, followed by a sequence of zero or more arguments separated by commas (with each comma optionally preceded and/or followed by zero or more white space characters), followed by a closing parenthesis token. The function name must be chosen from one of the pre-defined functions in SBML or a user-defined function in the model. The following table lists the names of certain common mathematical functions; this table corresponds to Table 6 in the SBML Level 1 Version 2 specification with additions based on the functions added in SBML Level 2 and Level 3:

Mathematical functions defined in the "Level 3" text-string formula syntax.
Name	Argument(s)	Formula or meaning	Argument Constraints	Result constraints
`abs`	x	Absolute value of x.
`acos`, `arccos`	x	Arccosine of x in radians.	–1.0 ≤ x ≤ 1.0	0 ≤ acos(x) ≤ π
`acosh`, `arccosh`	x	Hyperbolic arccosine of x in radians.
`acot`, `arccot`	x	Arccotangent of x in radians.
`acoth`, `arccoth`	x	Hyperbolic arccotangent of x in radians.
`acsc`, `arccsc`	x	Arccosecant of x in radians.
`acsch`, `arccsch`	x	Hyperbolic arccosecant of x in radians.
`asec`, `arcsec`	x	Arcsecant of x in radians.
`asech`, `arcsech`	x	Hyperbolic arcsecant of x in radians.
`asin`, `arcsin`	x	Arcsine of x in radians.	–1.0 ≤ x ≤ 1.0	–π/2 ≤ asin(x) ≤ π/2
`atan`, `arctan`	x	Arctangent of x in radians.		–π/2 < atan(x) < π/2
`atanh`, `arctanh`	x	Hyperbolic arctangent of x in radians.
`ceil`, `ceiling`	x	Smallest number not less than x whose value is an exact integer.
`cos`	x	Cosine of x.
`cosh`	x	Hyperbolic cosine of x.
`cot`	x	Cotangent of x.
`coth`	x	Hyperbolic cotangent of x.
`csc`	x	Cosecant of x.
`csch`	x	Hyperbolic cosecant of x.
`delay`	x, y	The value of x at y time units in the past.
`factorial`	n	The factorial of n. Factorials are defined by n! = n*(n–1)* ... *1.	n must be an integer.
`exp`	x	e^x, where e is the base of the natural logarithm.
`floor`	x	The largest number not greater than x whose value is an exact integer.
`ln`	x	Natural logarithm of x.	x > 0
`log`	x	By default, the base 10 logarithm of x, but can be set to be the natural logarithm of x, or to be an illegal construct.	x > 0
`log`	x, y	The base x logarithm of y.	y > 0
`log10`	x	Base 10 logarithm of x.	x > 0
`piecewise`	x1, y1, [x2, y2,] [...] [z]	A piecewise function: if (y1), x1. Otherwise, if (y2), x2, etc. Otherwise, z.	y1, y2, y3 [etc] must be Boolean
`pow`, `power`	x, y	x^y.
`root`	b, x	The root base b of x.
`sec`	x	Secant of x.
`sech`	x	Hyperbolic secant of x.
`sqr`	x	x².
`sqrt`	x	√x.	x ≥ 0	sqrt(x) ≥ 0
`sin`	x	Sine of x.
`sinh`	x	Hyperbolic sine of x.
`tan`	x	Tangent of x.	x ≠ n*π/2, for odd integer n
`tanh`	x	Hyperbolic tangent of x.
`and`	x, y, z...	Boolean and(x, y, z...): returns `true` if all of its arguments are true. Note that `and` is an n-ary function, taking 0 or more arguments, and that `and()` returns `true`.	All arguments must be Boolean
`not`	x	Boolean not(x)	x must be Boolean
`or`	x, y, z...	Boolean or(x, y, z...): returns `true` if at least one of its arguments is true. Note that `or` is an n-ary function, taking 0 or more arguments, and that `or()` returns `false`.	All arguments must be Boolean
`xor`	x, y, z...	Boolean xor(x, y, z...): returns `true` if an odd number of its arguments is true. Note that `xor` is an n-ary function, taking 0 or more arguments, and that `xor()` returns `false`.	All arguments must be Boolean
`eq`	x, y, z...	Boolean eq(x, y, z...): returns `true` if all arguments are equal. Note that `eq` is an n-ary function, but must take 2 or more arguments.
`geq`	x, y, z...	Boolean geq(x, y, z...): returns `true` if each argument is greater than or equal to the argument following it. Note that `geq` is an n-ary function, but must take 2 or more arguments.
`gt`	x, y, z...	Boolean gt(x, y, z...): returns `true` if each argument is greater than the argument following it. Note that `gt` is an n-ary function, but must take 2 or more arguments.
`leq`	x, y, z...	Boolean leq(x, y, z...): returns `true` if each argument is less than or equal to the argument following it. Note that `leq` is an n-ary function, but must take 2 or more arguments.
`lt`	x, y, z...	Boolean lt(x, y, z...): returns `true` if each argument is less than the argument following it. Note that `lt` is an n-ary function, but must take 2 or more arguments.
`neq`	x, y	Boolean x != y: returns `true` unless x and y are equal.
`plus`	x, y, z...	x + y + z + ...: The sum of the arguments of the function. Note that `plus` is an n-ary function taking 0 or more arguments, and that `plus()` returns `0`.
`times`	x, y, z...	x * y * z * ...: The product of the arguments of the function. Note that `times` is an n-ary function taking 0 or more arguments, and that `times()` returns `1`.
`minus`	x, y	x – y.
`divide`	x, y	x / y.

Parsing of the various MathML functions and constants is case-insensitive by default: function names such as cos, Cos and COS are all parsed as the MathML cosine operator, <cos/>. However, when a Model object is used in conjunction with either SBML_parseL3FormulaWithModel() or SBML_parseL3FormulaWithSettings(), any identifiers found in that model will be parsed in a case-sensitive way. For example, if a model contains a Species having the identifier Pi, the parser will parse "Pi" in the input as "<ci> Pi </ci>" but will continue to parse the symbols "pi" and "PI" as "<pi/>".

As mentioned above, the manner in which the "L3" versions of the formula parser and formatter interpret the function "log" can be changed. To do so, callers should use the function SBML_parseL3FormulaWithSettings() and pass it an appropriate L3ParserSettings object. By default, unlike the SBML Level 1 parser implemented by SBML_parseFormula(), the string "log" is interpreted as the base 10 logarithm, and not as the natural logarithm. However, you can change the interpretation to be the base 10 logarithm, the natural logarithm, or an error; because the name "log" by itself is ambiguous, treating it as an error lets you require that formulas use log10 or ln instead, which are unambiguous. Please refer to SBML_parseL3FormulaWithSettings().

In addition, the following symbols will be translated to their MathML equivalents, if no symbol with the same SId identifier string exists in the Model object provided:

Mathematical symbols defined in the "Level 3" text-string formula syntax.
Name	Meaning	MathML
`true`	Boolean value `true`	`<true/>`
`false`	Boolean value `false`	`<false/>`
`pi`	Mathematical constant pi	`<pi/>`
`avogadro`	Value of Avogadro's constant stipulated by SBML	`<csymbol encoding="text" definitionURL="http://www.sbml.org/sbml/symbols/avogadro"> avogadro </csymbol>`
`time`	Simulation time as defined in SBML	`<csymbol encoding="text" definitionURL="http://www.sbml.org/sbml/symbols/time"> time </csymbol>`
`inf`, `infinity`	Mathematical constant "infinity"	`<infinity/>`
`nan`, `notanumber`	Mathematical concept "not a number"	`<notanumber/>`

Again, as mentioned above, whether the string "avogadro" is parsed as an AST node of type AST_NAME_AVOGADRO or AST_NAME is configurable; use the version of the parser function called SBML_parseL3FormulaWithSettings(). This Avogadro-related functionality is provided because SBML Level 2 models may not use AST_NAME_AVOGADRO AST nodes.

Methods for working directly with libSBML's Abstract Syntax Trees

While it is convenient to read and write mathematical expressions in the form of text strings, advanced applications usually need more powerful ways of creating, traversing, and modifying mathematical formulas. For this reason, libSBML provides a rich API for interacting with ASTs directly. This section summarizes these facilities; for more information, readers should consult the documentation for the ASTNode class.

Every ASTNode has an associated type code to indicate whether, for example, it holds a number or stands for an arithmetic operator. The type is recorded as a value drawn from the enumeration ASTNodeType_t. The list of possible types is quite long, because it covers all the mathematical functions that are permitted in SBML. The values are shown in the following table:

AST_CONSTANT_E	AST_FUNCTION_CSC	AST_LOGICAL_AND
AST_CONSTANT_FALSE	AST_FUNCTION_CSCH	AST_LOGICAL_IMPLIES²
AST_CONSTANT_PI	AST_FUNCTION_DELAY	AST_LOGICAL_NOT
AST_CONSTANT_TRUE	AST_FUNCTION_EXP	AST_LOGICAL_OR
AST_DIVIDE	AST_FUNCTION_FACTORIAL	AST_LOGICAL_XOR
AST_FUNCTION	AST_FUNCTION_FLOOR	AST_MINUS
AST_FUNCTION_ABS	AST_FUNCTION_LN	AST_NAME
AST_FUNCTION_ARCCOS	AST_FUNCTION_LOG	AST_NAME_AVOGADRO¹
AST_FUNCTION_ARCCOSH	AST_FUNCTION_MAX²	AST_NAME_TIME
AST_FUNCTION_ARCCOT	AST_FUNCTION_MIN²	AST_ORIGINATES_IN_PACKAGE²
AST_FUNCTION_ARCCOTH	AST_FUNCTION_PIECEWISE	AST_PLUS
AST_FUNCTION_ARCCSC	AST_FUNCTION_POWER	AST_POWER
AST_FUNCTION_ARCCSCH	AST_FUNCTION_QUOTIENT²	AST_RATIONAL
AST_FUNCTION_ARCSEC	AST_FUNCTION_RATE_OF²	AST_REAL
AST_FUNCTION_ARCSECH	AST_FUNCTION_REM²	AST_REAL_E
AST_FUNCTION_ARCSIN	AST_FUNCTION_ROOT	AST_RELATIONAL_EQ
AST_FUNCTION_ARCSINH	AST_FUNCTION_SEC	AST_RELATIONAL_GEQ
AST_FUNCTION_ARCTAN	AST_FUNCTION_SECH	AST_RELATIONAL_GT
AST_FUNCTION_ARCTANH	AST_FUNCTION_SIN	AST_RELATIONAL_LEQ
AST_FUNCTION_CEILING	AST_FUNCTION_SINH	AST_RELATIONAL_LT
AST_FUNCTION_COS	AST_FUNCTION_TAN	AST_RELATIONAL_NEQ
AST_FUNCTION_COSH	AST_FUNCTION_TANH	AST_TIMES
AST_FUNCTION_COT	AST_INTEGER	AST_UNKNOWN
AST_FUNCTION_COTH	AST_LAMBDA
¹ (Level 3 only)
² (Level 3 Version 2+ only)

The types have the following meanings:

If the node is a basic mathematical operator (e.g., "+"), then the node's type will be AST_PLUS, AST_MINUS, AST_TIMES, AST_DIVIDE, or AST_POWER, as appropriate.

If the node is a predefined function or operator from SBML Level 1 (in the string-based formula syntax used in Level 1) or SBML Level 2 and 3 (in the subset of MathML used in SBML Levels 2 and 3), then the node's type will be either AST_FUNCTION_X, AST_LOGICAL_X, or AST_RELATIONAL_X, as appropriate. (Examples: AST_FUNCTION_LOG, AST_RELATIONAL_LEQ.)

If the node refers to a user-defined function, the node's type will be AST_FUNCTION (because it holds the name of the function).

If the node is a lambda expression, its type will be AST_LAMBDA.

If the node is a predefined constant ("ExponentialE", "Pi", "True" or "False"), then the node's type will be AST_CONSTANT_E, AST_CONSTANT_PI, AST_CONSTANT_TRUE, or AST_CONSTANT_FALSE.

(Levels 2 and 3 only) If the node is the special MathML csymbol time, the type of the node will be AST_NAME_TIME. (Note, however, that the MathML csymbol delay is translated into a node of type AST_FUNCTION_DELAY. The difference is due to the fact that time is a single variable, whereas delay is actually a function taking arguments.)

(Level 3 only) If the node is the special MathML csymbol avogadro, the type of the node will be AST_NAME_AVOGADRO.

(Level 3 Version 2+ only) If the node is the special MathML csymbol rateOf, the type of the node will be AST_FUNCTION_RATE_OF.

(Level 3 Version 2+ only) If the node is a MathML operator that originates in a package, it is included in the ASTNodeType_t list, but may not be legally used in an SBML document that does not include that package. This includes the node types from the 'Distributions' package (AST_DISTRIB_FUNCTION_NORMAL, AST_DISTRIB_FUNCTION_UNIFORM, etc.), and elements from MathML that were not included in core.

If the node contains a numerical value, its type will be AST_INTEGER, AST_REAL, AST_REAL_E, or AST_RATIONAL, as appropriate.

There are a number of methods for interrogating the type of an ASTNode and for testing whether a node belongs to a general category of constructs. The methods on ASTNode for this purpose are the following:

ASTNodeType_t getType() returns the type of this AST node.
bool isConstant() returns true if this AST node is a MathML constant (true, false, pi, exponentiale), false otherwise.
bool isBoolean() returns true if this AST node returns a Boolean value (by being either a logical operator, a relational operator, or the constant true or false).
bool isFunction() returns true if this AST node is a function (i.e., a MathML defined function such as exp or else a function defined by a FunctionDefinition in the Model).
bool isInfinity() returns true if this AST node is the special IEEE 754 value infinity.
bool isInteger() returns true if this AST node is holding an integer value.
bool isNumber() returns true if this AST node is holding any number.
bool isLambda() returns true if this AST node is a MathML lambda construct.
bool isLog10() returns true if this AST node represents the log10 function, specifically, that its type is AST_FUNCTION_LOG and it has two children, the first of which is an integer equal to 10.
bool isLogical() returns true if this AST node is a logical operator (and, or, not, xor).
bool isName() returns true if this AST node is a user-defined name or (in SBML Levels 2 and 3) one of the two special csymbol constructs "delay" or "time".
bool isNaN() returns true if this AST node has the special IEEE 754 value "not a number" (NaN).
bool isNegInfinity() returns true if this AST node has the special IEEE 754 value of negative infinity.
bool isOperator() returns true if this AST node is an operator (e.g., +, -, etc.).
bool isPiecewise() returns true if this AST node is the MathML piecewise function.
bool isRational() returns true if this AST node is a rational number having a numerator and a denominator.
bool isReal() returns true if this AST node is a real number (specifically, AST_REAL, AST_REAL_E or AST_RATIONAL).
bool isRelational() returns true if this AST node is a relational operator.
bool isSqrt() returns true if this AST node is the square-root operator.
bool isUMinus() returns true if this AST node is a unary minus.
bool isUnknown() returns true if this AST node's type is unknown.

Programs manipulating AST node structures should check the type of a given node before calling methods that return a value from the node. The following are the ASTNode object methods available for returning values from nodes:

long getInteger()
char getCharacter()
const char* getName()
long getNumerator()
long getDenominator()
double getReal()
double getMantissa()
long getExponent()
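
The advice to check a node's type before reading its value can be illustrated with a small, self-contained sketch. NumKind, ToyNumber, toDouble and the make... helpers are hypothetical names invented here and are not libSBML's API. The sketch also assumes that a mantissa/exponent pair denotes mantissa × 10^exponent and that a rational denotes numerator / denominator.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical type tags named after the node types mentioned in the text.
enum class NumKind { Integer, Real, RealE, Rational, Name };

// Toy value holder: which fields are meaningful depends on `kind`.
struct ToyNumber {
    NumKind kind = NumKind::Name;
    long integer = 0;       // used when kind == Integer
    double real = 0.0;      // used when kind == Real
    double mantissa = 0.0;  // used when kind == RealE
    long exponent = 0;      // used when kind == RealE
    long numerator = 0;     // used when kind == Rational
    long denominator = 1;   // used when kind == Rational
};

ToyNumber makeInteger(long v) {
    ToyNumber n; n.kind = NumKind::Integer; n.integer = v; return n;
}
ToyNumber makeRealE(double mantissa, long exponent) {
    ToyNumber n; n.kind = NumKind::RealE;
    n.mantissa = mantissa; n.exponent = exponent; return n;
}
ToyNumber makeRational(long numerator, long denominator) {
    ToyNumber n; n.kind = NumKind::Rational;
    n.numerator = numerator; n.denominator = denominator; return n;
}

// Check the kind first, then read only the fields that belong to it.
double toDouble(const ToyNumber& n) {
    switch (n.kind) {
        case NumKind::Integer:
            return static_cast<double>(n.integer);
        case NumKind::Real:
            return n.real;
        case NumKind::RealE:
            // Assumption: value = mantissa * 10^exponent.
            return n.mantissa * std::pow(10.0, static_cast<double>(n.exponent));
        case NumKind::Rational:
            if (n.denominator == 0) throw std::domain_error("zero denominator");
            return static_cast<double>(n.numerator) /
                   static_cast<double>(n.denominator);
        default:
            throw std::invalid_argument("node does not hold a number");
    }
}
```

With real ASTNode objects the same pattern would apply: test the node with a predicate such as isInteger() or isRational() first, then call only the getters that match.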

Of course, all of this would be of little use if libSBML didn't also provide methods for setting the values of AST node objects! And it does. The methods are the following:

void setCharacter(char value) sets the value of this ASTNode to the given character value. If value is one of +, -, *, / or ^, the node type will be set to the appropriate operator type. For all other characters, the node type will be set to AST_UNKNOWN.
void setName(const char *name) sets the value of this AST node to the given name. The node type will be set (to AST_NAME) only if the AST node was previously an operator (isOperator(node) != 0) or number (isNumber(node) != 0). This allows names to be set for AST_FUNCTIONs and the like.
void setValue(int value) sets the value of the node to the given integer value. Equivalent to the next method.
void setValue(long value) sets the value of the node to the given integer value. Equivalent to the previous method. No, this is not a Gödelian self-referential loop.
void setValue(long numerator, long denominator) sets the value of this ASTNode to the given rational value in two parts: the numerator and denominator. The node type is set to AST_RATIONAL.
void setValue(double value) sets the value of this ASTNode to the given real (double) value and sets the node type to AST_REAL.
void setValue(double mantissa, long exponent) sets the value of this ASTNode to a real (double) using the two parts given: the mantissa and the exponent. The node type is set to AST_REAL_E.
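
The character-to-type rule described for setCharacter() amounts to a small lookup. The sketch below shows that mapping in isolation; OpType and typeForCharacter are hypothetical names used only for illustration, and the enum is not libSBML's ASTNodeType_t.

```cpp
// Hypothetical type tags for illustration; not libSBML's ASTNodeType_t.
enum class OpType { Plus, Minus, Times, Divide, Power, Unknown };

// The five operator characters each get an operator type;
// every other character maps to Unknown.
OpType typeForCharacter(char value) {
    switch (value) {
        case '+': return OpType::Plus;
        case '-': return OpType::Minus;
        case '*': return OpType::Times;
        case '/': return OpType::Divide;
        case '^': return OpType::Power;
        default:  return OpType::Unknown;
    }
}
```

A character outside the five operators, such as '%', falls through to the Unknown case, mirroring the AST_UNKNOWN behavior described above.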

Finally, ASTNode also defines some miscellaneous methods for manipulating ASTs:

ASTNode* ASTNode(ASTNodeType_t type) creates a new ASTNode object and returns a pointer to it. The returned node will have the given type, or a type of AST_UNKNOWN if no argument type is explicitly given or the type code is unrecognized.
unsigned int getNumChildren() returns the number of children of this AST node or 0 if this node has no children.
void addChild(ASTNode* child) adds the given node as a child of this AST node. Child nodes are added in left-to-right order.
void prependChild(ASTNode* child) adds the given node as a child of this AST node. This method adds child nodes in right-to-left order.
ASTNode* getChild(unsigned int n) returns the nth child of this AST node or NULL if this node has no nth child [i.e., if n >= node->getNumChildren(), where node is a pointer to a node].
ASTNode* getLeftChild() returns the left child of this AST node. This is equivalent to getChild(0).
ASTNode* getRightChild() returns the right child of this AST node or NULL if this node has no right child.
void swapChildren(ASTNode *that) swaps the children of this ASTNode with the children of that ASTNode.
void setType(ASTNodeType_t type) sets the type of this ASTNode to the given ASTNodeType_t enumeration value.
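
The difference between addChild() and prependChild(), and the out-of-range behavior of getChild(), can be seen in a toy tree. ToyTree, buildExample and childLabels are hypothetical names for this sketch and do not come from libSBML. The use of std::unique_ptr for child ownership is likewise an assumption of the sketch, not a statement about how libSBML manages memory.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Toy tree node (illustration only); children are kept in left-to-right order.
struct ToyTree {
    std::string label;
    std::deque<std::unique_ptr<ToyTree>> children;

    explicit ToyTree(std::string l) : label(std::move(l)) {}

    // Appends on the right, so repeated calls build the list left to right.
    void addChild(std::unique_ptr<ToyTree> c) { children.push_back(std::move(c)); }

    // Inserts on the left, so repeated calls build the list right to left.
    void prependChild(std::unique_ptr<ToyTree> c) { children.push_front(std::move(c)); }

    // Returns the nth child, or nullptr when n is out of range.
    const ToyTree* getChild(std::size_t n) const {
        return n < children.size() ? children[n].get() : nullptr;
    }

    // Exchanges the child lists of the two nodes.
    void swapChildren(ToyTree& that) { children.swap(that.children); }
};

// Appends "2" and "3", then prepends "1".
ToyTree buildExample() {
    ToyTree plus("+");
    plus.addChild(std::make_unique<ToyTree>("2"));
    plus.addChild(std::make_unique<ToyTree>("3"));
    plus.prependChild(std::make_unique<ToyTree>("1"));
    return plus;
}

// Joins the children's labels with single spaces, left to right.
std::string childLabels(const ToyTree& t) {
    std::string out;
    for (const auto& c : t.children) {
        if (!out.empty()) out += ' ';
        out += c->label;
    }
    return out;
}
```

Here the children of the node returned by buildExample() read "1 2 3" from left to right, and getChild(3) yields a null pointer because only indices 0 through 2 exist.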

Reading and Writing MathML directly

: As mentioned above, applications often can avoid working with raw MathML by using either libSBML's text-string interface or the AST API. However, when needed, reading MathML content directly and creating ASTs is easily done in libSBML using a method designed for this purpose:

ASTNode_t* readMathMLFromString() reads raw MathML from a text string, constructs an AST from it, then returns the root ASTNode of the resulting expression tree.

Similarly, writing out Abstract Syntax Tree structures is easily done using the following method:

char* writeMathMLToString() writes an AST to a string. The caller owns the character string returned and should free it after it is no longer needed.

The example program given above demonstrates the use of these methods.
