wiki:WritingEfficientMetaModelica

Writing efficient, readable, and maintainable MetaModelica code

In the bootstrapped compiler, together with the extended MetaModelica/Modelica there are some simple rules to write efficient, readable, and maintainable code. The RML implementation is a lot more restrictive and is an obstacle especially when it comes to writing readable and maintainable code.

Use builtin functions and operators whenever possible

Builtin functions have implementations that are better than you can achieve using MetaModelica code (for example: stringAppendList and stringDelimitList only use a single memory allocation).

It is possible to use many constructs from the Modelica language, such as list comprehensions, partial function application, if-expressions, if-statements, for-loops and while-loops.

List reductions (like Modelica array reductions) can be used to avoid the extra listReverse at the end of functions written by the user. They are also easier to read. Example:

// List reduction, creates 1 list
list(2*x*y for x guard x>0 in lst);
// RML style, creates 4 lists including the listReverse operations
List.map1(List.filter(lst, isPositive), realMul, 2.0*y);

If you need to loop over a list, use a for- or while-loop rather than writing auxiliary tail-recursive functions (to save time writing and maintaining code):

for x in lst loop
  // ...
end for;

Use if-expressions or if-statements to conditionally do things in code that is very similar:

// RML-style
case PAT1()
  algorithm
    f(x);
    true = g(y);
    h(z);
    i(x);
  then ();
case PAT1()
  algorithm
    f(x);
    false = g(y);
    i(x);
  then ();
// OMC style
case PAT1()
  algorithm
    f(x);
    if g(y) then
      h(z);
    end if;
    i(x);
  then ();

Instead of being required to create new functions in order to pass higher-order functions in function calls, consider using partial function application (closures) instead:

List.exist(exps, function Expression.expEqual(e2=exp)); // Bootstrapping
List.exist1(exps, Expression.expEqual, exp); // RML; requires additional functions to be maintained in List.mo

try/catch

Use try/catch syntax for doing stuff like the following:

// RML-style
algorithm
  _ := matchcontinue()
    case () equation
      // Do something which might fail.
    then ();
    
    else equation
      // Do something else.
    then ();
  end matchcontinue;
// OMC style
algorithm
  try
    // Do something which might fail.
  else
    // Do something else.
  end try;

Try/else is implemented as syntactic sugar. It is executed as if it was the above matchcontinue statement, but is shorter to write.

Assign to metarecord fields directly

MetaModelica uniontype records are a bit different than Modelica records: they are immutable, so you can't just assign to them. Especially since uniontypes may have several records. It is possible to list all fields and create a new record. But it is also possible to match a uniontype to become a metarecord (sadly not in assignments yet, though). Using this, it is possible to use dot-notation to access a field (member) of the record, and also to assign to it (given that it is not a function input).

The following pattern seems to work well, and produces code that is easy to maintain (adding a field to PROGRAM will not require you to update this code). Assigning to two different fields will cause two allocations of the record though, so there is a cost of using this (for assigning to a single field, there is an optimized implementation that is faster than retrieving all the fields and calling the record constructor).

function removeInnerDiffFiledClasses
  input Absyn.Program inProgram;
  output Absyn.Program p := inProgram;
algorithm
  p := match p
    case Absyn.PROGRAM()
      equation
        p.classes = List.map(p.classes, removeInnerDiffFiledClass);
      then p;
  end match;
end removeInnerDiffFiledClasses;

Avoid matchcontinue, use tail recursion

matchcontinue is slow. Whenever possible, use match instead. This avoids calling setjmp.

Avoid using the construct case x equation true = fn(x); then (); case x equation false = fn(x); then (); since this requires matchcontinue. The bootstrapped compiler will not merge the two cases into one and algorithms that ran in linear time using RML might run in quadratic time using the bootstrapped compiler if you do this. It also precludes optimisations such as tail recursion because you use matchcontinue instead of match. Use case x guard fn(x) then () instead; it is possible to use this with match.

Also, write tail recursive functions: if a function calls itself it should do that as last thing in the then part or in the last statement. You are required to bind all outputs in the same order as the function outputs or tail recursion does not work (no wildcards ignoring one output). matchcontinue expressions can never be tail-recursive.

Switch optimization

If possible, rewrite match-expressions to use switch instead (+d=patternmAllInfo shows which expressions are optimised to switch). A switch needs to have one pattern that can be uniquely switched on a uniontype, Integer, or String type. If the last pattern is a default pattern, this pattern has to be the only one that is matched against (there may exist more, but they may only be patterns that cannot fail to match). If the last pattern is a default pattern, all uniontypes matched against in previous cases must have subpatterns that cannot fail to match. If all these restrictions are fulfilled, the match-expression avoids linear search of the patterns.

Inlining functions

Inlining functions could be used to great effect, but it currently interferes with separate compilation so in practice it will not work at the moment. Inlining calls across packages does not work due to the separate compilation scheme. Link-time optimizations in gcc/clang could remove function calls, but it probably will not.

Avoid memory allocations

Memory allocations are expensive. When optimizing the OMCC lexer generator it was possible to get speed similar to the ANTLR C version due to smarter algorithms avoiding memory allocations. That said, memory allocations are not incredibly expensive; use them when needed. But if you have the choice of sending arguments as a tuple or as 4 separate arguments, allocation of the tuple might be the lion's share of execution time depending on what the function does.

Traversal routines can be rewritten for better performance if they do not create tuples all the time for example.

Think about build system performance

Compiling efficiently depends a lot on the public interfaces of packages. If you do not intend for a function to be called from other packages, make it protected.

Tail recursion

Note that tail recursion optimization is done by omc, not gcc. As such, -O0 and -O3 will generate the same function calls. But the stack usage is very different due to local optimizations within the function in C. If you use -O0 to debug something, you might trigger a stack overflow in a function that you thought was tail recursive but is actually not (-O3 just made the frames smaller so you could iterate over more elements). If this happens and you need -O0 to debug all variables, you need to first fix the function that triggers the stack overflow with -O0 or increase the stack size (simple on Linux; does not require re-compilation).

Last modified 10 years ago Last modified on 2014-11-03T11:07:24+01:00