Version 5 (modified by 11 years ago) ( diff ) | ,
---|
Writing efficient MetaModelica code
In the bootstrapped compiler, together with the extended MetaModelica/Modelica things you can use there are some restrictions:
- avoid
matchcontinue
as much as possible and usematch
instead, combined with tail recursion if needed (functions where you expect to maybe iterate over more than 50 elements should be tail-recursive). - If possible, rewrite
match
-expressions to use switch instead (+d=patternmAllInfo shows which expressions are optimised to switch). A switch needs to have one pattern that can be uniquely switched on a uniontype, Integer, or String type. If the last pattern is a default pattern, this pattern has to be the only one that is matched against (there may exist more, but they may only be patterns that cannot fail to match). If the last pattern is a default pattern, all uniontypes matched against in previous cases must have subpatterns that cannot fail to match. If all these restrictions are fulfilled, the match-expression avoids linear search of the patterns. - Write tail recursive functions: if a function calls itself it should do that as last thing in the then part or in the last statement. You are required to bind all outputs in the same order as the function outputs or tail recursion does not work (so no wildcards ignoring one output).
matchcontinue
expressions can never be tail-recursive. - Inlining functions could be used to great effect, but it currently interferes with separate compilation so in practice it will not work at the moment.
- Use builtin functions whenever possible: they have implementations that are better than you can achieve using MetaModelica code (for example: stringAppendList and stringDelimitList only use a single memory allocation, list reduce,map, and filter using the built-in operator avoid the extra listReverse)
- Avoid using the construct
case x equation true = fn(x); then (); case x equation false = fn(x); then ();
. The bootstrapped compiler will not merge the two cases into one and algorithms that ran in linear time using RML might run in quadratic time using the bootstrapped compiler if you do this. It also precludes optimisations such as tail recursion because you usematchcontinue
instead ofmatch
. Usecase x guard fn(x) then ()
instead; it is possible to use this withmatch
. - Memory allocations are expensive. When optimizing the OMCC lexer generator it was possible to get speed similar to the ANTLR C version due to smarter algorithms avoiding memory allocations. That said, memory allocations are not incredibly expensive; use them when needed. But if you have the choice of sending arguments as a tuple or as 4 separate arguments, allocation of the tuple might be the lion's share of execution time depending on what the function does. Traversal routines can be rewritten for better performance if they do not create tuples all the time for example.
- Compiling efficiently depends a lot on the public interfaces of packages. If you do not intend for a function to be called from other packages, make it protected.
- Note that inlining calls across packages does not work due to the separate compilation scheme. Link-time optimizations in gcc/clang could remove function calls, but it probably will not.
- Note that tail recursion optimization is done by omc, not gcc. As such, -O0 and -O3 will generate the same function calls. But the stack usage is very different due to local optimizations within the function in C. If you use -O0 to debug something, you might trigger a stack overflow in a function that you thought was tail recursive but is actually not (-O3 just made the frames smaller so you could iterate over more elements). If this happens and you need -O0 to debug all variables, you need to first fix the function that triggers the stack overflow with -O0 or increase the stack size (simple on Linux; does not require re-compilation).
Note:
See TracWiki
for help on using the wiki.