Changes between Version 4 and Version 5 of WritingEfficientMetaModelica


Timestamp:
2014-08-26T07:07:56Z (11 years ago)
Author:
Martin Sjölund
Comment:

--

  • WritingEfficientMetaModelica

    v4 v5  
    8  8  1. Use built-in functions whenever possible: their implementations are more efficient than what you can achieve in MetaModelica code. For example, {{{stringAppendList}}} and {{{stringDelimitList}}} use only a single memory allocation, and list reduce, map, and filter written with the built-in operator avoid the extra {{{listReverse}}}.
    9  9  1. Avoid using the construct {{{case x equation true = fn(x); then (); case x equation false = fn(x); then ();}}}. The bootstrapped compiler will not merge the two cases into one, so algorithms that ran in linear time using RML might run in quadratic time using the bootstrapped compiler if you do this. It also precludes optimisations such as tail recursion, because it forces you to use {{{matchcontinue}}} instead of {{{match}}}. Use {{{case x guard fn(x) then ()}}} instead; this form can be used with {{{match}}}.
       10 1. Memory allocations are expensive. When optimizing the OMCC lexer generator, it was possible to get speed similar to the ANTLR C version thanks to smarter algorithms that avoid memory allocations. That said, memory allocations are not incredibly expensive; use them when needed. But if you have the choice of passing arguments as a tuple or as 4 separate arguments, allocating the tuple might take the lion's share of the execution time, depending on what the function does. For example, traversal routines can be rewritten for better performance so that they do not create a tuple on every call.
       11 1. Compiling efficiently depends a lot on the public interfaces of packages. If you do not intend for a function to be called from other packages, make it protected.
       12 1. Note that inlining calls across packages does not work due to the separate compilation scheme. Link-time optimizations in gcc/clang could remove function calls, but they probably will not.
       13 1. Note that tail recursion optimization is done by omc, not gcc. As such, -O0 and -O3 will generate the same function calls. But the stack usage is very different due to local optimizations within each function in C. If you use -O0 to debug something, you might trigger a stack overflow in a function that you thought was tail recursive but actually is not (-O3 just made the frames smaller, so you could iterate over more elements). If this happens and you need -O0 to debug all variables, you must either first fix the function that triggers the stack overflow with -O0 or increase the stack size (simple on Linux; does not require re-compilation).
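
The advice about {{{guard}}} versus paired {{{true = fn(x)}}} / {{{false = fn(x)}}} cases can be illustrated with a small sketch. The function names {{{isEven}}}, {{{countEvenSlow}}} and {{{countEven}}} are hypothetical and exist only for this illustration; the first counting function shows the pattern to avoid, the second the suggested replacement.

{{{
// Hypothetical predicate used by both versions below.
function isEven
  input Integer i;
  output Boolean b;
algorithm
  b := intMod(i, 2) == 0;
end isEven;

// Pattern to avoid: needs matchcontinue, and isEven(x) is evaluated
// again in the third case whenever the second case fails.
function countEvenSlow
  input list<Integer> lst;
  input Integer acc;
  output Integer count;
algorithm
  count := matchcontinue lst
    local
      Integer x;
      list<Integer> rest;
    case {} then acc;
    case x :: rest
      equation
        true = isEven(x);
      then countEvenSlow(rest, acc + 1);
    case x :: rest
      equation
        false = isEven(x);
      then countEvenSlow(rest, acc);
  end matchcontinue;
end countEvenSlow;

// Suggested form: the guard is part of the case, so plain match is
// enough and both recursive calls are in tail position.
function countEven
  input list<Integer> lst;
  input Integer acc;
  output Integer count;
algorithm
  count := match lst
    local
      Integer x;
      list<Integer> rest;
    case {} then acc;
    case x :: rest guard isEven(x) then countEven(rest, acc + 1);
    case _ :: rest then countEven(rest, acc);
  end match;
end countEven;
}}}

A call such as {{{countEven({1, 2, 3, 4}, 0)}}} starts the accumulator at zero; the last case relies on the guard of the preceding case having failed, so the predicate is never tested twice.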
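
For the item on built-in functions, the following is a minimal sketch of what relying on the built-ins might look like. {{{stringDelimitList}}} is named in the text above; the function names {{{joinNames}}}, {{{doubleAll}}}, {{{keepPositive}}} and {{{total}}} are hypothetical, and the comprehension syntax (in particular the {{{guard}}} filter) is assumed to be the one accepted by current versions of omc.

{{{
// Join strings with a delimiter in one built-in call instead of
// appending pairwise in a hand-written loop.
function joinNames
  input list<String> names;
  output String str;
algorithm
  str := stringDelimitList(names, ", ");
end joinNames;

// Map: a list comprehension instead of a recursive helper that
// builds the result backwards and then calls listReverse.
function doubleAll
  input list<Integer> lst;
  output list<Integer> res;
algorithm
  res := list(2 * x for x in lst);
end doubleAll;

// Filter: the guard keeps only the elements that satisfy the condition
// (assumed syntax, see the note above).
function keepPositive
  input list<Integer> lst;
  output list<Integer> res;
algorithm
  res := list(x for x guard x > 0 in lst);
end keepPositive;

// Reduce: a built-in reduction over the list elements.
function total
  input list<Integer> lst;
  output Integer res;
algorithm
  res := sum(x for x in lst);
end total;
}}}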
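
The trade-off between a tuple argument and separate arguments, mentioned in the item on memory allocations, can be sketched as follows. Both function names are hypothetical; the sketch only shows where the allocation comes from, not how large the difference is in practice.

{{{
// Every caller has to allocate a tuple just to pass two integers.
function addPairTuple
  input tuple<Integer, Integer> inTpl;
  output Integer res;
protected
  Integer a, b;
algorithm
  (a, b) := inTpl;
  res := a + b;
end addPairTuple;

// Same computation with separate arguments: no tuple is allocated.
function addPairArgs
  input Integer a;
  input Integer b;
  output Integer res;
algorithm
  res := a + b;
end addPairArgs;
}}}

A call {{{addPairTuple((1, 2))}}} constructs the tuple at the call site, whereas {{{addPairArgs(1, 2)}}} passes the values directly. For a function this small, the allocation is likely to dominate the cost of the call.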
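
The item on public interfaces suggests keeping helpers protected. A minimal sketch, assuming a hypothetical package {{{ListLen}}} whose only intended entry point is {{{len}}}:

{{{
encapsulated package ListLen

// Public entry point: the only function other packages should call.
public function len
  input list<Integer> lst;
  output Integer n;
algorithm
  n := lenAcc(lst, 0);
end len;

// Accumulator helper: protected, so it is not part of the
// package's public interface.
protected function lenAcc
  input list<Integer> lst;
  input Integer acc;
  output Integer n;
algorithm
  n := match lst
    local
      list<Integer> rest;
    case {} then acc;
    case _ :: rest then lenAcc(rest, acc + 1);
  end match;
end lenAcc;

end ListLen;
}}}

Other packages would call {{{ListLen.len(lst)}}}; changes to {{{lenAcc}}} then stay internal to the package.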