Opened 8 years ago

Closed 7 years ago

#4453 closed defect (fixed)

Compilation Time of a model with large size arrays

Reported by: Christos Theodosiou Owned by: Martin Sjölund
Priority: high Milestone: 1.12.0
Component: Backend Version:
Keywords: Cc: Francesco Casella, Lennart Ochel

Description

We have tried the following items:
1) Making a function for large array equations is slow (Pac_OnlyFunction)
2) Making an External function(c) for large array equations is fast (Pac_External_C_function)

We seek a way to do everything in OpenModelica.

Attachments (6)

Pac_OnlyFunction.zip (6.0 KB ) - added by Christos Theodosiou 8 years ago.
Pac_External_C_function.zip (3.9 KB ) - added by Christos Theodosiou 8 years ago.
Execution_Compiling_time.xlsx (14.0 KB ) - added by Christos Theodosiou 8 years ago.
main.mos (335 bytes ) - added by Christos Theodosiou 8 years ago.
callgrind.out.14520 (3.5 MB ) - added by Christos Theodosiou 8 years ago.
callgraph.dot.ps (13.8 KB ) - added by Christos Theodosiou 8 years ago.

Change History (19)

by Christos Theodosiou, 8 years ago

Attachment: Pac_OnlyFunction.zip added

by Christos Theodosiou, 8 years ago

Attachment: Pac_External_C_function.zip added

by Christos Theodosiou, 8 years ago

by Christos Theodosiou, 8 years ago

Attachment: main.mos added

by Christos Theodosiou, 8 years ago

Attachment: callgrind.out.14520 added

by Christos Theodosiou, 8 years ago

Attachment: callgraph.dot.ps added

comment:1 by Christos Theodosiou, 8 years ago

Uploaded main.mos the test script.
Also uploaded valgrind profiling results.

It seems that:

  1. About 23% of the run time is consumed in valueEq (called ~3e8 times) which is actually a thin wrapper of valueCompare. The valueCompare recursively calls its self about 4.6e9 times.
  2. The gc garbage collector seems to consume ~38% of the run time

comment:2 by Martin Sjölund, 7 years ago

Almost 50% of translation time is generating the equations:

Notification: Performance of Templates: time 56.31/126.1, allocations: 7.939 GB / 37.29 GB, free: 369.3 MB / 2.196 GB

And the generated code is mostly:

  put_real_matrix_element(data->localData[0]->realVars[1482] /* initialXCoordinateValue_21._y[6,2] variable */, 5, 1, &tmp35);
  put_real_matrix_element(data->localData[0]->realVars[1483] /* initialXCoordinateValue_21._y[6,3] variable */, 5, 2, &tmp35);
  put_real_matrix_element(data->localData[0]->realVars[1484] /* initialXCoordinateValue_21._y[6,4] variable */, 5, 3, &tmp35);

Not expanding these arrays should give a big improvement in performance of the compiler. There is the compiler flag --vectorizationLimit=n, but setting it to 2 only improved performance by ~33% and didn't seem to change the end result much (the limit mostly affects the frontend and somewhere in the backend the variable seems to be expanded again).

comment:3 by Martin Sjölund, 7 years ago

In the backend, this expansion is done by removeSimpleEquations. The new experimental version of that module fails on this model...

  • -v=2 --removeSimpleEquations=none: 5.2s
  • -v=2 --removeSimpleEquations=default: 89.98s
  • -v=2 --removeSimpleEquations=causal: 47.85s
  • -v=2 --removeSimpleEquations=fastAcausal: 89.69s
  • -v=2 --removeSimpleEquations=allAcausal: 91.09s
  • -v=2 --removeSimpleEquations=new: fail

I guess it could be possible to detect the vectorization limit and stop the expansion from happening in removeSimpleEquations. But if you just want to run your models faster, use a small vectorization limit (like -v=2) and --removeSimpleEquations=none. This might cause some models to grow in size/execution time, but it will not expand this case and generates smaller code (~68MB to ~356 kB for the initialization problem).

comment:4 by Martin Sjölund, 7 years ago

Cc: Francesco Casella Lennart Ochel added
Owner: changed from somebody to Martin Sjölund
Status: newassigned

comment:5 by Martin Sjölund, 7 years ago

I also think the vectorization limit has a bug somewhere and it checks each dimension individually. If you set -v=36 on an 18x36 matrix, it vectorizes the entire thing. You have to set the limit lower or equal to the greatest dimension in order to stop the expansion; I would have assumed lower or equal than the product of all dimensions...

comment:6 by Francesco Casella, 7 years ago

The issue with removeSimpleEquations is well-known; ticket #3695 was opened one and a half years ago, but there's been no progress since then, and I understand Willi will have other priorities for many months to come.

Is anybody else out there willing to take over the completion of the new removeSimpleEquation module, possibly under Willi's directions? I'm not sure how much work is left, but maybe he or Bernhard can comment about that.

comment:7 by Martin Sjölund, 7 years ago

I found one place where I can fix the expansion of functions greater than some size in removeSimpleEquations and then I found a replacement from the original variable to an alias that seems wrong: it has something like:ch

x=y;

With replacements for x[1]...x[n] -> y[1]...y[n] which causes the entire thing to expand since there is no replacement rule for x -> y...

comment:8 by Martin Sjölund, 7 years ago

The problem with my fix to not expand equations is that it breaks other models that require the equations to be scalarized...

My fix to collapse equations like {x[1],x[2],x[3]} into x seems to work a bit better though. And without the above fix still improves translation time to 1/3 of the original.

I think what's needed is just detecting:

x = f(...); // Array equation

And simply not expanding it into:

x[1] = f(...)[1];
x[...] = f(...)[...];
x[n] = f(...)[n];

(Which is the later on treated with common subexpression elimination)

in reply to:  8 comment:9 by anonymous, 7 years ago

Replying to sjoelund.se:
Has you changes some impact of
#4129?

comment:10 by Martin Sjölund, 7 years ago

It might have an effect on #4129 in some cases. I can't push all my fixes for this because the code generator cannot handle everything (and I don't have time to fix it). For example, x[7].y has to be expanded to {x[7].y[1], ... because otherwise incorrect offsets are generated. Also, in #4129 the equations have been expanded and many multibody models seem to depend on this or unsolvable strong connected components are generated. I thought #4129 would sort of be resolved by CSE (resolveLoops) though.

I will take a look at #4129 when I am done with the things I can fix in this ticket though.

comment:11 by Martin Sjölund, 7 years ago

I might have to revert the PR due to the regression testing. It broke C++ runtime (fixable), but also... there is a problem with dummy states. {x[1], ..., x[n]} might be an array where x[1] is a dummy state and not stored in the same memory location as the other variables.

comment:12 by Martin Sjölund, 7 years ago

The thing with dummy states is handled by expanding the array again if necessary during codegen

comment:13 by Martin Sjölund, 7 years ago

Component: *unknown*Backend
Milestone: Future1.12.0
Resolution: fixed
Status: assignedclosed
Note: See TracTickets for help on using tickets.