Opened 7 years ago

Closed 7 years ago

#4453 closed defect (fixed)

Compilation Time of a model with large size arrays

Reported by: christos
Owned by: sjoelund.se
Priority: high
Milestone: 1.12.0
Component: Backend
Version:
Keywords:
Cc: casella, lochel

Description

We have tried the following:
1) Using a Modelica function for the large array equations is slow (Pac_OnlyFunction)
2) Using an external C function for the large array equations is fast (Pac_External_C_function)

We are looking for a way to do everything in OpenModelica.
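
For illustration, a minimal hypothetical sketch of the two approaches follows; the actual models are in the attached Pac_OnlyFunction.zip and Pac_External_C_function.zip, and the names, array size, and external C entry point below are made up:

  // Hypothetical sketch only; see the attachments for the real models.
  package PacSketch
    // Variant 1: plain Modelica function producing a large array
    // (the case reported as slow).
    function fillArray
      input Real u;
      output Real y[1000];
    algorithm
      for i in 1:size(y, 1) loop
        y[i] := u*i;
      end for;
    end fillArray;

    // Variant 2: the same computation delegated to external C code
    // (the case reported as fast).
    function fillArrayExternal
      input Real u;
      output Real y[1000];
      external "C" fill_array_c(u, y, size(y, 1));
    end fillArrayExternal;

    model Example
      Real y[1000];
    equation
      y = fillArray(time);
    end Example;
  end PacSketch;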

Attachments (6)

Pac_OnlyFunction.zip (6.0 KB) - added by christos 7 years ago.
Pac_External_C_function.zip (3.9 KB) - added by christos 7 years ago.
Execution_Compiling_time.xlsx (14.0 KB) - added by christos 7 years ago.
main.mos (335 bytes) - added by christos 7 years ago.
callgrind.out.14520 (3.5 MB) - added by christos 7 years ago.
callgraph.dot.ps (13.8 KB) - added by christos 7 years ago.

Change History (19)

comment:1 Changed 7 years ago by christos

Uploaded main.mos, the test script.
Also uploaded valgrind profiling results.

It seems that:

  1. About 23% of the run time is consumed in valueEq (called ~3e8 times), which is a thin wrapper around valueCompare; valueCompare calls itself recursively about 4.6e9 times.
  2. The garbage collector (GC) seems to consume ~38% of the run time.

comment:2 Changed 7 years ago by sjoelund.se

Almost 50% of the translation time is spent generating the equations:

Notification: Performance of Templates: time 56.31/126.1, allocations: 7.939 GB / 37.29 GB, free: 369.3 MB / 2.196 GB

And the generated code is mostly:

  put_real_matrix_element(data->localData[0]->realVars[1482] /* initialXCoordinateValue_21._y[6,2] variable */, 5, 1, &tmp35);
  put_real_matrix_element(data->localData[0]->realVars[1483] /* initialXCoordinateValue_21._y[6,3] variable */, 5, 2, &tmp35);
  put_real_matrix_element(data->localData[0]->realVars[1484] /* initialXCoordinateValue_21._y[6,4] variable */, 5, 3, &tmp35);

Not expanding these arrays should give a big improvement in compiler performance. There is the compiler flag --vectorizationLimit=n, but setting it to 2 only improved performance by ~33% and did not seem to change the end result much (the limit mostly affects the frontend, and the variables seem to be expanded again somewhere in the backend).

comment:3 Changed 7 years ago by sjoelund.se

In the backend, this expansion is done by removeSimpleEquations. The new experimental version of that module fails on this model...

  • -v=2 --removeSimpleEquations=none: 5.2s
  • -v=2 --removeSimpleEquations=default: 89.98s
  • -v=2 --removeSimpleEquations=causal: 47.85s
  • -v=2 --removeSimpleEquations=fastAcausal: 89.69s
  • -v=2 --removeSimpleEquations=allAcausal: 91.09s
  • -v=2 --removeSimpleEquations=new: fail

I guess it could be possible to detect the vectorization limit and stop the expansion from happening in removeSimpleEquations. But if you just want to translate your models faster, use a small vectorization limit (like -v=2) together with --removeSimpleEquations=none. This might cause some models to grow in size/execution time, but it does not expand this case and it generates much smaller code (from ~68 MB down to ~356 kB for the initialization problem).
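
A rough .mos sketch of this workaround (file and model names are placeholders; the flags are the ones discussed above):

  // Hypothetical script; adjust file and model names to your setup.
  // Keep larger arrays unexpanded in the frontend and skip the
  // removeSimpleEquations module in the backend.
  setCommandLineOptions("--vectorizationLimit=2 --removeSimpleEquations=none"); getErrorString();
  loadFile("Pac.mo"); getErrorString();
  translateModel(Pac); getErrorString();  // or simulate(Pac, stopTime=1.0)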

comment:4 Changed 7 years ago by sjoelund.se

  • Cc casella lochel added
  • Owner changed from somebody to sjoelund.se
  • Status changed from new to assigned

comment:5 Changed 7 years ago by sjoelund.se

I also think the vectorization limit has a bug somewhere: it checks each dimension individually. If you set -v=36 on an 18x36 matrix, it vectorizes the entire thing; you have to set the limit below the largest dimension to stop the expansion, whereas I would have expected the comparison to be against the product of all dimensions...
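
A small hypothetical illustration of the behaviour described above:

  // Hypothetical example: an 18x36 matrix, 648 elements in total.
  model MatrixLimit
    Real A[18, 36];
  equation
    A = fill(time, 18, 36);
    // With --vectorizationLimit=36 each dimension passes the check
    // (18 and 36 are both <= 36), so the matrix is still expanded;
    // only a limit below 36 keeps it unexpanded, whereas a check
    // against the product would compare against 18*36 = 648.
  end MatrixLimit;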

comment:6 Changed 7 years ago by casella

The issue with removeSimpleEquations is well-known; ticket #3695 was opened one and a half years ago, but there's been no progress since then, and I understand Willi will have other priorities for many months to come.

Is anybody else out there willing to take over the completion of the new removeSimpleEquations module, possibly under Willi's direction? I'm not sure how much work is left, but maybe he or Bernhard can comment on that.

comment:7 Changed 7 years ago by sjoelund.se

I found one place in removeSimpleEquations where I can fix the expansion of functions larger than some size, and then I found a replacement from the original variable to an alias that seems wrong; it has something like:

x=y;

With replacements for x[1]...x[n] -> y[1]...y[n], which causes the entire thing to expand, since there is no replacement rule for x -> y...
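
A minimal hypothetical example of the kind of array alias relation meant here:

  // Hypothetical example: x is a pure array alias of y.
  model ArrayAlias
    Real x[3];
    Real y[3];
  equation
    y = {sin(time), cos(time), time};
    // removeSimpleEquations stores the scalar replacements
    // x[1] -> y[1], ..., x[3] -> y[3]; without a rule for the
    // whole array x -> y, the array equation gets fully expanded.
    x = y;
  end ArrayAlias;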

comment:8 follow-up: Changed 7 years ago by sjoelund.se

The problem with my fix to not expand equations is that it breaks other models that require the equations to be scalarized...

My fix to collapse equations like {x[1],x[2],x[3]} into x seems to work a bit better, though, and even without the fix above it still improves translation time to 1/3 of the original.

I think what's needed is just detecting:

x = f(...); // Array equation

And simply not expanding it into:

x[1] = f(...)[1];
x[...] = f(...)[...];
x[n] = f(...)[n];

(which is then later on treated with common subexpression elimination)
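
As a hypothetical sketch of that shape of model (the function body is made up):

  // Hypothetical sketch: x = f(...) kept as a single array equation.
  function f
    input Real u;
    output Real y[3];
  algorithm
    y := {u, 2*u, 3*u};
  end f;

  model ArrayCall
    Real x[3];
  equation
    x = f(time);  // scalarizing this would call f once per element
  end ArrayCall;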

comment:9 in reply to: ↑ 8 Changed 7 years ago by anonymous

Replying to sjoelund.se:

Do your changes have any impact on #4129?

comment:10 Changed 7 years ago by sjoelund.se

It might have an effect on #4129 in some cases. I can't push all my fixes for this because the code generator cannot handle everything (and I don't have time to fix it). For example, x[7].y has to be expanded to {x[7].y[1], ...} because otherwise incorrect offsets are generated. Also, in #4129 the equations have been expanded, and many multibody models seem to depend on this, or else unsolvable strongly connected components are generated. I thought #4129 would sort of be resolved by CSE (resolveLoops), though.

I will take a look at #4129 when I am done with the things I can fix in this ticket though.

comment:11 Changed 7 years ago by sjoelund.se

I might have to revert the PR because of the regression testing. It broke the C++ runtime (fixable), but there is also a problem with dummy states: {x[1], ..., x[n]} might be an array where x[1] is a dummy state and is therefore not stored in the same memory location as the other variables.

comment:12 Changed 7 years ago by sjoelund.se

The issue with dummy states is handled by expanding the array again, if necessary, during code generation.

comment:13 Changed 7 years ago by sjoelund.se

  • Component changed from *unknown* to Backend
  • Milestone changed from Future to 1.12.0
  • Resolution set to fixed
  • Status changed from assigned to closed