Opened 8 years ago

Last modified 5 years ago

#3844 assigned defect

Unnecessary array instantiations

Reported by: rfranke Owned by: adrpo
Priority: high Milestone: 2.0.0
Component: Frontend Version:
Keywords: Cc: niklwors, casella

Description (last modified by rfranke)

Examples that use arrays often result in huge code because OpenModelica instantiates these arrays over and over again. Have for instance a look the Modelica.Media.Examples.R134a.* examples.

Here is a simpler version:

package ArrayData
  constant Integer[:] cdata = {1, 2, 3, 4, 5, 6, 7, 8, 9};

  function f
    input Integer[:] idata;
    input Integer idx;
    output Integer y;
  algorithm
    y := idata[idx] + cdata[idx] + cdata[idx + 1];
  end f;
  
  model Test
    input Integer idx = 0;
    parameter Integer[:] pdata = cdata;
    Integer y = f(pdata, idx+1);
  end Test;
end ArrayData;

ArrayData.Test is instantiated to:

function ArrayData.f
  input Integer[:] idata;
  input Integer idx;
  output Integer y;
algorithm
  y := idata[idx] + {1, 2, 3, 4, 5, 6, 7, 8, 9}[idx] + {1, 2, 3, 4, 5, 6, 7, 8, 9}[1 + idx];
end ArrayData.f;

class ArrayData.Test
  input Integer idx = 0;
  parameter Integer pdata[1] = 1;
  parameter Integer pdata[2] = 2;
  parameter Integer pdata[3] = 3;
  parameter Integer pdata[4] = 4;
  parameter Integer pdata[5] = 5;
  parameter Integer pdata[6] = 6;
  parameter Integer pdata[7] = 7;
  parameter Integer pdata[8] = 8;
  parameter Integer pdata[9] = 9;
  Integer y = ArrayData.f({pdata[1], pdata[2], pdata[3], pdata[4], pdata[5], pdata[6], pdata[7], pdata[8], pdata[9]}, 1 + idx);
end ArrayData.Test;

It is bad if a temporary array is created when calling the function f, instead of just passing pdata. It is worse if the whole cdata array is created temporarily again and again whenever the function f wants to access one element of it.

Attachments (2)

Change History (22)

comment:1 Changed 8 years ago by rfranke

  • Description modified (diff)

comment:2 Changed 8 years ago by rfranke

  • Description modified (diff)

comment:3 Changed 8 years ago by sjoelund.se

I think instantiated is the wrong word here. The problem is pdata is not a constant but is expanded. Constant arrays are generated as a constant data segment and are handled very efficiently. Individual elements of parameter arrays can be changed in the init.xml-file. If this parameter was evaluated in the backend, it could result in good code being generated.

comment:4 Changed 8 years ago by adrpo

  • Owner changed from somebody to adrpo
  • Status changed from new to accepted

I'll try to disable vectorization of arrays sent to functions which accept arrays and see if we don't break anything.

comment:5 Changed 8 years ago by sjoelund.se

The actual C-code generated by OMC:

_y = (*integer_array_element_addr1(&_idata, 1, /* modelica_integer */ (modelica_integer)_idx)) + integer_get(_OMC_LIT0, ((modelica_integer)_idx)-1) + integer_get(_OMC_LIT0, (((modelica_integer) 1) + (modelica_integer)_idx)-1)

comment:6 Changed 8 years ago by adrpo

The problem here is mostly with package constants, we need to constant evaluate them because in the current front-end the packages are gone after the instantiation and you cannot lookup that constant again. This especially applies for packages + redeclares as the unique scope is generated on the fly and then discarded after the DAE is generated.

comment:7 Changed 8 years ago by rfranke

  • Cc niklwors added

We might need global variables if packages are gone. The other issue is the creation of a temporary array for all elements instead of just passing pdata -- this can hopefully be eliminated.

Last edited 8 years ago by rfranke (previous) (diff)

comment:8 Changed 8 years ago by adrpo

@rfranke: either global variables or we keep the packages (or at least part of them) in the DAE.
I've started looking into the temporary array creation, hopefully there is an easy fix for that.

Changed 8 years ago by rfranke

comment:9 Changed 8 years ago by rfranke

  • Cc casella added

@sjoelund.se: I'm just trying to make things easier to grasp when creating small examples. This doesn't mean that everything is ok if one of the code lines generated by OpenModelica for the small example looks good. Find for your confenience attached the flat model and the _functions.c for Modelica.Media.Examples.R134a.R134a2. OpenModelica generates 15.699 KB resp. 8.859 KB of temporary array (re-)creations.

Dymola fits the same model into 763 KB flat Modelica and 752 KB C code.

comment:10 Changed 8 years ago by sjoelund.se

  • Owner changed from adrpo to sjoelund.se

The size of the flat Modelica does not matter if the arrays internally point to the same data structure. But you are right the R134a code looks bad, so your example model does not show the actual problem. It should be simple enough to fix code generation for this. Let me see what I can do. The problem there might be my detection of constant literals being broken. Or that some data in the array is weird.

comment:11 Changed 8 years ago by sjoelund.se

OK, the bindings for (at least some of) the mystical bindings do not show up in the flat Modelica: they are record default bindings and seem to not be caught when traversing the DAE for constant literals.

comment:12 Changed 8 years ago by rfranke

One should solve the problem at the source. Why first creating 16 MB of flat Modelica and then traversing for its removal?

comment:13 Changed 8 years ago by sjoelund.se

We would anyway need to traverse the structure for their removal because quite often the same constant appears from different sources. This is especially true for Media where you redeclare things so that the same constant will even have different names.

comment:14 Changed 8 years ago by rfranke

The source Modelica code contains one large array of data. If it is accessed using different names, then these are just crefs referring to each other. The frontend should not multiply the same array data again and again.

Moreover, as said earlier: a big fraction of the problem is the creating of temporary arrays for function calls, no matter if they are constant or have higher variability. This problem also shows up in R134a2 -- see e.g. calls to FindInterval.

comment:15 Changed 8 years ago by sjoelund.se

https://github.com/OpenModelica/OMCompiler/pull/712 is the first (small) part of this.

And no, constants would not magically refer to each other when part of redeclared or inherited packages if we did not evaluate them.

comment:16 Changed 8 years ago by sjoelund.se

The other part I believe can be solved by either:

  1. ExpressionSimplify {const_2d_array}[const_subscript] -> {const_1d_array}
  2. Not expanding the input array to the function call

comment:17 Changed 8 years ago by sjoelund.se

Sorry, not a constant subscript, but rather:

T_coef[int, :]

Which we expand into:

{T_coef[int,1],T_coef[int,2],T_coef[int,3],T_coef[int,4]}

It can be made faster also by not stopping to look for constant literals when we fail to replace an array. {T_coef[int,1],T_coef[int,2],T_coef[int,3],T_coef[int,4]} does contain constant inner arrays...

comment:18 Changed 8 years ago by sjoelund.se

With https://github.com/OpenModelica/OMCompiler/pull/713, functions.c is 439k, literals.h is 244k and the total model is 1.3MB of c/h-sources (1.6MB incl. xml/json). But PR 713 might break something (and the code is still not optimal).

comment:19 Changed 8 years ago by sjoelund.se

  • Owner changed from sjoelund.se to adrpo
  • Status changed from accepted to assigned

Reassigning to @adrpo for the remaining parts. (PR 713 worked fine, but was a bit slow so I improved it slightly)

comment:20 Changed 5 years ago by casella

It's a bit unclear to me what remains to be done here, and if it's a (new) front-end issue or a back-end issue.

@rfranke, can you please comment on that, or close the ticket if you are happy with the current status?

Note: See TracTickets for help on using tickets.