Opened 7 years ago

Last modified 4 years ago

#4120 assigned defect

Out-of-bounds array elements undetected by front-end

Reported by: vitalij Owned by: perost
Priority: blocker Milestone: 2.0.0
Component: New Instantiation Version:
Keywords: Cc:

Description

model foo
  function f
    input Real[2] x;
    output Real[2] y;
    output Real[2] z;
  algorithm
    y[1] := 2*x[1]^2 + 1; y[2] := 3*x[2]^3 + 2;
    z[1] := -y[1]/2+1/2+7; z[2] := -y[2]/3+2/3-7;
    annotation(Inline=true);
  end f;

  Real[2] y(each start = -1);
  Real[2] w(each start = -1);
  Real z1, z2, z3;
equation
  (, w) = f({y[1] + z3, y[2]});
  (, w) = f({y[1] + w[1], y[2] + w[3]});
  der(y[1]) = -z1*y[2] +1;
  der(y[2]) = -z2*y[1] + z1*y[2] +3;
  z3 = time;
end foo;

where w[3] is undefined.

Change History (21)

comment:1 Changed 7 years ago by casella

  • Component changed from *unknown* to Frontend
  • Milestone changed from Future to 1.13.0
  • Owner changed from somebody to perost
  • Priority changed from high to critical
  • Status changed from new to assigned
  • Summary changed from Don't detect undefined variable to Out-of-bounds array elements undetected by front-end

Simpler test case:

model foo
  Real x[2];
equation
  x[1] + x[2] = 3;
  x[2] + x[3] = 4;
end foo;

Only breaks during C-code simulation:

foo_06inz.c: In function 'foo_eqFunction_1':
foo_06inz.c:21:63: error: '$Px$lB3$rB' undeclared (first use in this function)
   data->localData[0]->realVars[1] /* x[2] variable */ = 4.0 - $Px$lB3$rB;
                                                               ^

IMHO this kind of out-of-bound access to arrays should be detected statically by the front-end.

comment:2 Changed 5 years ago by casella

  • Component changed from Frontend to New Instantiation
  • Milestone changed from 1.13.0 to 2.0.0
  • Priority changed from critical to blocker

I just tried the test case of comment:1 with the NF. The model runs successfully and produces x[1] = -1 and x[2] = 4, which is consistent with considering x[3] = 0.

This is actually quite bad - an invalid model should produce an error, not some meaningless simulation results.

comment:3 Changed 4 years ago by casella

@perost, I stumbled again in this error today - it is really quite embarrassing.

Any chance to get it fixed for 1.16.0?

Thanks!

comment:4 follow-up: Changed 4 years ago by phannebohm

This might become non-trivial if you consider for loops:

model M
  Real x[9000];
  parameter Integer k = 8000 "crazy phase transition";
equation
  for i in 1:k loop
    x[i] = sin(i*time);
  end for;
  for i in k:9000 loop
    x[i] - x[i+1] = 42;
  end for;
end M;

Notice the out-of-bounds access in the last equation of second for loop. It's over 9000!
Would be nice to find that symbolically.

comment:5 in reply to: ↑ 4 Changed 4 years ago by Karim.Abdelhak

Replying to phannebohm:

Notice the out-of-bounds access in the last equation of second for loop. It's over 9000!
Would be nice to find that symbolically.

Nice hidden joke :X. Nevertheless i guess one could just check max(start, stop) inserted symbolically for the iterator overflow and that should be alright. If there is some kind of expression inside one could use the evaluateConstants module (or however it is called) to reduce it. Afaik it has to be constant.

comment:6 Changed 4 years ago by perost

I've implemented a check for out of bounds subscripts in PR #850. I added it to the VerifyModel module that runs at the very end of the frontend, just before converting to DAE format, so that it's done after all the evaluation and simplification.

comment:7 follow-up: Changed 4 years ago by casella

Is this only run whith -nfScalarize?

comment:8 in reply to: ↑ 7 ; follow-up: Changed 4 years ago by perost

Replying to casella:

Is this only run whith -nfScalarize?

No, it's always run, it will just check any subscripts that have been fully evaluated in the flat model. But it won't work quite as well with scalarization turned off since we don't have any fancy analysis of e.g. for loops.

comment:9 in reply to: ↑ 8 Changed 4 years ago by Karim.Abdelhak

Replying to perost:

Replying to casella:

Is this only run whith -nfScalarize?

No, it's always run, it will just check any subscripts that have been fully evaluated in the flat model. But it won't work quite as well with scalarization turned off since we don't have any fancy analysis of e.g. for loops.

See comment:4

If you replace iterators with max(start, stop) and you evaluate each of those expression with some evaluate constant function (which i guess has to exist?) you should get the maximum value the iterator can get and check that with the array boundaries.

I don't know if it is worth going for this rn, but for future reference this will be helpful!

We talked about this once: we want to have model checking for the new frontend some time in the future, for that it is neccessary!

comment:10 Changed 4 years ago by casella

I'll leave this tentatively open, just in case we can implement some symbolic checks. We may as well decide that in case of fancy for loops which are not unrolled, it will be up to the backend to figure out that there are out-of-bounds errors.

I hope PR 850 at least fixes the situation with the current back-end, where for loops with out-of-bound accesses simply generate meaningless results in simulation (which is quite bad for such a high-level language as Modelica).

comment:11 follow-up: Changed 4 years ago by phannebohm

Can I say I'm amazed at how promptly PR #850 came?

From what I see, -nfScalarize is still wip anyways and I think I can think of arbitrarily convoluted for loops. Take the model of comment:4: the parameter k needs to be what I think is called a structural parameter if the loops are unrolled, meaning its value can't be changed after compiling the model. But with -nfScalarize this could, in theory, be modified, right? In that case out-of-bounds checks would need to be done at runtime?

In any case, I like @Karims "replace iterators with max and min(!) values and evaluate (if possible)". That should be a good start extending this.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 4 years ago by perost

Replying to phannebohm:

In that case out-of-bounds checks would need to be done at runtime?

Out-of-bounds checks needs to be done at runtime regardless, catching it during compilation is not possible if the subscript is variable. But that's already done, so the main issue is out-of-bounds accesses that causes issues during compilation and gives bad error messages or behaviour.

The model in comment:1 shows that it's possible to have out-of-bounds accesses in the model which does not trigger the runtime checks though. In that case something goes wrong in the backend which replaces x[3] with x[1] in the generated code, so no actual out-of-bounds access occurs during simulation. That particular case is caught by the frontend now, but there might be other similar issues remaining.

comment:13 in reply to: ↑ 12 Changed 4 years ago by Karim.Abdelhak

Replying to perost:

Replying to phannebohm:

In that case out-of-bounds checks would need to be done at runtime?

Out-of-bounds checks needs to be done at runtime regardless, catching it during compilation is not possible if the subscript is variable.

Really? In an algorithm section that might be true, but that cannot be true for the equation section since we have to do causalization on that. You cannot have variable access to an array in an equation section.

The easiest model i can think of is this one:

model testArr
  Boolean x(start = false);
  Integer y[2];
  Real z[2];
equation  
  when sample(0, 0.1) then
    x = not pre(x);
  end when;
  y[1] = if x then 1 else 2;
  y[2] = if x then 2 else 1;
  z[y[1]] = sin(time);
  z[y[2]] = cos(time);
end testArr;

It has variable index based on something than is not known at compile time (it actually could be but we do not check for it. It could as well depend on a state). And it fails as expected...

All indices inside equations must be calculable during compile time otherwise matching/index reduction/sorting... that whole machinery would not work.

For algorithms that does not hold since they have strictly declared in and outputs.

comment:14 Changed 4 years ago by Karim.Abdelhak

Btw. the correct way to construct the model above would be something like:

model testArr
  Boolean x(start = false);
  Real aux[2];
  Real z[2];
equation  
  when sample(0, 0.1) then
    x = not pre(x);
  end when;
  aux[1] = sin(time);
  aux[2] = cos(time);
  z[1] = if x then aux[1] else aux[2];
  z[2] = if x then aux[2] else aux[1];
end testArr;

You can always avoid variable index access so it is no real restriction.

Last edited 4 years ago by Karim.Abdelhak (previous) (diff)

comment:15 follow-ups: Changed 4 years ago by perost

Replying to Karim.Abdelhak:

Really? In an algorithm section that might be true, but that cannot be true for the equation section since we have to do causalization on that. You cannot have variable access to an array in an equation section.

As far as I'm aware there's no such restriction in the Modelica specification. And for example this model compiles just fine:

model M
  Real x[3] = ones(3);
  Real y = x[integer(time)];
end M;

The simulation then terminates immediately if the start time is 0, since that causes an out-of-bounds access.

comment:16 in reply to: ↑ 15 Changed 4 years ago by phannebohm

Replying to Karim.Abdelhak:

All indices inside equations must be calculable during compile time otherwise matching/index reduction/sorting... that whole machinery would not work.

True, in that case forget my argument regarding k.

Replying to perost:

Replying to Karim.Abdelhak:

Really? In an algorithm section that might be true, but that cannot be true for the equation section since we have to do causalization on that. You cannot have variable access to an array in an equation section.

As far as I'm aware there's no such restriction in the Modelica specification. And for example this model compiles just fine:

model M
  Real x[3] = ones(3);
  Real y = x[integer(time)];
end M;

The simulation then terminates immediately if the start time is 0, since that causes an out-of-bounds access.

My guess is that x is practically constant so there is no ambiguity regarding causalization. I wonder how exactly this works.

But I would agree with Karim, if there is unknowable index access then the model cannot be causalized at compile time and it shouldn't work?

comment:17 in reply to: ↑ 15 Changed 4 years ago by Karim.Abdelhak

Replying to perost:

Replying to Karim.Abdelhak:

Really? In an algorithm section that might be true, but that cannot be true for the equation section since we have to do causalization on that. You cannot have variable access to an array in an equation section.

As far as I'm aware there's no such restriction in the Modelica specification. And for example this model compiles just fine:

model M
  Real x[3] = ones(3);
  Real y = x[integer(time)];
end M;

The simulation then terminates immediately if the start time is 0, since that causes an out-of-bounds access.

The modelica specification does indeed allow it as far as i can see. And in your special case the equation does not have to be solved for that indexed variable, that just works by accident. It however can never be solved by an unknown array element indexed with a variable index because we just don't know if it appears or not. This is still kind of an algorithm since it can only be solved for y.

The general thing i wanted to say is: it does not matter whether or not modelica spec allows it, we cannot solve for any variable that is indexed with a non constant index. And i do not know any method that can...

As long as we do not have to solve for it, it works i guess. And then we have to check during runtime... but i actually think this is bad modeling and should be avoided and actually not be allowed in modelica (outside of algorithms).

Once again i am amazed that this is allowed.

comment:18 follow-up: Changed 4 years ago by Karim.Abdelhak

Okay i took a cold shower and thought about it.

We actually can compile it... take the model from 13. If processed correctly we could reformulate equations 4 and 5 to be residual and work from there. Needs an adaption in the whole adjacency matrix module though.

I guess it can be allowed due to that...

Maybe we should make a ticket for this, can somebody check if dymola is able to compile it? I do not have a dymola installation at home.

comment:19 in reply to: ↑ 18 ; follow-up: Changed 4 years ago by perost

Replying to Karim.Abdelhak:

Maybe we should make a ticket for this, can somebody check if dymola is able to compile it? I do not have a dymola installation at home.

Dymola simulates the model in comment:13 without any issues as far as I can determine. And for my model in comment:15 it gives exactly the same behaviour as OMC, i.e. it goes to simulation without any errors and terminates due to the out-of-bounds access.

comment:20 in reply to: ↑ 19 Changed 4 years ago by Karim.Abdelhak

Replying to perost:

Replying to Karim.Abdelhak:

Maybe we should make a ticket for this, can somebody check if dymola is able to compile it? I do not have a dymola installation at home.

Dymola simulates the model in comment:13 without any issues as far as I can determine. And for my model in comment:15 it gives exactly the same behaviour as OMC, i.e. it goes to simulation without any errors and terminates due to the out-of-bounds access.

That is alarming. I guess they reform those equations to be residual and just go from there. I think i know how to fix it, might be difficult to implement though. I will open a ticket and get @Francesco into this conversation because he might have some intel if this is relevant for actual models.

comment:21 Changed 4 years ago by casella

Just a general comment from my side. Despite the original title of this ticket, it is clear that out-of-bounds access violations need not necessarily be caught be the frontend, in general.

What I observed was that today's compiler did not handle them properly at all, at least when using the new front end:

  • the frontend did not catch them (even when fully scalarizing the arrays, which is the default behaviour as of today and still for quite some time in the future)
  • the backend did not catch them either
  • C code was successfully generated and compiled
  • the simulation actually run and produced results, which were obviously completely bogus because they apparently relied on out-of-bound array access in hardware memory

I thought that this outcome was particularly dangerous, and my primary concern was to address this situation for 1.16.0. I understand PR 850 just does that.

I agree that in the future we may want to defer structural parameter evaluation as much as possible when dealing with arrays, possibly up to the point that we do not need to recompile a model when we change parameters that affect the number of variables in the problem. After all, the concept of "structural parameter" is not even mentioned by the Modelica specification, nor is the frontend/backend separation, not is the idea that intermediate code should be generated and compiled. Maybe we really have to think out of the box in this respect, and try out strongly innovative ways to generate simulation results from highly flexible vector-based Modelica code. In fact, one could even have interpreted Modelica tools that do not even generate C or LLVM code at all. AFAIK, this approach is followed by gPROMS, which is definitely not an academic toy program and can be quite effective in dealing with complex, large-scale systems.

I understand structural parametes were only introduced for convenience assuming that the front-end actually unrolls all loops and all arrays to scalars, to make the backend developers' life easier. This is somewhat required by the current formulation of the flattening process in the Modelica Specification, but I understand there is some consensus that this specification is meant to uniquely define the result of the process, but is by no means prescriptive of how this should actually be done. In fact, the future back-end will not require arrays and loops to be unrolled, which opens up a number of other exciting possibilities.

@Karim feel free to open new enhancement tickets where you discuss these aspects in view of the new vectorized backend

Note: See TracTickets for help on using tickets.