Opened 10 years ago

Closed 10 years ago

#2840 closed defect (fixed)

Segmentation fault if assertion fails in FMU

Reported by: Rüdiger Franke
Owned by: Willi Braun
Priority: high
Milestone: 1.9.1
Component: FMI
Version: trunk
Keywords:
Cc: Willi Braun, Adrian Pop

Description

The following model contains an assertion that fails during the simulation with a constant input u = -2 after 0.1 seconds.

model DIC "Double Integrator Continuous-time"
  parameter Real p = 1 "gain for input";
  parameter Real y1_start = 1 "start value for first state";
  parameter Real y2_start = 0 "start value for second state";
  input Real u(start = -2);
  output Real y1, y2;
initial equation
  y1 = y1_start;
  y2 = y2_start;
equation
  assert(y2 < 0.1, "y2 must be smaller than 0.1");
  der(y1) = p * u;
  der(y2) = y1;
end DIC;

Exported as FMU 2.0 with the omc command:

translateModelFMU(DIC, version="2.0");

the assertion causes:

assert            | debug   | y2 must be smaller than 0.1
Segmentation fault

This is bad if a numerical solver happens to exceed the allowed limits and has no chance to recover, e.g. by retrying with a smaller step size.

Shouldn't the respective FMU function just issue a log message and return an error?
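
A minimal sketch of the behavior asked for here, assuming hypothetical names (`Status`, `Logger`, `Model`, `get_derivatives`) that only mimic the shape of the FMI 2.0 C API and are not OpenModelica's actual code: the function checks the assertion, reports it through a logger callback, and returns an error status instead of aborting, so the caller could reject the step and retry.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for fmi2Status and the FMI logger callback. */
typedef enum { STATUS_OK = 0, STATUS_ERROR = 3 } Status;
typedef void (*Logger)(const char *category, const char *message);

typedef struct {
    double p, u;      /* parameter and input of model DIC */
    double y1, y2;    /* continuous states */
    Logger log;       /* supplied by the importing tool */
} Model;

/* Example logger: print to stderr instead of terminating the process. */
void stderr_logger(const char *category, const char *message) {
    fprintf(stderr, "%-8s| %s\n", category, message);
}

/* Evaluate der(y1) and der(y2); return an error if the assertion fails. */
Status get_derivatives(const Model *m, double der[2]) {
    if (!(m->y2 < 0.1)) {
        if (m->log != NULL) m->log("assert", "y2 must be smaller than 0.1");
        return STATUS_ERROR;  /* caller may reject the step and retry */
    }
    der[0] = m->p * m->u;
    der[1] = m->y1;
    return STATUS_OK;
}
```

With this shape, a solver that overshoots y2 gets `STATUS_ERROR` back and keeps control of the process.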

Is there possibly a translation option to completely disable assertions, like the compilation of a C program without DEBUG flag?

I checked the documentation, but didn't get an answer there: OpenModelicaUsersGuide.pdf doesn't mention options for translateModelFMU and the scripting documentation at

https://build.openmodelica.org/Documentation/OpenModelica.Scripting.html

doesn't mention translateModelFMU at all :-(

Change History (13)

comment:1 by Adeel Asghar, 10 years ago

Cc: Willi Braun added

comment:2 by Martin Sjölund, 10 years ago

Adeel/Willi: Do you call MMC_TRY in all fmu entrypoints that evaluate expressions? If you don't, the setjmp buffer will point to a bad location and you will segfault.
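
To illustrate the point about the setjmp buffer: a minimal sketch, assuming hypothetical names (`current_jump`, `throw_error`, `guarded_call`) rather than the actual MMC_TRY macros. Each entry point installs its own jump buffer before evaluating expressions and restores the previous one on exit. Without that, a failing assertion would longjmp into a buffer whose stack frame no longer exists.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical global pointer to the innermost active jump buffer. */
static jmp_buf *current_jump = NULL;

/* Called where an assertion fails; jumps back to the innermost guard. */
static void throw_error(void) {
    if (current_jump != NULL) longjmp(*current_jump, 1);
}

/* Sample expression evaluation that asserts y2 < 0.1. */
static void evaluate(double y2) {
    if (!(y2 < 0.1)) throw_error();
}

/* Entry point: returns 0 on success, 1 if the evaluation threw. */
int guarded_call(double y2) {
    jmp_buf here;
    jmp_buf *volatile previous = current_jump;  /* remember the outer guard */
    int failed = 0;

    if (setjmp(here) == 0) {
        current_jump = &here;   /* valid only while this frame is alive */
        evaluate(y2);
    } else {
        failed = 1;             /* reached via longjmp from throw_error */
    }
    current_jump = previous;    /* never leave a pointer to a dead frame */
    return failed;
}
```

Restoring `previous` on both paths is what keeps an outer guard installed by the calling environment usable after the inner one has caught an error.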

comment:3 by Willi Braun, 10 years ago

Owner: changed from Adeel Asghar to Willi Braun
Status: new → accepted

As far as I can see, we don't handle assert at all.

comment:4 by Willi Braun, 10 years ago

Resolution: fixed
Status: accepted → closed

fixed in r22801.

comment:5 by Rüdiger Franke, 10 years ago

Resolution: fixed
Status: closed → reopened

The fix appears to corrupt the calling stack. Two problems become visible:

  1. The correct version of the above model (without the assertion) does not simulate anymore. It hangs in an infinite loop when events shall be updated immediately after initialization -- the FMU never seems to reset eventInfo.newDiscreteStatesNeeded -- cf. the pseudocode from the FMI spec:
    // event iteration
    eventInfo.newDiscreteStatesNeeded = true;
    while eventInfo.newDiscreteStatesNeeded loop
      // update discrete states
      M_fmi2NewDiscreteStates(m, &eventInfo)
      if eventInfo.terminateSimulation then goto TERMINATE_MODEL
    end while
    

The model works again when removing the new calls to MMC_TRY_INTERNAL and MMC_CATCH_INTERNAL from fmu2_model_interface.c.

  2. An outer longjmp by the calling environment crashes after the FMU has caught an internal error. Can it be that the internal error handling corrupts the calling stack? See example sj3.c at http://web.eecs.utk.edu/~huangj/cs360/360/notes/Setjmp/lecture.html
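
The event iteration from the first item above can be sketched in C as follows, assuming a hypothetical `EventInfo` struct and `new_discrete_states` function that stand in for `fmi2EventInfo` and `fmi2NewDiscreteStates`. The importer's loop terminates only if the FMU clears `newDiscreteStatesNeeded` once no further iteration is required. The iteration cap is a defensive addition and not part of the FMI pseudocode.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for fmi2EventInfo. */
typedef struct {
    bool newDiscreteStatesNeeded;
    bool terminateSimulation;
} EventInfo;

/* Hypothetical FMU side: pending_events counts unresolved event iterations. */
typedef struct { int pending_events; } Fmu;

void new_discrete_states(Fmu *fmu, EventInfo *info) {
    if (fmu->pending_events > 0) fmu->pending_events--;
    /* The FMU must reset the flag, otherwise the importer loops forever. */
    info->newDiscreteStatesNeeded = fmu->pending_events > 0;
    info->terminateSimulation = false;
}

/* Importer side: returns the number of iterations, or -1 on termination
   or when max_iterations is exceeded. */
int event_iteration(Fmu *fmu, int max_iterations) {
    EventInfo info = { true, false };
    int n = 0;
    while (info.newDiscreteStatesNeeded) {
        if (n == max_iterations) return -1;
        new_discrete_states(fmu, &info);
        n++;
        if (info.terminateSimulation) return -1;
    }
    return n;
}
```

An FMU that never clears the flag would show up here as a `-1` return rather than a hang.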

comment:6 by Willi Braun, 10 years ago

Status: reopened → accepted

comment:7 by Rüdiger Franke, 10 years ago

Running in gdb, the endless loop disappears as well. Instead the execution crashes with:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff366dd47 in GC_find_limit_with_bound () from /usr/lib/omc/libgc.so.1

from

fmu2_model_interface.c:302
    DATA* fmudata = (DATA *)GC_malloc_uncollectable(sizeof(DATA));

Attempting to allocate fmudata with

    DATA* fmudata = (DATA *)functions->allocateMemory(1, sizeof(DATA));

the SIGSEGV happens later:

0  in GC_find_limit_with_bound of /usr/lib/omc/libgc.so.1
1  in GC_init_linux_data_start of /usr/lib/omc/libgc.so.1
2  in GC_init of /usr/lib/omc/libgc.so.1
3  in GC_generic_malloc_inner of /usr/lib/omc/libgc.so.1
4  in GC_generic_malloc of /usr/lib/omc/libgc.so.1
5  in initializeDataStruc of /usr/lib/omc/libSimulationRuntimeC.so
6  in fmi2Instantiate of /usr/include/omc/c/fmi2/fmu2_model_interface.c:337

Is this GC stuff, combined with FMI allocateMemory, the root cause of the problem?
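
For reference, a minimal sketch of allocating instance data only through importer-supplied callbacks, assuming hypothetical `Callbacks` and `Data` types that mirror the shape of the `allocateMemory(1, sizeof(DATA))` call above. It does not reproduce the GC_malloc path from the backtrace and makes no claim about the root cause.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical subset of the importer's callback table. */
typedef struct {
    void *(*allocateMemory)(size_t nobj, size_t size);
    void  (*freeMemory)(void *obj);
} Callbacks;

/* Hypothetical stand-in for the runtime's DATA struct. */
typedef struct { double y1, y2; } Data;

/* Returns instance data obtained from the callback, or NULL on failure. */
Data *instantiate(const Callbacks *functions) {
    if (functions == NULL || functions->allocateMemory == NULL) return NULL;
    return (Data *)functions->allocateMemory(1, sizeof(Data));
}

/* Releases instance data through the matching callback. */
void free_instance(const Callbacks *functions, Data *data) {
    if (functions != NULL && functions->freeMemory != NULL) {
        functions->freeMemory(data);
    }
}
```

An importer could pass `calloc` and `free` directly, since their signatures match the two callback slots.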

comment:8 by Adeel Asghar, 10 years ago

Cc: Adrian Pop added

comment:9 by Willi Braun, 10 years ago

As far as I can see, adrpo has fixed that in r23750 and r23751.
rfranke, can you confirm that?

comment:10 by Rüdiger Franke, 10 years ago

Hmmm... upgraded to r23777. The behavior is the same as before: endless event loop after initialization from console; crash in gdb when allocating DATA* fmudata. The stack frame at the crash is:

0  in GC_find_limit_with_bound of /usr/lib/omc/libgc.so.1
1  in GC_init_linux_data_start of /usr/lib/omc/libgc.so.1
2  in GC_init of /usr/lib/omc/libgc.so.1
3  in GC_generic_malloc_inner of /usr/lib/omc/libgc.so.1
4  in GC_generic_malloc of /usr/lib/omc/libgc.so.1
5  in fmi2Instantiate of /usr/include/omc/c/fmi2/fmu2_model_interface.c:302
6  in mdlInitializeConditions of ../hxi/sfun_fmu.c:820

comment:11 by Willi Braun, 10 years ago

I have now fixed the endless event loop in r23803. I was initially on the wrong track because of the segmentation faults, but it was actually a simple logic issue.

comment:12 by Rüdiger Franke, 10 years ago

Not only the endless loop, but also the corrupted calling stack disappeared in r23815. Thanks!
Applying your fix in the older r23777, both problems disappeared as well!?

Can it be that my SIGSEGV in gdb as soon as someone calls GC_generic_malloc is an unrelated problem?

comment:13 by Willi Braun, 10 years ago

Resolution: fixed
Status: accepted → closed

My guess is that the endless loop triggers an assert in the call of M_fmi2NewDiscreteStates; this causes some memory to be freed, which in turn corrupts the call stack. Something like that happened here.
