Opened 10 years ago

Last modified 7 years ago

#3067 accepted defect

Memory leak in FMI 2.0 (20 MB/min)

Reported by: rfranke Owned by: adrpo
Priority: high Milestone: Future
Component: FMI Version: trunk
Keywords: Cc:

Description

Thanks to the resolution of a couple of tickets during the last weeks, the optimization based on FMI is working quite well now!

A remaining problem is a memory leak that can be seen in all examples to some extent. Here is an extreme case that leaks about 20 MB/min (depending on the speed of your processor).

The problem can be reproduced by repeatedly running an optimization with HQP. On a machine with omc (r23977), git, gcc and tcl-dev:

$ git clone https://github.com/omuses/hqp.git
$ cd hqp
$ ./configure
$ make
$ cd odc
$ ./odc
% while true {source drumboiler_sp.tcl}

Attachments (2)

valgrind-trace.txt (939.7 KB) - added by adrpo 10 years ago.
valgrind-trace-once.txt (2.2 MB) - added by adrpo 10 years ago.
running valgrind on odc running source drumboiler_sp.tcl just once and then exit

Change History (41)

comment:1 Changed 10 years ago by rfranke

Let me add some more information. Memory appears to leak whenever fmi2Reset or fmi2FreeInstance is called.

Here is a minimal example that just instantiates the model and evaluates it at t=0. In addition, all FMI calls are logged:

$ ./odc
% while true {puts "--- new run ---"; prg_name DynamicOpt; mdl_path DrumBoiler.fmu; mdl_logging 4; prg_setup; mdl_u0 {0 1}; prg_simulate}

This example leaks about 200 MB/minute, ten times more than the optimization run above, because the model is reset more frequently. The leak can be reduced to 100 MB/minute if the call to fmi2Reset is replaced with a complete new instantiation, i.e. fmi2FreeInstance followed by fmi2Instantiate (see hqp/hxi/sfun_fmu.c, line 827ff -- https://github.com/omuses/hqp/blob/8b83fce8da213131c0ce43f3a0c91d28c82dd6d6/hxi/sfun_fmu.c#L827).

comment:2 Changed 10 years ago by rfranke

Yet another observation: the much simpler double integrator model leaks significantly more memory if fmi2Reset is replaced with fmi2FreeInstance/fmi2Instantiate:

$ ./odc
% while true {source dic_fmu_est.tcl}

leaks 0.5 MB/min with fmi2Reset vs. 100 MB/min with fmi2FreeInstance/fmi2Instantiate.

comment:3 Changed 10 years ago by rfranke

The current head revision of HQP does not call fmi2Terminate anymore. See:
https://github.com/omuses/hqp/blob/2177ab5739220f70a74cf2604233ea95796b7498/hxi/sfun_fmu.c#L1184

This means that no memory is freed / re-allocated by the subsequent fmi2Reset (see fmu2_model_interface.c, L490ff):

  if (comp->state & modelTerminated) {
    /* intialize modelData */
    fmu2_model_interface_setupDataStruc(comp->fmuData);
    initializeDataStruc(comp->fmuData);
  }

This reduces the memory leak from 200 to 25 MB/min for the minimal DrumBoiler example given above -- meaning that fmi2Terminate and deInitializeDataStruc called from there leave a lot of unused memory allocated.

The remaining 25 MB/min are almost the same as for the drumboiler_sp optimization. This should mean that the memory is lost during model evaluations (like local data of Media function calls or data for equation systems?).

The good news is that a minimal example that just evaluated the trivial drum boiler example repeatedly does not leak memory anymore if fmi2Terminate is not called:

while true {puts "--- new run ---"; prg_name DynamicOpt; mdl_path DIC.fmu; mdl_logging 4; prg_setup; prg_simulate}

comment:4 Changed 10 years ago by rfranke

The last statement should have said "trivial double integrator". To summarize correctly:

DIC leaks no memory if fmi2Terminate is not called:

$ ./odc
% while true {puts "--- new run ---"; prg_name DynamicOpt; mdl_path DIC.fmu; mdl_logging 4; prg_setup; prg_simulate}

This points to a leak in the OpenModelica runtime between deInitializeDataStruc called from fmi2Terminate and deInitializeDataStruc called from fmi2Reset.

DrumBoiler featuring Fluid and Media still leaks 25 MB/min, even though fmi2Terminate is not called. This leak appears to be proportional to the number of model evaluations.

$ ./odc
% while true {puts "--- new run ---"; prg_name DynamicOpt; mdl_path DrumBoiler.fmu; mdl_logging 4; prg_setup; mdl_u0 {0 1}; prg_simulate}

comment:5 Changed 10 years ago by rfranke

The memory leak still persists. The FMU and the linked SimulationRuntimeC appear to use the Boehm garbage collector library. This is dangerous if the FMU is linked into code that uses a similar garbage collector library, see http://www.hboehm.info/gc/#users. It also does not conform to the FMI specification.

The FMI specification requires that FMUs call allocate/freeMemory functions provided by the calling environment.
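To illustrate the requirement, here is a minimal sketch of a model instance that obtains and returns all of its memory through environment-provided callbacks. The struct `env_callbacks` is a hypothetical, simplified stand-in for the allocate/free part of the FMI 2.0 callback struct (the real struct has further fields), and `model_instance`, `model_instantiate` and `model_free` are made-up names, not OpenModelica code. The counting allocator shows how a calling environment could detect blocks that are never returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the allocate/free callbacks that
   the calling environment hands to an FMU. */
typedef struct {
  void *(*allocateMemory)(size_t nobj, size_t size); /* calloc-like */
  void (*freeMemory)(void *obj);
} env_callbacks;

/* Blocks handed out by the environment and not yet returned. */
static long live_blocks = 0;

static void *env_allocate(size_t nobj, size_t size) {
  void *p = calloc(nobj, size);
  if (p != NULL) live_blocks++;
  return p;
}

static void env_free(void *obj) {
  if (obj != NULL) {
    live_blocks--;
    free(obj);
  }
}

/* Hypothetical model instance that owns one work array. */
typedef struct {
  const env_callbacks *cb;
  double *work;
} model_instance;

static model_instance *model_instantiate(const env_callbacks *cb, size_t n) {
  model_instance *m = cb->allocateMemory(1, sizeof *m);
  if (m == NULL) return NULL;
  m->cb = cb;
  m->work = cb->allocateMemory(n, sizeof *m->work);
  if (m->work == NULL) {
    cb->freeMemory(m);
    return NULL;
  }
  return m;
}

static void model_free(model_instance *m) {
  if (m == NULL) return;
  const env_callbacks *cb = m->cb;
  cb->freeMemory(m->work); /* release in reverse order of allocation */
  cb->freeMemory(m);
}

/* Runs `cycles` instantiate/free rounds and returns the number of blocks
   that the environment still considers allocated afterwards. */
long leaked_blocks_after(int cycles) {
  const env_callbacks cb = { env_allocate, env_free };
  for (int i = 0; i < cycles; i++) {
    model_instance *m = model_instantiate(&cb, 16);
    assert(m != NULL);
    model_free(m);
  }
  return live_blocks;
}
```

With every allocation paired with a free through the same callbacks, `leaked_blocks_after` returns 0 regardless of the number of cycles.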

Why does the simulation runtime embedded with an FMU need garbage collection at all?

Would it be conceivable to develop the simulation runtime further so that it first gets deterministic memory management and then calls the allocate/freeMemory functions provided by the calling environment?

comment:6 Changed 10 years ago by adrpo

I will look into fixing the leak. It might however not be a leak as the GC doesn't give the memory back to the system (at least not in Linux).
You need a garbage collector for Modelica models as otherwise you cannot handle while loops in functions because you don't know how much memory is needed beforehand.

comment:7 Changed 10 years ago by rfranke

It doesn't sound intended that the memory needed for a while loop in a Modelica function is not known in advance. Could you post an example?

Couldn't the memory be allocated with deterministic calls to malloc and free anyway, instead of using a garbage collector?

comment:8 Changed 10 years ago by adrpo

As an example, maybe something like this:

function f
  input String i;
  output String s;
algorithm
  while someExternalFunctionCondition() loop
    s := s + someExternalFunction(i);
  end while;
end f;

model M
  String s = f();
end M;

It might be possible to allocate the memory deterministically with calls to malloc and free, but I think you need a lot of analysis for that and we don't have that yet.
There is another garbage collection available in our runtime, a memory pool.
Maybe we could use that for FMI, I will investigate.

comment:9 Changed 10 years ago by adrpo

Note that a garbage collector will know that previous s can be de-allocated.
Also, if you don't free previous s then you can run out of memory in that loop.
How would you know deterministically when to free the previous s?
If you just de-allocate at the end of the program that doesn't work.

comment:10 Changed 10 years ago by adrpo

  • Owner changed from adeas31 to adrpo
  • Status changed from new to assigned

comment:11 Changed 10 years ago by rfranke

Now I get your point.

A more high level memory pool sounds good for an FMU that is intended to be linked into arbitrary other programs.

comment:12 Changed 10 years ago by sjoelund.se

It's not really a good idea to use a memory pool :) The memory requirements are a lot higher (possibly infinite) since we cannot free memory in it until we finish the FMI API call, as comment:9 says.
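A rough sketch of why the pool grows: the code below mimics `s := s + piece` executed n times on a bump-pointer pool that can only be released as a whole. The `pool` type and all function names are hypothetical and not taken from the OpenModelica runtime. Every intermediate string stays in the pool, so the consumed bytes grow quadratically with n although only the last string is still needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bump-pointer pool: memory is only released all at once. */
typedef struct {
  char *base;
  size_t used;
  size_t capacity;
} pool;

static int pool_init(pool *p, size_t capacity) {
  p->base = malloc(capacity);
  p->used = 0;
  p->capacity = capacity;
  return p->base != NULL;
}

static void *pool_alloc(pool *p, size_t size) {
  if (size > p->capacity - p->used) return NULL; /* pool exhausted */
  void *mem = p->base + p->used;
  p->used += size;
  return mem;
}

static void pool_destroy(pool *p) {
  free(p->base);
  p->base = NULL;
  p->used = 0;
  p->capacity = 0;
}

/* Mimics `s := s + piece` executed n times. Returns the bytes consumed in
   the pool, or 0 if the pool could not be set up or ran out of space. */
size_t pool_bytes_for_concat(size_t n, const char *piece) {
  assert(piece != NULL);
  pool p;
  size_t piece_len = strlen(piece);
  /* 1 byte for the initial "" plus, for k = 1..n, k*piece_len + 1 bytes */
  size_t capacity = 1 + n + piece_len * n * (n + 1) / 2;
  if (!pool_init(&p, capacity)) return 0;

  char *s = pool_alloc(&p, 1);
  if (s == NULL) {
    pool_destroy(&p);
    return 0;
  }
  s[0] = '\0';
  size_t len = 0;

  for (size_t k = 0; k < n; k++) {
    char *t = pool_alloc(&p, len + piece_len + 1);
    if (t == NULL) {
      pool_destroy(&p);
      return 0;
    }
    memcpy(t, s, len);
    memcpy(t + len, piece, piece_len + 1); /* copies the terminating NUL */
    s = t; /* the previous s stays in the pool until pool_destroy */
    len += piece_len;
  }

  size_t used = p.used;
  pool_destroy(&p);
  return used;
}
```

For 100 iterations with a two-character piece the final string needs 201 bytes, while the pool has consumed 10201 bytes by the time the loop ends.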

Anyway, if an FMU must not call malloc/free at all, it cannot use most existing C libraries (which internally use malloc/free). That makes FMUs pretty restrictive, considering that users can write arbitrary C code in external C functions (meaning you cannot translate Modelica code to FMUs in a general way).

comment:13 Changed 10 years ago by rfranke

The problem with the Boehm garbage collector is that it operates on a low level and it uses heuristics to find out if a piece of memory is still referenced from somewhere else. Meanwhile I saw both possible negative consequences: a memory leak as reported in this ticket and a crash if the FMU is loaded into a program that uses a similar but not the same garbage collector library -- the analysis is not restricted to the memory allocated by the FMU.

The memory pool could be allocated regularly through the calling environment of the FMU. This should also include typical external libraries called from Modelica models (like numerical routines or maybe system calls) that do not allocate internal memory.

Also I would hope that the case of a memory pool with infinite size would be rare or, if it really became critical, could be avoided by reformulating the model or investing into the code generation.

comment:14 Changed 10 years ago by sjoelund.se

It shouldn't be dangerous to mix garbage collectors as this one is conservative. And we make sure that the allocated variables are reachable by the roots.

comment:15 Changed 10 years ago by adrpo

Yes, it should work fine with Boehm GC. The leak is probably because we don't use it :)
However, I think we should have a flag for the FMI generation so we could switch to the memory pool,
so people can have that option if they use the FMU on a very restricted system or for some other reason.

comment:16 follow-up: Changed 10 years ago by adrpo

Rüdiger, how did you generate the FMU? Where is the Modelica model that you use?

comment:17 in reply to: ↑ 16 Changed 10 years ago by adrpo

Replying to adrpo:

Rüdiger, how did you generate the FMU? Where is the Modelica model that you use?

Never mind, I found it: Modelica.Fluid.Examples.DrumBoiler.DrumBoiler.

comment:18 Changed 10 years ago by rfranke

I was using the two Modelica models contained in the HQP distribution. The FMUs are compiled with omc. Then they are loaded once and simulated repeatedly. A simple double integrator does not leak memory anymore since fmi2Terminate is no longer called, only fmi2Reset between two subsequent simulations. The DrumBoiler model is basically the one from MSL. It leaks memory -- from what you explained it could be the dynamically allocated record instances for Media functions.

See the commands in the beginning of the ticket -- under Linux you need git, gcc and tcl-dev. Under MinGW it should work as well.

comment:19 Changed 10 years ago by adrpo

I'm now running valgrind on odc running while true {source drumboiler_sp.tcl} to see if it can find out where the leak is. I'll keep you updated.

Changed 10 years ago by adrpo

comment:20 Changed 10 years ago by adrpo

As far as I can tell so far, this seems to be the leak:

==3998==    by 0x8BB0111: allocateHybrdData (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)

comment:21 Changed 10 years ago by sjoelund.se

Is it the single malloc on the data that is missing or is the free function simply never called?

comment:22 Changed 10 years ago by adrpo

As far as I can tell free is never called. But the question, I think, is how the FMU is used.
I think Rüdiger loads the FMU dll, then initializes the FMU, runs it, and then resets it for the next run via fmi2Reset. I guess we don't free the memory on fmi2Reset, we just allocate new memory. I will have a closer look at how the FMI things work, as I have just a vague idea at the moment.
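The suspected pattern can be sketched as follows. This is a toy model under assumptions, not the actual runtime code: `component`, `enter_initialization`, `reset_leaky` and `reset_fixed` are hypothetical names, and a single buffer stands in for the linear/nonlinear system data. Initialization allocates unconditionally, so a reset that only drops the pointer strands one buffer per run.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_RUNS 1024

static long outstanding = 0; /* allocations not yet freed */

static void *counted_malloc(size_t size) {
  void *p = malloc(size);
  if (p != NULL) outstanding++;
  return p;
}

static void counted_free(void *p) {
  if (p != NULL) {
    outstanding--;
    free(p);
  }
}

/* Hypothetical component with one solver work buffer. */
typedef struct {
  double *solver_data;
} component;

/* Allocates unconditionally on every initialization. */
static void enter_initialization(component *c) {
  c->solver_data = counted_malloc(64 * sizeof(double));
}

/* Suspected behaviour: the pointer is dropped, the buffer is never freed. */
static void reset_leaky(component *c) {
  c->solver_data = NULL;
}

/* Possible fix: release the buffer before the next initialization. */
static void reset_fixed(component *c) {
  counted_free(c->solver_data);
  c->solver_data = NULL;
}

/* Runs `runs` initialize/reset cycles and returns the number of buffers
   that were allocated but never released by the component. */
long outstanding_after(int runs, int use_fixed_reset) {
  static double *stranded[MAX_RUNS];
  component c = { NULL };
  assert(runs >= 0 && runs <= MAX_RUNS);
  outstanding = 0;
  for (int i = 0; i < runs; i++) {
    enter_initialization(&c);
    /* remember buffers the leaky reset will strand, for cleanup below */
    stranded[i] = use_fixed_reset ? NULL : c.solver_data;
    if (use_fixed_reset) reset_fixed(&c); else reset_leaky(&c);
  }
  long result = outstanding;
  /* keep the demo itself clean; free(NULL) is a no-op */
  for (int i = 0; i < runs; i++) free(stranded[i]);
  return result;
}
```

With the leaky reset the count grows by one per run; with the fixed reset it stays at zero.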

comment:23 Changed 10 years ago by sjoelund.se

Simply changing allocateHybrdData to use GC_malloc is a simple solution ;)

comment:24 Changed 10 years ago by adrpo

These seem to be all the sources of leaks (mostly linear/non-linear/hybrid systems data, jacobians, etc):

==3998==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998==    by 0x8B9813D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)


==3998==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998==    by 0x8B9D6D7: initializeNonlinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x842D6A7: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998==    by 0x8557F6E: DrumBoiler_initialAnalyticJacobianNLSJac1 (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)
==3998==    by 0x8B9835D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)


==3998==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998==    by 0x855760E: DrumBoiler_initialAnalyticJacobianNLSJac2 (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)
==3998==    by 0x8B9835D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998==    by 0x8556B2E: DrumBoiler_initialAnalyticJacobianNLSJac3 (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)
==3998==    by 0x8B9835D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998==    by 0x8B98BC2: allocateLapackData (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x8B98328: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998==    by 0x8BB0111: allocateHybrdData (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x8B9D7F3: initializeNonlinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998==    by 0x842D6A7: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

Yes, we could just use GC_malloc for them. I'll give it a try.

Changed 10 years ago by adrpo

running valgrind on odc running source drumboiler_sp.tcl just once and then exit

comment:25 Changed 10 years ago by adrpo

Rüdiger, you can have a quick look at valgrind-trace-once.txt to see if there are some leaks in the odc code. It seems so, but with valgrind one is never sure :).

comment:26 Changed 10 years ago by adrpo

  • Status changed from assigned to accepted

comment:27 Changed 10 years ago by rfranke

Thank you for the tests! It's unlikely that there is a leak in odc. Also note that there is no leak if running the same test with the simpler double integrator model:

% while true {source dic_fmu.tcl}

comment:28 Changed 10 years ago by rfranke

Regarding FMI: fmi2Reset should just reset internal data, whereas fmi2Terminate would free it.

comment:29 Changed 10 years ago by adrpo

It seems that the free of system data is done in fmi2Terminate.
You said that if you use fmi2Terminate + fmi2FreeInstance you get more memory leaks so I will check this usage pattern too.

comment:30 Changed 10 years ago by adrpo

Partial fix in r25105. Now using fmi2Reset + fmi2EnterInitializationMode will not lose memory anymore.
I will check with fmi2Terminate + fmi2FreeInstance later on.

comment:31 Changed 10 years ago by rfranke

The memory leak is gone in my tests as well. The second case is of less importance here, because it is not allowed to call fmi2Terminate after a simulation error. Such an error can occur in single optimization iterations and the optimization solver must be able to re-simulate with different inputs. Fortunately FMI 2.0 has introduced fmi2Reset for such cases.

Btw., fmi2Terminate could possibly keep all memory and just change the state of the FMU, in order to fulfill the spec:

Informs the FMU that the simulation run is terminated. After calling this function, the final values of all variables can be inquired with the fmi2GetXXX(..) functions. It is not allowed to call this function after one of the functions returned with a status flag of fmi2Error or fmi2Fatal.
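A minimal sketch of that idea, assuming a much simplified state machine: terminate only changes the state, the values stay readable afterwards, and reset is the way back after an error. All names (`component`, `comp_terminate`, ...) and the exact state rules are hypothetical simplifications, not the FMI API or the OpenModelica implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ST_INSTANTIATED, ST_RUNNING, ST_TERMINATED, ST_ERROR } comp_state;

/* Hypothetical component with a single variable. */
typedef struct {
  comp_state state;
  double x0; /* start value restored by reset */
  double x;  /* current value; storage is never released by terminate */
} component;

void comp_instantiate(component *c, double x0) {
  assert(c != NULL);
  c->state = ST_INSTANTIATED;
  c->x0 = x0;
  c->x = x0;
}

int comp_start(component *c) {
  if (c->state != ST_INSTANTIATED) return -1;
  c->state = ST_RUNNING;
  return 0;
}

/* One evaluation step; `fail` simulates a simulation error. */
int comp_step(component *c, double dx, int fail) {
  if (c->state != ST_RUNNING) return -1;
  if (fail) {
    c->state = ST_ERROR;
    return -1;
  }
  c->x += dx;
  return 0;
}

/* Terminate only flips the state and is rejected after an error. */
int comp_terminate(component *c) {
  if (c->state != ST_RUNNING) return -1;
  c->state = ST_TERMINATED;
  return 0;
}

/* Final values stay readable after terminate (assumption of this sketch:
   reads are rejected in the error state). */
int comp_get(const component *c, double *out) {
  if (c->state == ST_ERROR) return -1;
  *out = c->x;
  return 0;
}

/* Reset is allowed from any state and reuses the existing storage. */
void comp_reset(component *c) {
  c->x = c->x0;
  c->state = ST_INSTANTIATED;
}
```

Because neither terminate nor reset allocates or frees anything here, repeated runs cannot leak by construction; the cost is that memory stays reserved until the instance itself is freed.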

comment:32 Changed 10 years ago by sjoelund.se

  • Milestone changed from 1.9.2 to 1.9.3

Milestone changed to 1.9.3 since 1.9.2 was released.

comment:33 Changed 9 years ago by sjoelund.se

  • Milestone changed from 1.9.3 to 1.9.4

Moved to new milestone 1.9.4

comment:34 Changed 9 years ago by sjoelund.se

  • Milestone changed from 1.9.4 to 1.9.5

Milestone pushed to 1.9.5

comment:35 Changed 9 years ago by sjoelund.se

  • Milestone changed from 1.9.5 to 1.10.0

Milestone renamed

comment:36 Changed 8 years ago by sjoelund.se

  • Milestone changed from 1.10.0 to 1.11.0

Ticket retargeted after milestone closed

comment:37 Changed 8 years ago by sjoelund.se

  • Milestone changed from 1.11.0 to 1.12.0

Milestone moved to 1.12.0 due to 1.11.0 already being released.

comment:38 Changed 7 years ago by sjoelund.se

Is this ticket resolved already? What is remaining to close it?

comment:39 Changed 7 years ago by casella

  • Milestone changed from 1.12.0 to Future

The milestone of this ticket has been reassigned to "Future".

If you think the issue is still valid and relevant for you, please select milestone 1.13.0 for back-end, code generation and run-time issues, or 2.0.0 for front-end issues.

If you are aware that the problem is no longer present, please select the milestone corresponding to the version of OMC you used to check that, and set the status to "worksforme".

In both cases, a short informative comment would be welcome.
