Opened 10 years ago
Last modified 7 years ago
#3067 accepted defect
Memory leak in FMI 2.0 (20 MB/min)
Reported by: | Rüdiger Franke | Owned by: | Adrian Pop |
---|---|---|---|
Priority: | high | Milestone: | Future |
Component: | FMI | Version: | trunk |
Keywords: | Cc: |
Description
Thanks to the resolution of a couple of tickets over the last weeks, the FMI-based optimization is working quite well now!
A remaining problem is a memory leak that can be seen in all examples to some extent. Here is an extreme case that leaks about 20 MB/min (depending on the speed of your processor).
The problem can be reproduced by repeatedly running an optimization with HQP. On a machine with omc (r23977), git, gcc and tcl-dev:
$ git clone https://github.com/omuses/hqp.git
$ cd hqp
$ ./configure
$ make
$ cd odc
$ ./odc
% while true {source drumboiler_sp.tcl}
Attachments (2)
Change History (41)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Yet another observation: the much simpler double integrator model leaks significantly more memory if fmi2Reset is replaced with fmi2FreeInstance/fmi2Instantiate:
$ ./odc
% while true {source dic_fmu_est.tcl}
leaks 0.5 MB/min with fmi2Reset vs. 100 MB/min with fmi2FreeInstance/fmi2Instantiate.
comment:3 by , 10 years ago
The current head revision of HQP does not call fmi2Terminate anymore. See:
https://github.com/omuses/hqp/blob/2177ab5739220f70a74cf2604233ea95796b7498/hxi/sfun_fmu.c#L1184
This means that no memory is freed / re-allocated by the subsequent fmi2Reset (see fmu2_model_interface.c, L490ff):
  if (comp->state & modelTerminated) {
    /* intialize modelData */
    fmu2_model_interface_setupDataStruc(comp->fmuData);
    initializeDataStruc(comp->fmuData);
  }
This reduces the memory leak from 200 to 25 MB/min for the minimal DrumBoiler example given above -- meaning that fmi2Terminate and deInitializeDataStruc called from there leave a lot of unused memory allocated.
The remaining 25 MB/min are almost the same as for the drumboiler_sp optimization. This should mean that the memory is lost during model evaluations (like local data of Media function calls or data for equation systems?).
The good news is that a minimal example that just evaluated the trivial drum boiler example repeatedly does not leak memory anymore if fmi2Terminate is not called:
while true {puts "--- new run ---"; prg_name DynamicOpt; mdl_path DIC.fmu; mdl_logging 4; prg_setup; prg_simulate}
comment:4 by , 10 years ago
The last statement shall say "trivial double integrator". To summarize correctly:
DIC leaks no memory if fmi2Terminate is not called:
$ ./odc
% while true {puts "--- new run ---"; prg_name DynamicOpt; mdl_path DIC.fmu; mdl_logging 4; prg_setup; prg_simulate}
This points to a leak in the OpenModelica runtime between deInitializeDataStruc called from fmi2Terminate and deInitializeDataStruc called from fmi2Reset.
DrumBoiler featuring Fluid and Media still leaks 25 MB/min, even though fmi2Terminate is not called. This leak appears to be proportional to the number of model evaluations.
$ ./odc
% while true {puts "--- new run ---"; prg_name DynamicOpt; mdl_path DrumBoiler.fmu; mdl_logging 4; prg_setup; mdl_u0 {0 1}; prg_simulate}
comment:5 by , 10 years ago
The memory leak still persists. The FMU and the linked SimulationRuntimeC appear to use the Boehm garbage collector library. This is dangerous if linked into code that uses a similar garbage collector library, see http://www.hboehm.info/gc/#users. It is also not in accordance with the FMI specification.
The FMI specification requires that FMUs call allocate/freeMemory functions provided by the calling environment.
Why does the simulation runtime embedded with an FMU need garbage collection at all?
Is it conceivable to further develop the simulation runtime so that it first has deterministic memory management and then calls allocate/freeMemory provided by the calling environment?
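The requirement that an FMU allocates through functions provided by the calling environment can be illustrated with a small sketch. The types below are simplified stand-ins modeled on the allocateMemory/freeMemory callbacks of FMI 2.0, not the declarations from the real FMI headers. The names Callbacks, host_alloc, host_free, MockInstance, mock_instantiate and mock_free_instance are hypothetical and exist only for this illustration. The host counts live blocks, so memory that the FMU forgets to return becomes visible to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the FMI 2.0 callback struct (assumption:
   only the two memory callbacks are modeled here). */
typedef struct {
    void *(*allocateMemory)(size_t nobj, size_t size);
    void  (*freeMemory)(void *obj);
} Callbacks;

/* Host side: count live blocks so a leak inside the FMU becomes visible. */
static long live_blocks = 0;

static void *host_alloc(size_t nobj, size_t size) {
    void *p = calloc(nobj, size);   /* calloc-style, zero-initialized */
    if (p != NULL) live_blocks++;
    return p;
}

static void host_free(void *obj) {
    if (obj != NULL) live_blocks--;
    free(obj);
}

/* Hypothetical FMU side: all memory goes through the host callbacks. */
typedef struct {
    const Callbacks *cb;
    double *states;
} MockInstance;

static MockInstance *mock_instantiate(const Callbacks *cb, size_t n_states) {
    assert(cb != NULL && cb->allocateMemory != NULL && cb->freeMemory != NULL);
    MockInstance *inst = cb->allocateMemory(1, sizeof *inst);
    if (inst == NULL) return NULL;
    inst->cb = cb;
    inst->states = cb->allocateMemory(n_states, sizeof *inst->states);
    if (inst->states == NULL) {
        cb->freeMemory(inst);       /* do not leak the partially built instance */
        return NULL;
    }
    return inst;
}

static void mock_free_instance(MockInstance *inst) {
    if (inst == NULL) return;
    const Callbacks *cb = inst->cb; /* copy before the instance is freed */
    cb->freeMemory(inst->states);
    cb->freeMemory(inst);
}
```

A host would pass `Callbacks cb = { host_alloc, host_free };` to mock_instantiate and check that live_blocks returns to zero after mock_free_instance. A garbage-collected runtime inside the FMU bypasses exactly this kind of accounting.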
comment:6 by , 10 years ago
I will look into fixing the leak. It might however not be a leak as the GC doesn't give the memory back to the system (at least not in Linux).
You need a garbage collector for Modelica models as otherwise you cannot handle while loops in functions because you don't know how much memory is needed beforehand.
comment:7 by , 10 years ago
It doesn't sound intended that the memory needed for a while loop in a Modelica function is not known in advance. Could you post an example?
Couldn't the memory anyhow be allocated with deterministic calls to malloc and free, instead of using a garbage collector?
comment:8 by , 10 years ago
As an example, maybe something like this:
function f
  input String i;
  output String s;
algorithm
  while someExternalFunctionCondition() loop
    s := s + someExternalFunction(i);
  end while;
end f;

model M
  String s = f();
end M;
It might be possible to allocate the memory deterministically with calls to malloc and free, but I think you need a lot of analysis for that and we don't have that yet.
There is another garbage collection available in our runtime, a memory pool.
Maybe we could use that for FMI, I will investigate.
comment:9 by , 10 years ago
Note that a garbage collector will know that previous s can be de-allocated.
Also, if you don't free previous s then you can run out of memory in that loop.
How would you deterministically know when to free the previous s?
If you just de-allocate at the end of the program that doesn't work.
comment:10 by , 10 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:11 by , 10 years ago
Now I get your point.
A more high level memory pool sounds good for an FMU that is intended to be linked into arbitrary other programs.
comment:12 by , 10 years ago
It's not really a good idea to use a memory pool :) The memory requirements are a lot higher (possibly infinite) since we can not free memory in it until we finish the FMI API call, as comment:9 says.
Anyway, if an FMU must not call any malloc/free, it cannot use most existing C libraries (which internally use malloc/free). This makes FMUs pretty restrictive, considering users can write arbitrary C code in external C functions (meaning you cannot translate Modelica code to FMUs in a general way).
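The concern about a memory pool can be made concrete with a minimal sketch. It assumes a simple bump-pointer pool that is only reclaimed as a whole at the end of an API call. This is not the OpenModelica memory pool, and Pool, pool_init, pool_alloc, pool_reset, pool_destroy and pool_concat_loop are hypothetical names. The loop mimics s := s + piece from comment:8, and every intermediate string stays in the pool until the reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bump-pointer pool: memory is only reclaimed by pool_reset.
   Alignment is ignored because only char data is stored in this sketch. */
typedef struct {
    char  *base;
    size_t capacity;
    size_t used;
} Pool;

static int pool_init(Pool *p, size_t capacity) {
    p->base = malloc(capacity);
    p->capacity = (p->base != NULL) ? capacity : 0;
    p->used = 0;
    return p->base != NULL;
}

static void *pool_alloc(Pool *p, size_t size) {
    if (size > p->capacity - p->used) return NULL;  /* pool exhausted */
    void *out = p->base + p->used;
    p->used += size;
    return out;
}

/* Stands for "the API call has finished": everything is released at once. */
static void pool_reset(Pool *p) { p->used = 0; }

static void pool_destroy(Pool *p) {
    free(p->base);
    p->base = NULL;
    p->capacity = 0;
    p->used = 0;
}

/* Mimics s := s + piece in a loop. Returns the final string,
   or NULL if the pool ran out of memory. */
static const char *pool_concat_loop(Pool *p, const char *piece, int iterations) {
    assert(p != NULL && piece != NULL);
    size_t piece_len = strlen(piece);
    size_t len = 0;
    char *s = pool_alloc(p, 1);
    if (s == NULL) return NULL;
    s[0] = '\0';
    for (int i = 0; i < iterations; i++) {
        char *next = pool_alloc(p, len + piece_len + 1);
        if (next == NULL) return NULL;
        memcpy(next, s, len);
        memcpy(next + len, piece, piece_len + 1);   /* includes terminator */
        len += piece_len;
        s = next;   /* the previous s cannot be handed back to the pool */
    }
    return s;
}
```

With the piece "ab" and three iterations the final string needs 7 bytes including its terminator, but the pool holds 1 + 3 + 5 + 7 = 16 bytes until pool_reset is called. The pool usage grows quadratically with the number of iterations, and a loop with an unbounded iteration count has no fixed upper bound.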
comment:13 by , 10 years ago
The problem with the Boehm garbage collector is that it operates on a low level and it uses heuristics to find out if a piece of memory is still referenced from somewhere else. Meanwhile I saw both possible negative consequences: a memory leak as reported in this ticket and a crash if the FMU is loaded into a program that uses a similar but not the same garbage collector library -- the analysis is not restricted to the memory allocated by the FMU.
The memory pool could be allocated regularly through the calling environment of the FMU. This should also include typical external libraries called from Modelica models (like numerical routines or maybe system calls) that do not allocate internal memory.
Also I would hope that the case of a memory pool with infinite size would be rare or, if it really became critical, could be avoided by reformulating the model or investing into the code generation.
comment:14 by , 10 years ago
It shouldn't be dangerous to mix garbage collectors as this one is conservative. And we make sure that the allocated variables are reachable by the roots.
comment:15 by , 10 years ago
Yes, it should work fine with Boehm GC. The leak is probably because we don't use it :)
However, I think we should have a flag for the FMI generation so we could switch to the memory pool,
so people can have that option if they use the FMU on a very restricted system or for some other reason.
follow-up: 17 comment:16 by , 10 years ago
Rüdiger, how did you generate the FMU? Where is the Modelica model that you use?
comment:17 by , 10 years ago
Replying to adrpo:
Rüdiger, how did you generate the FMU? Where is the Modelica model that you use?
Never mind, I found it: Modelica.Fluid.Examples.DrumBoiler.DrumBoiler.
comment:18 by , 10 years ago
I was using the two Modelica models contained in the HQP distribution. The FMUs are compiled with omc. Then they are loaded once and simulated repeatedly. A simple double integrator does not leak memory anymore since fmi2Terminate is not called, but just fmi2Reset between two subsequent simulations. The DrumBoiler model is basically the one from MSL. It leaks memory -- from what you explained it could be the dynamically allocated record instances for Media functions.
See the commands in the beginning of the ticket -- under Linux you need git, gcc and tcl-dev. Under MinGW it should work as well.
comment:19 by , 10 years ago
I'm now running valgrind on odc running while true {source drumboiler_sp.tcl} to see if it can find out where the leak is. I'll keep you updated.
by , 10 years ago
Attachment: | valgrind-trace.txt added |
---|
comment:20 by , 10 years ago
As far as I can tell until now it seems this is the leak:
==3998== by 0x8BB0111: allocateHybrdData (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
comment:21 by , 10 years ago
Is it the single malloc on the data that is missing or is the free function simply never called?
comment:22 by , 10 years ago
As far as I can tell free is never called. But the question I think is how is the FMU used.
I think Rüdiger loads the FMU dll, then initializes the FMU, runs it, and then resets it for the next run via fmi2Reset. I guess we don't free the memory on fmi2Reset; we just allocate new memory. I will have a closer look at how the FMI things work, as I have just a vague idea at the moment.
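The suspected mechanism, a reset that does not free solver data before the next initialization allocates it again, can be sketched as follows. This is a mock under stated assumptions, not the OpenModelica runtime. MockModel, the tracked_* helpers, mock_enter_initialization, mock_reset_leaky, mock_reset_fixed and run_cycles are hypothetical names. Allocations are recorded in a small table so that lost blocks can be counted and still be released at the end of the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 64

/* Table of outstanding allocations (hypothetical instrumentation). */
static void *tracked[MAX_TRACKED];

static void *tracked_calloc(size_t n, size_t size) {
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] == NULL) {
            tracked[i] = calloc(n, size);
            return tracked[i];
        }
    }
    return NULL;    /* table full */
}

static void tracked_free(void *p) {
    if (p == NULL) return;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] == p) {
            tracked[i] = NULL;
            free(p);
            return;
        }
    }
}

static int tracked_live(void) {
    int n = 0;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] != NULL) n++;
    }
    return n;
}

static void tracked_release_all(void) {
    for (int i = 0; i < MAX_TRACKED; i++) {
        free(tracked[i]);
        tracked[i] = NULL;
    }
}

/* solver_data stands in for linear/nonlinear system data. */
typedef struct { double *solver_data; } MockModel;

static void mock_enter_initialization(MockModel *m) {
    m->solver_data = tracked_calloc(8, sizeof *m->solver_data);
}

/* Leaky variant: the pointer is dropped, so the block can no longer be freed
   by the model and the next initialization allocates a fresh one. */
static void mock_reset_leaky(MockModel *m) { m->solver_data = NULL; }

/* Fixed variant: the solver data is released before it is re-allocated. */
static void mock_reset_fixed(MockModel *m) {
    tracked_free(m->solver_data);
    m->solver_data = NULL;
}

/* Runs initialize/reset cycles and returns the number of blocks that are
   still allocated afterwards. The table is cleaned up before returning. */
static int run_cycles(int runs, int use_fixed_reset) {
    assert(runs >= 0 && runs <= MAX_TRACKED);
    MockModel m = { NULL };
    for (int i = 0; i < runs; i++) {
        mock_enter_initialization(&m);
        if (use_fixed_reset) mock_reset_fixed(&m);
        else mock_reset_leaky(&m);
    }
    int leaked = tracked_live();
    tracked_release_all();
    return leaked;
}
```

In this mock, run_cycles(5, 0) leaves five blocks behind, one per cycle, whereas run_cycles(5, 1) leaves none. This matches the observation that the leak grows with the number of resets.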
comment:23 by , 10 years ago
Simply changing allocateHybrdData to use GC_malloc is a simple solution ;)
comment:24 by , 10 years ago
These seem to be all the sources of leaks (mostly linear/non-linear/hybrid systems data, jacobians, etc):
==3998== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998== by 0x8B9813D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998== by 0x8B9D6D7: initializeNonlinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x842D6A7: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998== at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998== by 0x8557F6E: DrumBoiler_initialAnalyticJacobianNLSJac1 (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)
==3998== by 0x8B9835D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998== at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998== by 0x855760E: DrumBoiler_initialAnalyticJacobianNLSJac2 (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)
==3998== by 0x8B9835D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998== at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998== by 0x8556B2E: DrumBoiler_initialAnalyticJacobianNLSJac3 (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)
==3998== by 0x8B9835D: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998== by 0x8B98BC2: allocateLapackData (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x8B98328: initializeLinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x842D6BA: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)

==3998== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3998== by 0x8BB0111: allocateHybrdData (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x8B9D7F3: initializeNonlinearSystems (in /home/adrpo/om/build/lib/omc/libSimulationRuntimeC.so)
==3998== by 0x842D6A7: fmi2EnterInitializationMode (in /home/adrpo/hqp/odc/.DrumBoiler/binaries/linux64/DrumBoiler.so)
Yes, we could just use GC_malloc for them. I'll give it a try.
by , 10 years ago
Attachment: | valgrind-trace-once.txt added |
---|
running valgrind on odc running source drumboiler_sp.tcl just once and then exit
comment:25 by , 10 years ago
Rüdiger, you can have a quick look at valgrind-trace-once.txt to see if there are some leaks in the odc code. It seems so, but with valgrind one is never sure :).
comment:26 by , 10 years ago
Status: | assigned → accepted |
---|
comment:27 by , 10 years ago
Thank you for the tests! It's unlikely that there is a leak in odc. Also note that there is no leak if running the same test with the simpler double integrator model:
% while true {source dic_fmu.tcl}
comment:28 by , 10 years ago
Regarding FMI: fmi2Reset should just reset internal data, whereas fmi2Terminate would free it.
comment:29 by , 10 years ago
It seems that the free of system data is done in fmi2Terminate.
You said that if you use fmi2Terminate + fmi2FreeInstance you get more memory leaks so I will check this usage pattern too.
comment:30 by , 10 years ago
Partial fix in r25105. Now using fmi2Reset + fmi2EnterInitializationMode will not lose memory anymore.
I will check with fmi2Terminate + fmi2FreeInstance later on.
comment:31 by , 10 years ago
The memory leak is gone in my tests as well. The second case is of less importance here, because it is not allowed to call fmi2Terminate after a simulation error. Such an error can occur in single optimization iterations and the optimization solver must be able to re-simulate with different inputs. Fortunately FMI 2.0 has introduced fmi2Reset for such cases.
Btw. possibly fmi2Terminate could keep all memory and just change the state of the FMU, in order to fulfill the spec:
Informs the FMU that the simulation run is terminated. After calling this function, the final values of all variables can be inquired with the fmi2GetXXX(..) functions. It is not allowed to call this function after one of the functions returned with a status flag of fmi2Error or fmi2Fatal.
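The idea that terminate keeps all memory and only changes the state can be sketched like this. It is a mock under stated assumptions, not the OpenModelica FMU implementation. MockFmu and the mockfmu_* functions are hypothetical names that loosely mirror fmi2Instantiate, fmi2Terminate, fmi2Reset, fmi2GetXXX and fmi2FreeInstance. Values stay readable after termination, reset reuses the existing storage, and memory is released in exactly one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { STATE_INITIALIZED, STATE_TERMINATED } MockState;

typedef struct {
    MockState state;
    double   *values;
    size_t    n;
} MockFmu;

static MockFmu *mockfmu_instantiate(size_t n) {
    MockFmu *f = malloc(sizeof *f);
    if (f == NULL) return NULL;
    f->values = calloc(n, sizeof *f->values);
    if (f->values == NULL) {
        free(f);
        return NULL;
    }
    f->n = n;
    f->state = STATE_INITIALIZED;
    return f;
}

/* Writing is only accepted while the run is active. */
static int mockfmu_set_real(MockFmu *f, size_t index, double value) {
    assert(f != NULL);
    if (f->state != STATE_INITIALIZED || index >= f->n) return -1;
    f->values[index] = value;
    return 0;
}

/* Reading is allowed in both states, so final values can still be
   inquired after termination. Returns 0 on success, -1 on a bad index. */
static int mockfmu_get_real(const MockFmu *f, size_t index, double *out) {
    assert(f != NULL && out != NULL);
    if (index >= f->n) return -1;
    *out = f->values[index];
    return 0;
}

/* Terminate only flips the state; the variable storage stays valid. */
static void mockfmu_terminate(MockFmu *f) {
    assert(f != NULL);
    f->state = STATE_TERMINATED;
}

/* Reset reuses the existing storage instead of re-allocating it. */
static void mockfmu_reset(MockFmu *f) {
    assert(f != NULL);
    for (size_t i = 0; i < f->n; i++) f->values[i] = 0.0;
    f->state = STATE_INITIALIZED;
}

/* The only place where memory is released. */
static void mockfmu_free_instance(MockFmu *f) {
    if (f == NULL) return;
    free(f->values);
    free(f);
}
```

Because neither terminate nor reset allocates or frees anything in this design, repeated simulate/terminate/reset cycles cannot change the memory footprint. The trade-off is that the instance keeps its full allocation until mockfmu_free_instance is called.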
comment:32 by , 10 years ago
Milestone: | 1.9.2 → 1.9.3 |
---|
Milestone changed to 1.9.3 since 1.9.2 was released.
comment:37 by , 8 years ago
Milestone: | 1.11.0 → 1.12.0 |
---|
Milestone moved to 1.12.0 due to 1.11.0 already being released.
comment:39 by , 7 years ago
Milestone: | 1.12.0 → Future |
---|
The milestone of this ticket has been reassigned to "Future".
If you think the issue is still valid and relevant for you, please select milestone 1.13.0 for back-end, code generation and run-time issues, or 2.0.0 for front-end issues.
If you are aware that the problem is no longer present, please select the milestone corresponding to the version of OMC you used to check that, and set the status to "worksforme".
In both cases, a short informative comment would be welcome.
Let me add some more information. It appears that memory leaks if fmi2Reset or fmi2FreeInstance is called.
Here is a minimal example that just instantiates the model and evaluates it at t=0. Moreover, all FMI calls are logged:
This example leaks about 200 MB/minute, ten times more than the optimization run above, because the model is reset more frequently. The leak can be reduced to 100 MB/minute if the call to fmi2Reset is replaced with a completely new instantiation, i.e. fmi2FreeInstance followed by fmi2Instantiate (see hqp/hxi/sfun_fmu.c, line 827ff -- https://github.com/omuses/hqp/blob/8b83fce8da213131c0ce43f3a0c91d28c82dd6d6/hxi/sfun_fmu.c#L827).