Opened 4 years ago

Closed 4 years ago

#6275 closed defect (fixed)

--daeMode segfault with -idaLS=klu

Reported by: Andreas Heuermann
Owned by: Andreas Heuermann
Priority: blocker
Milestone: 1.17.0
Component: Run-time
Version: v1.17.0-dev
Keywords: sundials, ida, klu, daeMode
Cc:

Description

Reported by hanshell in the forum. Original post: https://openmodelica.org/forum/default-topic/3155-daemode-segfault-with-idals-klu

Simulation options: -idaLS=klu -lv LOG_SOLVER

Limited backtrace at point of segmentation fault
/lib64/libpthread.so.0(+0x132d0)[0x7f3d054022d0]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libsundials_sunlinsolklu.so.3(SUNLinSolSolve_KLU+0x92)[0x7f3d03b70f52]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libsundials_idas.so.4(idaLsSolve+0x1b7)[0x7f3d02860e17]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libsundials_idas.so.4(IDACalcIC+0x6b3)[0x7f3d0285df23]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(ida_event_update+0x2b1)[0x7f3d08404021]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(updateDiscreteSystem+0x89)[0x7f3d083ee969]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(initialization+0x40b)[0x7f3d0841886b]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(initializeModel+0xde)[0x7f3d083f9cce]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(solver_main+0x140)[0x7f3d083fae30]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(+0xd3e41)[0x7f3d08441e41]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(startNonInteractiveSimulation+0xd1c)[0x7f3d08440f5c]
/usr/local/bin/../lib/x86_64-linux-gnu/omc/libSimulationRuntimeC.so(_main_SimulationRuntime+0x72)[0x7f3d08443fb2]
./DFIM10(main+0x19b)[0x416ad1]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f3d0505834a]
./DFIM10(_start+0x2a)[0x41490a]
LOG_SOLVER        | info    | NO override given on the command line.
LOG_SOLVER        | info    | numberOfIntervals = 10500
LOG_SOLVER        | info    | Allocated simulation result data storage for method 'mat' and file='DFIM10_res.mat'
LOG_SOLVER        | info    | recognized solver: ida
LOG_SOLVER        | info    | Initializing IDA DAE Solver
LOG_SOLVER        | info    | ## IDA ## Initializing solver of size 111 in DAE mode.
|                 | |       | | The relative tolerance is 1e-08. Following absolute tolerances are used for the states:
|                 | |       | | IDA uses internal root finding method YES
|                 | |       | | Maximum integration order 5
|                 | |       | | use equidistant time grid YES
|                 | |       | | IDA linear solver method selected ida use sparse direct solver KLU. (default)
|                 | |       | | Jacobian is calculated by "Colored numerical Jacobian, which is default for dassl and ida. With option -idaLS=klu a sparse matrix is used."
|                 | |       | | initial step size is set automatically.
LOG_SOLVER        | info    | ##IDA## do event update at -0.5
LOG_SOLVER        | info    | ##IDA## corrected step-size at 2.22044604925031e-16
LOG_SOLVER        | info    | #### IDA error message #####
|                 | |       | |  -> error code -22
|                 | |       | |  -> module IDAS
|                 | |       | |  -> function IDACalcIC
|                 | |       | |  Message: tout1 too close to t0 to attempt initial condition calculation.
LOG_SOLVER        | info    | ##IDA## IDACalcIC run status -22.
|                 | |       | Iterations : 0
LOG_SOLVER        | info    | ##IDA## first event iteration failed. Start next try without line search!

Simulation options: -idaLS=dense -lv LOG_SOLVER
Results are as expected:

LOG_SOLVER        | info    | NO override given on the command line.
LOG_SOLVER        | info    | numberOfIntervals = 10500
LOG_SOLVER        | info    | Allocated simulation result data storage for method 'mat' and file='DFIM10_res.mat'
LOG_SOLVER        | info    | recognized solver: ida
LOG_SOLVER        | info    | Initializing IDA DAE Solver
LOG_SOLVER        | info    | ## IDA ## Initializing solver of size 111 in DAE mode.
|                 | |       | | The relative tolerance is 1e-08. Following absolute tolerances are used for the states:
|                 | |       | | IDA uses internal root finding method YES
|                 | |       | | Maximum integration order 5
|                 | |       | | use equidistant time grid YES
|                 | |       | | IDA linear solver method selected ida internal dense method.
|                 | |       | | Jacobian is calculated by "Colored numerical Jacobian, which is default for dassl and ida. With option -idaLS=klu a sparse matrix is used."
|                 | |       | | initial step size is set automatically.
LOG_SOLVER        | info    | ##IDA## do event update at -0.5
LOG_SOLVER        | info    | ##IDA## corrected step-size at 2.22044604925031e-16
LOG_SOLVER        | info    | #### IDA error message #####
|                 | |       | |  -> error code -22
|                 | |       | |  -> module IDAS
|                 | |       | |  -> function IDACalcIC
|                 | |       | |  Message: tout1 too close to t0 to attempt initial condition calculation.
LOG_SOLVER        | info    | ##IDA## IDACalcIC run status -22.
|                 | |       | Iterations : 0
LOG_SOLVER        | info    | ##IDA## first event iteration failed. Start next try without line search!
LOG_SOLVER        | info    | ##IDA## IDACalcIC run status 0.
|                 | |       | Iterations : 0
LOG_SUCCESS       | info    | The initialization finished successfully without homotopy method.
LOG_SOLVER        | info    | Wrote parameters to the file after initialization (for output formats that support this)
LOG_SOLVER        | info    | Start numerical solver from -0.5 to 10
LOG_SOLVER        | info    | Sparse structure of DAE sparse pattern [size: 111x111]
|                 | |       | | 393 nonzero elements
|                 | |       | | Transposed sparse structure (rows: states)

With version v1.16.0 the sparse direct solver was working.

OMC version: OMCompiler v1.17.0-dev.224+g4c291412fa
OS: x86_64 x86_64 x86_64 GNU/Linux

Attachments (1)

GDB.dae-report.zip (4.1 KB) - added by anonymous 4 years ago.


Change History (9)

comment:1 by Andreas Heuermann, 4 years ago

I can't find a model where I get a segmentation fault on Ubuntu 20.04.
Sure, there are models which can't be solved with daeMode, but that's a different issue, and those can't be solved with omc v1.16.1 either.

So to fix this issue I need a minimal working example.

comment:2 by Mahder Alemseged Gebremedhin, 4 years ago

You are talking to yourself, Andreas. Did you forget to CC the person who reported the problem? Or are you just writing to your future self?

comment:3 by Andreas Heuermann, 4 years ago

I wrote in the forum post as well. Not sure how to CC that person here.

by anonymous, 4 years ago

Attachment: GDB.dae-report.zip added

comment:4 by anonymous, 4 years ago

problem update:

  1. segfault with all models (simple & complex)
  2. some additional information from GDB in attached "GDB.dae-report.zip"

I'm using:

clang version 7.0.1 (tags/RELEASE_701/final 349238)
gcc (SUSE Linux) 7.5.0
cmake version 3.10.2
OMCompiler v1.17.0-dev.224+g4c291412fa

comment:5 by Andreas Heuermann, 4 years ago

Now I could reproduce your error and will start to fix it.

The strange thing is that it only fails for me if I use -lv=LOG_JAC,LOG_DEBUG. If I remove either of the two logs, it runs. So I'm fairly confident that I'll find the bug.
Can you confirm that your model runs when the logs are disabled?

Also, I noticed that you are using the old frontend in the script you added to this ticket. I suggest adding setCommandLineOptions("-d=newInst"); to your models (if you are using mos scripts at all; otherwise ignore my comment. OMEdit will use the new frontend by default).

comment:6 by Andreas Heuermann, 4 years ago

It's a classic buffer overflow. I opened a PR for it: https://github.com/OpenModelica/OpenModelica/pull/7039
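
For readers unfamiliar with the bug class, here is a minimal C sketch of how a bounded write avoids a buffer overflow when a log line is assembled in a fixed-size buffer. It is an illustration only and is not taken from the PR or the OpenModelica sources: the function name format_row, its parameters, and the scenario of formatting a matrix row are all assumptions. The unsafe variant would call sprintf in the loop and write past the end of buf as soon as the row is longer than the buffer.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not from the OpenModelica code base:
 * formats n values of `row` into `buf`, which holds `size` bytes.
 * Returns 0 on success, -1 if the complete row does not fit. */
int format_row(char *buf, size_t size, const double *row, int n)
{
    size_t used = 0;

    if (size == 0) {
        return -1;
    }
    buf[0] = '\0';

    for (int i = 0; i < n; i++) {
        /* snprintf writes at most (size - used) bytes including the
         * terminator and returns the length it would have needed. */
        int w = snprintf(buf + used, size - used, "%g ", row[i]);
        if (w < 0 || (size_t)w >= size - used) {
            buf[used] = '\0';  /* drop the partially written entry */
            return -1;
        }
        used += (size_t)w;
    }
    return 0;
}
```

Called with a buffer that is too small, the function returns -1 and leaves a valid, truncated string in buf instead of corrupting adjacent memory.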

comment:7 by Johann Hell <hans.hell@…>, 4 years ago

!!problem SOLVED!!

After removing a three-year-old SuiteSparse installation (libs & includes) from my /usr/local/lib/ and recompiling omc, everything is fine.

I also found the problem with the combination of the logs -lv=LOG_JAC,LOG_DEBUG, but this was not the root cause.
Thanks for your patience and support.

comment:8 by Andreas Heuermann, 4 years ago

Resolution: fixed
Status: new → closed