Opened 6 years ago

Closed 6 years ago

#5049 closed defect (fixed)

Numerical issues close to steady-state in ODE mode

Reported by: Francesco Casella
Owned by: Willi Braun
Priority: blocker
Milestone: 1.13.0
Component: Run-time
Version:
Keywords:
Cc: marianne.saugier@…, adrien.guironnet@…, Andrea Bartolini

Description

Please check the PowerGrids.IPSLValidation.Examples.SmallCase.OpenLoopLoadStepPowerGrids model. One can monitor the status of the simulation by looking at gen_pwGenerator4W__GEN______SM.CePu which is the electrical torque in per unit. The model starts in steady-state, then a step change is applied to the load at time = 1. What should ensue is a damped oscillation which eventually dies out to a new steady state.

The simulation currently runs fine because it uses --daeMode=new. However, if the corresponding vendor annotation is removed and the standard ODE mode integration is used, a weird thing happens: at time = 21.31 the CePu variable suddenly drops to zero for a short time, then it does so again around time = 23.8, and eventually a new transient similar to the first is actually triggered (see dassl.png attachment). This makes absolutely no sense, as the system should stay at steady state indefinitely for time > 20.

If I change the solver from DASSL to IDA, the result is different but qualitatively similar, see ida.png.

I guess for some reason the causalized code is numerically unstable close to steady state, and thus tricks the ODE solver in some unpredictable way.

The root cause of this behaviour should be identified and the issue removed before the 1.13.0 release, because it may also affect other models belonging to different domains.

Attachments (4)

dassl.PNG (22.3 KB ) - added by Francesco Casella 6 years ago.
ida.PNG (19.4 KB ) - added by Francesco Casella 6 years ago.
step-sizes-standard-solver.png (6.8 KB ) - added by Francesco Casella 6 years ago.
step-sizes-kinsol.png (6.0 KB ) - added by Francesco Casella 6 years ago.


Change History (13)

by Francesco Casella, 6 years ago

Attachment: dassl.PNG added

by Francesco Casella, 6 years ago

Attachment: ida.PNG added

comment:1 by Francesco Casella, 6 years ago

Cc: marianne.saugier@… adrien.guironnet@… Andrea Bartolini added

comment:2 by Francesco Casella, 6 years ago

I analyzed the issue with @wbraun. The problem is the default nonlinear solver that solves the equations for iqPu, idPu, iQ1Pu, iQ2Pu, iDPu, etc. At each new time step, the solver makes a linear extrapolation of the previous solution to compute an initial guess for the iterations. For some reason, when the solution is close to steady-state, this extrapolation breaks badly, producing very large results (1e10 or more), which alter the scaling values of the solver, which in turn accepts whatever bad solution it finds because of the bogus scaling.

In fact, the problem does not show up if one uses kinsol as the nonlinear solver instead.
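The initial-guess extrapolation described above can be sketched as follows. This is only an illustration of the formula, not the OpenModelica run-time code: the function name and all numbers are assumptions. The ticket does not say why the extrapolation broke, so the second call merely shows one hypothetical way (two nearly coincident previous time points) in which a linear extrapolation can amplify a negligible difference.

```python
def extrapolate_guess(t_prev, x_prev, t_curr, x_curr, t_next):
    """Linearly extrapolate each unknown from the last two points to t_next."""
    dt = t_curr - t_prev
    if dt == 0.0:
        # No slope information available: fall back to the latest solution.
        return list(x_curr)
    factor = (t_next - t_curr) / dt
    return [xc + factor * (xc - xp) for xp, xc in zip(x_prev, x_curr)]


# Hypothetical values, chosen only to show how sensitive the formula can be.
x_prev = [1.0, 250.0]
x_curr = [1.0 + 1e-9, 250.0]  # solution essentially at steady state

# Well-separated previous points: the guess stays next to the solution.
good = extrapolate_guess(10.0, x_prev, 10.1, x_curr, 10.2)

# Nearly coincident previous points: the tiny difference is hugely amplified.
bad = extrapolate_guess(10.0, x_prev, 10.0 + 1e-12, x_curr, 10.1)

print(good[0], bad[0])
```

In the second call the first unknown moves far away from 1.0 although the underlying solution has barely changed, which is the kind of guess that could then distort any scaling derived from it.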

@wbraun will further investigate the root cause of this issue and fix it asap on the nightly build.

comment:3 by Willi Braun, 6 years ago

Fixed in 6a19517/OMCompiler.

comment:4 by Willi Braun, 6 years ago

Resolution: fixed
Status: new → closed

in reply to:  3 comment:5 by Francesco Casella, 6 years ago

Replying to wbraun:

> Fixed in 6a19517/OMCompiler.

This fixed the issue with the extrapolation of the solution at the previous step, so that a correct solution is now found.

As discussed with @wbraun over e-mail, there is a remaining issue, which only affects the efficiency of the solution process, not its correctness. If one simulates OpenLoopLoadStepPowerGrids with the standard solver, StopTime = 400 and -noEquidistantTimeGrid, in order to see the actual solver steps in the solution, it is apparent that the step adaptation mechanism is not working correctly. The state variables of the system completely stop oscillating around time = 40, and reach a steady state around time = 120. One would expect a high-order implicit solver such as DASSL to increase the step size significantly between time = 40 and time = 120, and then to take increasingly long steps once the solution has reached steady state.

This does not happen at all, as shown in the attached step-sizes-standard-solver.png figure. The solver starts increasing the step size around time = 80, but then goes back to short time steps of around 0.01 s. Later on, it manages to increase the step size to 10-20 s between time = 150 and time = 220, but then goes back to shorter time steps again.
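Since -noEquidistantTimeGrid stores the actual solver steps in the result, the step sizes in such a plot are simply the differences between consecutive output times. A minimal sketch, assuming the time column has already been read into a Python list (the helper names and the sample values are made up, not data from this ticket):

```python
def step_sizes(times):
    """Return the differences between consecutive output times."""
    return [t1 - t0 for t0, t1 in zip(times, times[1:])]


def largest_step_in_window(times, t_start, t_end):
    """Largest step that begins inside [t_start, t_end), or None if there is none."""
    steps = [t1 - t0 for t0, t1 in zip(times, times[1:]) if t_start <= t0 < t_end]
    return max(steps) if steps else None


# Made-up time points for illustration only.
times = [0.0, 0.5, 1.0, 1.01, 1.02, 1.05, 1.5, 3.0, 8.0]

print(step_sizes(times))
print(largest_step_in_window(times, 1.0, 1.05))
```

Applied to the real time vector, a window such as 40 to 120 would show directly whether the solver ever takes the long steps one expects there.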

by Francesco Casella, 6 years ago

Attachment: step-sizes-standard-solver.png added

comment:6 by Francesco Casella, 6 years ago

The culprit is the default nonlinear solver, which causes numerous convergence failures (1480 for a 400 s simulation). If kinsol is selected instead, there are 0 convergence failures and the step sizes increase as expected once the solution gets close to steady-state, see step-sizes-kinsol.png.

By comparing the two step-size diagrams, it is apparent that the standard solver has issues as soon as the step size goes above 0.1 s (which happens around time = 20); when this happens, sooner or later there are solver failures, which are only recovered when the ODE solver step size is reduced below 0.01 s.

Although the test case under consideration contains nonlinear equations that need to be solved to causalize the system, all the involved variables have very small variations after time = 20, so that the system is essentially working in a linear transient regime. Thus, there should be no convergence issues at all (which is in fact what happens when kinsol is used).
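The expectation stated above can be illustrated with a toy problem. The sketch below is not the PowerGrids system nor any OpenModelica solver: the scalar equation z + eps*z^3 = b, the value of eps and the tolerances are all assumptions, picked to mimic a weakly nonlinear equation whose data change very little from one step to the next. Started from the previous solution, a plain Newton iteration needs only a couple of iterations.

```python
def newton(residual, derivative, guess, tol=1e-12, max_iter=20):
    """Scalar Newton iteration; returns (solution, iterations) or raises."""
    z = guess
    for iteration in range(1, max_iter + 1):
        step = residual(z) / derivative(z)
        z -= step
        if abs(step) <= tol * max(1.0, abs(z)):
            return z, iteration
    raise RuntimeError("Newton iteration did not converge")


EPS = 1e-3  # hypothetical strength of the nonlinearity


def make_problem(b):
    """Residual and derivative of z + EPS*z**3 - b = 0."""
    return (lambda z: z + EPS * z**3 - b,
            lambda z: 1.0 + 3.0 * EPS * z**2)


# Solution at the "previous step".
res0, der0 = make_problem(1.0)
z_prev, _ = newton(res0, der0, guess=1.0)

# Right-hand side changes slightly; reuse the previous solution as the guess.
res1, der1 = make_problem(1.0 + 1e-6)
z_new, iterations = newton(res1, der1, guess=z_prev)

print(z_new, iterations)
```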

The fact that failures are likely to occur when the step-size is increased is quite difficult to explain, as the solution is close to steady-state, so there is no dependency between the step length and the solution of the system required to get the right-hand side of the ODEs f(x,t).

The only possible connection is the extrapolation routine that computes the initial guess for the solver, which I suspect still has serious issues that were not fixed by 6a19517/OMCompiler. BTW, I guess that the lower bound of the steps (around 0.015 s) that I see in the step size plot is related to the MINIMAL_STEP_SIZE I see here.

Please review the interpolation routine until the problem reported here is fixed. There may be some issue with incorrectly handled scaling of the unknowns: the system causing the failure has variables with orders of magnitude ranging from 1 to 1e5.
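To make the scaling concern concrete, here is a hedged sketch of a convergence test that divides each unknown's error by a nominal magnitude. It is not the test used by the OpenModelica nonlinear solver; the function, the tolerance and all values are hypothetical. It only shows that a wrong nominal value (for instance one derived from a 1e10 guess) can make a significant error look negligible.

```python
import math


def scaled_norm(values, nominals):
    """Root-mean-square of the components, each divided by its nominal magnitude."""
    total = sum((v / n) ** 2 for v, n in zip(values, nominals))
    return math.sqrt(total / len(values))


TOLERANCE = 1e-6  # hypothetical relative tolerance

# Same absolute error on two unknowns of magnitude 1 and 1e5.
error = [1e-3, 1e-3]

# Sensible nominals: the error on the first unknown is clearly visible.
scaled_good = scaled_norm(error, [1.0, 1e5])

# Corrupted nominal for the first unknown: the same error seems to vanish.
scaled_bogus = scaled_norm(error, [1e10, 1e5])

print(scaled_good > TOLERANCE, scaled_bogus > TOLERANCE)
```

With the sensible nominals the check rejects the iterate, while with the corrupted nominal the same iterate would be accepted.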

by Francesco Casella, 6 years ago

Attachment: step-sizes-kinsol.png added

comment:7 by Francesco Casella, 6 years ago

Resolution: fixed
Status: closed → reopened
Summary: Numerical issues with power system models close to steady-state in ODE mode → Numerical issues close to steady-state in ODE mode with default nonlinear solver

in reply to:  6 ; comment:8 by Willi Braun, 6 years ago

Replying to casella:

> The culprit is the default nonlinear solver, which causes numerous convergence failures (1480 for a 400 s simulation). If kinsol is selected instead, there are 0 convergence failures and the step sizes increase as expected once the solution gets close to steady-state, see step-sizes-kinsol.png.
>
> By comparing the two step-size diagrams, it is apparent that the standard solver has issues as soon as the step size goes above 0.1 s (which happens around time = 20); when this happens, sooner or later there are solver failures, which are only recovered when the ODE solver step size is reduced below 0.01 s.
>
> Although the test case under consideration contains nonlinear equations that need to be solved to causalize the system, all the involved variables have very small variations after time = 20, so that the system is essentially working in a linear transient regime. Thus, there should be no convergence issues at all (which is in fact what happens when kinsol is used).
>
> The fact that failures are likely to occur when the step-size is increased is quite difficult to explain, as the solution is close to steady-state, so there is no dependency between the step length and the solution of the system required to get the right-hand side of the ODEs f(x,t).
>
> The only possible connection is the extrapolation routine that computes the initial guess for the solver, which I suspect still has serious issues that were not fixed by 6a19517/OMCompiler.

Thank you for that analysis, but we use the same extrapolation routine for all non-linear solvers, so if the extrapolation had a serious issue, we would also see it in the kinsol solution. Thus the problem is most likely not the input, but rather the output of the non-linear solver.

> BTW, I guess that the lower bound of the steps (around 0.015 s) that I see in the step size plot is related to the MINIMAL_STEP_SIZE I see here.

Yes, we use MINIMAL_STEP_SIZE=1e-12 as an internal lower bound, but it is not hit here. It is used there to distinguish between two different steps.

> Please review the interpolation routine until the problem reported here is fixed. There may be some issue with incorrectly handled scaling of the unknowns: the system causing the failure has variables with orders of magnitude ranging from 1 to 1e5.

I tend to close this ticket under the title "Numerical issues close to steady-state in ODE mode" and start a new one regarding the unstable solution behavior of the default non-linear solver, since this was a hard numerical bug and the newly raised issue is more of a performance bug.

in reply to:  8 comment:9 by Francesco Casella, 6 years ago

Resolution: fixed
Status: reopened → closed
Summary: Numerical issues close to steady-state in ODE mode with default nonlinear solver → Numerical issues close to steady-state in ODE mode

Replying to wbraun:

> I tend to close this ticket under the title "Numerical issues close to steady-state in ODE mode" and start a new one regarding the unstable solution behavior of the default non-linear solver, since this was a hard numerical bug and the newly raised issue is more of a performance bug.

Agreed, see #5059.
