Opened 8 years ago

Closed 7 years ago

#4423 closed defect (fixed)

Improve LateInline of if-statements and records

Reported by: Francesco Casella Owned by: Patrick Täuber
Priority: high Milestone: 1.13.0
Component: Backend Version:
Keywords: Cc: Patrick Täuber

Description

The simulation performance of Modelica.Fluid.Examples.DrumBoiler.DrumBoiler is substantially slower than with Dymola.

Some investigation reveals that in OMC generated code there are still some references to the rhovl_p() function of the IF97 medium model, which entails the need of solving a nonlinear system with iterations. That function has a lateInline=true annotation; if that is correctly taken into account, the system becomes linear and the performance much faster as a consequence, as there is no longer any need for iterations.

Why isn't that function inlined?

Change History (19)

comment:1 by Patrick Täuber, 8 years ago

Status: newaccepted

comment:2 by Patrick Täuber, 8 years ago

It seems that the implementation of inlining functions is not finished in all detail yet. Some investigations showed that it is not possible to inline even much simpler functions. Also the new inline method which does not substitute the function but creates new variables which are added to the equation system (--inlineMethod=append) is not able to handle more difficult cases. One problem seems to be the handling of records.
By the way, Dymola does not (late-)inline the rhovl_p() function as well. Only the derivative function rhovl_p_der() is inlined which appears in the loop.

in reply to:  2 comment:3 by Francesco Casella, 8 years ago

Replying to ptaeuber:

By the way, Dymola does not (late-)inline the rhovl_p() function as well. Only the derivative function rhovl_p_der() is inlined which appears in the loop.

You are right, I wrote the ticket in a haste. Do you think handling that specific case would be doable?

comment:4 by Francesco Casella, 7 years ago

Milestone: 1.12.01.13.0

Milestone changed to 1.13.0 since 1.12.0 was released

comment:5 by Martin Sjölund, 7 years ago

The code for that function is:

if bpro.region3boundary then
  h_der := ((bpro.d*bpro.pd - bpro.T*bpro.pt)*p_der + (bpro.T*bpro.pt*
    bpro.pt + bpro.d*bpro.d*bpro.pd*bpro.cv)/bpro.dpT*p_der)/(bpro.pd*
    bpro.d*bpro.d);
else
  h_der := (1/bpro.d - bpro.T*bpro.vt)*p_der + bpro.cp/bpro.dpT*p_der;
end if;

But we don't support inline of if-statements; only if-expressions such as:

h_der := if bpro.region3boundary then
  then ((bpro.d*bpro.pd - bpro.T*bpro.pt)*p_der + (bpro.T*bpro.pt*bpro.pt + bpro.d*bpro.d*bpro.pd*bpro.cv)/bpro.dpT*p_der)/(bpro.pd*bpro.d*bpro.d)
  else (1/bpro.d - bpro.T*bpro.vt)*p_der + bpro.cp/bpro.dpT*p_der;

If MSL was updated, we would inline the code. I'll see if I can detect this pattern in the inline module.

comment:6 by Martin Sjölund, 7 years ago

PR2076 handles if-statements to if-expressions. But it doesn't seem to make much difference for the DrumBoiler model because it only handles half the calls to rhovl_p_der.

Fail:

Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.rhovl_p_der(evaporator.p, Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.boilingcurve_p(evaporator.p), der(evaporator.p))

OK:

Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.rhovl_p_der(evaporator.p, $cse1, $DER.evaporator.p) => if noEvent($cse1.region3boundary) then ($DER.evaporator.p - $cse1.pt * $DER.evaporator.p / $cse1.dpT) / $cse1.pd else (-$cse1.d ^ 2.0) * ($cse1.vp + $cse1.vt / $cse1.dpT) * $DER.evaporator.p

Probably because we don't support DAE.RSUB of the call when inlining. If we do add support for it, the CSE stuff would need to handle it because otherwise we make duplicate calls to boilingcurve_p.

comment:7 by Martin Sjölund, 7 years ago

So I have some local code that will inline things like:

SteamValve._state_a._d = Modelica.Media.Water.IF97_Utilities.rho_props_ph(evaporator.p, evaporator.h_v, Modelica.Media.Water.IF97_Utilities.waterBaseProp_ph(evaporator.p, evaporator.h_v, 0, 0))
der(evaporator._rho_v) = Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.rhovl_p_der(evaporator.p, $cse1, der(evaporator.p))

Into:

SteamValve._state_a._d = Modelica.Media.Water.IF97_Utilities.waterBaseProp_ph(evaporator.p, evaporator.h_v, 0, 0).rho
der(evaporator._rho_v) = if $cse1.region3boundary then DIVISION(der(evaporator.p) - $cse1.pt * DIVISION(der(evaporator.p), $cse1.dpT), $cse1.pd) else (-$cse1.d ^ 2.0) * ($cse1.vp + DIVISION($cse1.vt, $cse1.dpT)) * der(evaporator.p)

I suppose it's a slight improvement. But there is still code duplicating the Modelica.Media.Water.IF97_Utilities.waterBaseProp_ph(evaporator.p, evaporator.h_v, 0, 0) call. Only these changes don't affect performance much, but adding a simplification for RSUB($cse10,rho) -> $cse10.rho does help slightly (~10% of total time); the only RSUB() expressions remaining then seem to be in initialization (and those only run once; although as I wrote above, those function calls do not become cse variables and could be slightly bad for performance).

I suppose pushing my changes would break the C++ code generator since we only used RSUB expressions in MetaModelica before, so I will fix that before making the PR.

comment:8 by Martin Sjölund, 7 years ago

A lot of time (~80%) is spent on the non-linear torn system, and the heaviest part of that is calls to things like:

(torn) der(evaporator.rho_v) := if Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).region3boundary then DIVISION(der(evaporator.p) - Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).pt * DIVISION(der(evaporator.p), Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).dpT), Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).pd) else (-Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).d ^ 2.0) * (Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).vp + DIVISION(Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).vt, Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).dpT)) * der(evaporator.p)

Yes, the same expression calls Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p) over and over again. And if I'm not mistaken, it could be calculated once before the NLS (because evaporator.p is known). But there is an if-expression involved and probably the CSE code is skipped for that (since if the call is in only one branch you shouldn't call it unless needed); but we could do the CSE if it is called in the condition or in both branches (and here both hold).

comment:9 by Martin Sjölund, 7 years ago

For some reason my changes cause inline during initialization to generate a broken dependency graph, and we get (residual) 1.0 - /*Real*/ volume1.medium.T = 0 where the iteration variable is volume1.medium.h which is simply wrong.

in reply to:  8 comment:10 by Francesco Casella, 7 years ago

Replying to sjoelund.se:

Yes, the same expression calls Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p) over and over again. And if I'm not mistaken, it could be calculated once before the NLS (because evaporator.p is known). But there is an if-expression involved and probably the CSE code is skipped for that (since if the call is in only one branch you shouldn't call it unless needed); but we could do the CSE if it is called in the condition or in both branches (and here both hold).

Sounds like a good idea.

comment:11 by Martin Sjölund, 7 years ago

So my PR broke some tests...

Part of the problem seems to be with index reduction where d/dtime of the following expression is wrong:

1.184307920059215e-05 * Buildings.Fluid.Interfaces.ConservationEquation$volDyn$dynBal.Medium.ThermodynamicState(volDyn.dynBal.medium.p, 273.15 + volDyn.dynBal.hOut / 1014.54 + -24.65170914897392, {0.01, 0.99}).p
1.184307920059215e-05 * Buildings.Fluid.Interfaces.ConservationEquation$volDyn$dynBal.Medium.ThermodynamicState(0.0, 0.0, 0.0, 0.0).p

The derivative of an array argument to a function is not: add N arguments and pass them to the function...

comment:12 by Martin Sjölund, 7 years ago

https://github.com/OpenModelica/OMCompiler/pull/2093 fixes the problem with differentiation changing {0,0} to 0,0, and also improves things by not needing to differentiate the record at all.

comment:13 by Martin Sjölund, 7 years ago

Summary: LateInline annotation is not accounted for in Modelica.MediaImprove LateInline of if-statements and records

comment:8 is considered in https://github.com/OpenModelica/OMCompiler/pull/2094. I guess @casella should compare the drum boiler using the next OM nightly build and Dymola.

comment:14 by Martin Sjölund, 7 years ago

Some fixed tests (+88, vs. previous -94).

Still broken, 6xThermoPower, and some performance tests that did not improve enough to trigger. Still good that only one library is failing for some tricky cases...

comment:15 by Martin Sjölund, 7 years ago

Seems the ThermoPower models were not quite that badly affected. Since before Christmas there is no change, so I probably improved and then broke them again.

What is affected is models like: Modelica.Magnetic.QuasiStatic.FundamentalWave.Examples.BasicMachines.SynchronousMachines.SMPM_CurrentSource (reference to negative alias remaining during code generation, and would give wrong result if we didn't now abort)

Performance halved in:

ModelicaTest.Fluid.TestComponents.Machines.TestWaterPumpDCMotor
ModelicaTest.Fluid.TestComponents.Machines.TestWaterPumpDCMotorHeatTransfer
ModelicaTest.Fluid.TestExamplesVariants.IncompressibleFluidNetwork_StandardWaterOnePhase

comment:16 by Francesco Casella, 7 years ago

Cc: Patrick Täuber added

I tried this out with v1.13.0-dev-329-gf16fbfd on Windows 64 bit. The simulation time is now 0.21 s on my PC, compared to 0.063 using Dymola, which is about 3X slower. Unfortunately I did not record what was the slowdown factor when I opened the ticket, but it was much larger, if I recall correctly.

The original issue, i.e., a nonlinear system involving rhovl_p(), seems to be solved. There is still on nonlinear system to compute der(evaporator.p) and der(evaporator.Vl) that involves calling tsat_der() and some other computations.

The simulation takes 346 steps with 71 numerical jacobian computation; about half of the simulation time is spent solving a nonlinear loop involving the computation of tsat_der (which Dymola doesn't have), more precisely 22% computing the function and 27% solving the loop, which calls the function 12494 times.

From what I understand, this is no longer an issue of inlining or CSE. Rather, the system involving der(evaporator.p) and der(evaporator.Vl) is indeed linear, and should be solved with a linear solver, not with an iterative nonlinear solver. This is what Dymola actually does. Maybe the conditional equations are fooling the tearing algorithm somehow. Solving this system as a linear one should bring the performance very close to Dymola's.

@ptaeuber, could you have a look? The model is always the DrumBoiler example from Modelica.Fluid

comment:17 by Vitalij Ruge, 7 years ago

The functionTree is missing for check linear or non-linear.
Related issues e.g. #3511

Last edited 7 years ago by Vitalij Ruge (previous) (diff)

comment:18 by Patrick Täuber, 7 years ago

The linearity of the system is determined by the jacobian, which is missing here.
Same issue as in #4374.

comment:19 by Francesco Casella, 7 years ago

Resolution: fixed
Status: acceptedclosed

OK, I just opened #4711 on this specific subject, as LateInline is no longer involved.

Note: See TracTickets for help on using tickets.