Opened 7 years ago

Closed 7 years ago

#4423 closed defect (fixed)

Improve LateInline of if-statements and records

Reported by: casella Owned by: ptaeuber
Priority: high Milestone: 1.13.0
Component: Backend Version:
Keywords: Cc: ptaeuber

Description

The simulation performance of Modelica.Fluid.Examples.DrumBoiler.DrumBoiler is substantially slower than with Dymola.

Some investigation reveals that in OMC generated code there are still some references to the rhovl_p() function of the IF97 medium model, which entails the need of solving a nonlinear system with iterations. That function has a lateInline=true annotation; if that is correctly taken into account, the system becomes linear and the performance much faster as a consequence, as there is no longer any need for iterations.

Why isn't that function inlined?

Change History (19)

comment:1 Changed 7 years ago by ptaeuber

  • Status changed from new to accepted

comment:2 follow-up: Changed 7 years ago by ptaeuber

It seems that the implementation of inlining functions is not finished in all detail yet. Some investigations showed that it is not possible to inline even much simpler functions. Also the new inline method which does not substitute the function but creates new variables which are added to the equation system (--inlineMethod=append) is not able to handle more difficult cases. One problem seems to be the handling of records.
By the way, Dymola does not (late-)inline the rhovl_p() function as well. Only the derivative function rhovl_p_der() is inlined which appears in the loop.

comment:3 in reply to: ↑ 2 Changed 7 years ago by casella

Replying to ptaeuber:

By the way, Dymola does not (late-)inline the rhovl_p() function as well. Only the derivative function rhovl_p_der() is inlined which appears in the loop.

You are right, I wrote the ticket in a haste. Do you think handling that specific case would be doable?

comment:4 Changed 7 years ago by casella

  • Milestone changed from 1.12.0 to 1.13.0

Milestone changed to 1.13.0 since 1.12.0 was released

comment:5 Changed 7 years ago by sjoelund.se

The code for that function is:

if bpro.region3boundary then
  h_der := ((bpro.d*bpro.pd - bpro.T*bpro.pt)*p_der + (bpro.T*bpro.pt*
    bpro.pt + bpro.d*bpro.d*bpro.pd*bpro.cv)/bpro.dpT*p_der)/(bpro.pd*
    bpro.d*bpro.d);
else
  h_der := (1/bpro.d - bpro.T*bpro.vt)*p_der + bpro.cp/bpro.dpT*p_der;
end if;

But we don't support inline of if-statements; only if-expressions such as:

h_der := if bpro.region3boundary then
  then ((bpro.d*bpro.pd - bpro.T*bpro.pt)*p_der + (bpro.T*bpro.pt*bpro.pt + bpro.d*bpro.d*bpro.pd*bpro.cv)/bpro.dpT*p_der)/(bpro.pd*bpro.d*bpro.d)
  else (1/bpro.d - bpro.T*bpro.vt)*p_der + bpro.cp/bpro.dpT*p_der;

If MSL was updated, we would inline the code. I'll see if I can detect this pattern in the inline module.

comment:6 Changed 7 years ago by sjoelund.se

PR2076 handles if-statements to if-expressions. But it doesn't seem to make much difference for the DrumBoiler model because it only handles half the calls to rhovl_p_der.

Fail:

Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.rhovl_p_der(evaporator.p, Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.boilingcurve_p(evaporator.p), der(evaporator.p))

OK:

Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.rhovl_p_der(evaporator.p, $cse1, $DER.evaporator.p) => if noEvent($cse1.region3boundary) then ($DER.evaporator.p - $cse1.pt * $DER.evaporator.p / $cse1.dpT) / $cse1.pd else (-$cse1.d ^ 2.0) * ($cse1.vp + $cse1.vt / $cse1.dpT) * $DER.evaporator.p

Probably because we don't support DAE.RSUB of the call when inlining. If we do add support for it, the CSE stuff would need to handle it because otherwise we make duplicate calls to boilingcurve_p.

comment:7 Changed 7 years ago by sjoelund.se

So I have some local code that will inline things like:

SteamValve._state_a._d = Modelica.Media.Water.IF97_Utilities.rho_props_ph(evaporator.p, evaporator.h_v, Modelica.Media.Water.IF97_Utilities.waterBaseProp_ph(evaporator.p, evaporator.h_v, 0, 0))
der(evaporator._rho_v) = Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.rhovl_p_der(evaporator.p, $cse1, der(evaporator.p))

Into:

SteamValve._state_a._d = Modelica.Media.Water.IF97_Utilities.waterBaseProp_ph(evaporator.p, evaporator.h_v, 0, 0).rho
der(evaporator._rho_v) = if $cse1.region3boundary then DIVISION(der(evaporator.p) - $cse1.pt * DIVISION(der(evaporator.p), $cse1.dpT), $cse1.pd) else (-$cse1.d ^ 2.0) * ($cse1.vp + DIVISION($cse1.vt, $cse1.dpT)) * der(evaporator.p)

I suppose it's a slight improvement. But there is still code duplicating the Modelica.Media.Water.IF97_Utilities.waterBaseProp_ph(evaporator.p, evaporator.h_v, 0, 0) call. Only these changes don't affect performance much, but adding a simplification for RSUB($cse10,rho) -> $cse10.rho does help slightly (~10% of total time); the only RSUB() expressions remaining then seem to be in initialization (and those only run once; although as I wrote above, those function calls do not become cse variables and could be slightly bad for performance).

I suppose pushing my changes would break the C++ code generator since we only used RSUB expressions in MetaModelica before, so I will fix that before making the PR.

comment:8 follow-up: Changed 7 years ago by sjoelund.se

A lot of time (~80%) is spent on the non-linear torn system, and the heaviest part of that is calls to things like:

(torn) der(evaporator.rho_v) := if Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).region3boundary then DIVISION(der(evaporator.p) - Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).pt * DIVISION(der(evaporator.p), Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).dpT), Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).pd) else (-Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).d ^ 2.0) * (Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).vp + DIVISION(Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).vt, Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p).dpT)) * der(evaporator.p)

Yes, the same expression calls Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p) over and over again. And if I'm not mistaken, it could be calculated once before the NLS (because evaporator.p is known). But there is an if-expression involved and probably the CSE code is skipped for that (since if the call is in only one branch you shouldn't call it unless needed); but we could do the CSE if it is called in the condition or in both branches (and here both hold).

comment:9 Changed 7 years ago by sjoelund.se

For some reason my changes cause inline during initialization to generate a broken dependency graph, and we get (residual) 1.0 - /*Real*/ volume1.medium.T = 0 where the iteration variable is volume1.medium.h which is simply wrong.

comment:10 in reply to: ↑ 8 Changed 7 years ago by casella

Replying to sjoelund.se:

Yes, the same expression calls Modelica.Media.Water.IF97_Utilities.BaseIF97.Regions.dewcurve_p(evaporator.p) over and over again. And if I'm not mistaken, it could be calculated once before the NLS (because evaporator.p is known). But there is an if-expression involved and probably the CSE code is skipped for that (since if the call is in only one branch you shouldn't call it unless needed); but we could do the CSE if it is called in the condition or in both branches (and here both hold).

Sounds like a good idea.

comment:11 Changed 7 years ago by sjoelund.se

So my PR broke some tests...

Part of the problem seems to be with index reduction where d/dtime of the following expression is wrong:

1.184307920059215e-05 * Buildings.Fluid.Interfaces.ConservationEquation$volDyn$dynBal.Medium.ThermodynamicState(volDyn.dynBal.medium.p, 273.15 + volDyn.dynBal.hOut / 1014.54 + -24.65170914897392, {0.01, 0.99}).p
1.184307920059215e-05 * Buildings.Fluid.Interfaces.ConservationEquation$volDyn$dynBal.Medium.ThermodynamicState(0.0, 0.0, 0.0, 0.0).p

The derivative of an array argument to a function is not: add N arguments and pass them to the function...

comment:12 Changed 7 years ago by sjoelund.se

https://github.com/OpenModelica/OMCompiler/pull/2093 fixes the problem with differentiation changing {0,0} to 0,0, and also improves things by not needing to differentiate the record at all.

comment:13 Changed 7 years ago by sjoelund.se

  • Summary changed from LateInline annotation is not accounted for in Modelica.Media to Improve LateInline of if-statements and records

comment:8 is considered in https://github.com/OpenModelica/OMCompiler/pull/2094. I guess @casella should compare the drum boiler using the next OM nightly build and Dymola.

comment:14 Changed 7 years ago by sjoelund.se

Some fixed tests (+88, vs. previous -94).

Still broken, 6xThermoPower, and some performance tests that did not improve enough to trigger. Still good that only one library is failing for some tricky cases...

comment:15 Changed 7 years ago by sjoelund.se

Seems the ThermoPower models were not quite that badly affected. Since before Christmas there is no change, so I probably improved and then broke them again.

What is affected is models like: Modelica.Magnetic.QuasiStatic.FundamentalWave.Examples.BasicMachines.SynchronousMachines.SMPM_CurrentSource (reference to negative alias remaining during code generation, and would give wrong result if we didn't now abort)

Performance halved in:

ModelicaTest.Fluid.TestComponents.Machines.TestWaterPumpDCMotor
ModelicaTest.Fluid.TestComponents.Machines.TestWaterPumpDCMotorHeatTransfer
ModelicaTest.Fluid.TestExamplesVariants.IncompressibleFluidNetwork_StandardWaterOnePhase

comment:16 Changed 7 years ago by casella

  • Cc ptaeuber added

I tried this out with v1.13.0-dev-329-gf16fbfd on Windows 64 bit. The simulation time is now 0.21 s on my PC, compared to 0.063 using Dymola, which is about 3X slower. Unfortunately I did not record what was the slowdown factor when I opened the ticket, but it was much larger, if I recall correctly.

The original issue, i.e., a nonlinear system involving rhovl_p(), seems to be solved. There is still on nonlinear system to compute der(evaporator.p) and der(evaporator.Vl) that involves calling tsat_der() and some other computations.

The simulation takes 346 steps with 71 numerical jacobian computation; about half of the simulation time is spent solving a nonlinear loop involving the computation of tsat_der (which Dymola doesn't have), more precisely 22% computing the function and 27% solving the loop, which calls the function 12494 times.

From what I understand, this is no longer an issue of inlining or CSE. Rather, the system involving der(evaporator.p) and der(evaporator.Vl) is indeed linear, and should be solved with a linear solver, not with an iterative nonlinear solver. This is what Dymola actually does. Maybe the conditional equations are fooling the tearing algorithm somehow. Solving this system as a linear one should bring the performance very close to Dymola's.

@ptaeuber, could you have a look? The model is always the DrumBoiler example from Modelica.Fluid

comment:17 Changed 7 years ago by vitalij

The functionTree is missing for check linear or non-linear.
Related issues e.g. #3511

Last edited 7 years ago by vitalij (previous) (diff)

comment:18 Changed 7 years ago by ptaeuber

The linearity of the system is determined by the jacobian, which is missing here.
Same issue as in #4374.

comment:19 Changed 7 years ago by casella

  • Resolution set to fixed
  • Status changed from accepted to closed

OK, I just opened #4711 on this specific subject, as LateInline is no longer involved.

Note: See TracTickets for help on using tickets.