Opened 9 years ago

Last modified 6 years ago

#3678 new enhancement

Efficient flattening (and code generation) for large-scale network models

Reported by: Francesco Casella
Owned by: Per Östlund
Priority: high
Milestone: Future
Component: *unknown*
Version:
Keywords:
Cc: Andrea Bartolini

Description

I have tried to capture in a representative model the features of large-scale power system models that currently stress the compiler's performance in terms of code generation time.

Please have a look at the attached test package.

The basic model ResistorSource is a resistor with a voltage-controlled source in series, whose voltage is determined by a sub-model of type FirstOrder, containing a first-order linear system. The forcing signal of the first-order system is bound to zero by default, but it can be changed with a binding equation when instantiating the ResistorSource model. Many basic models are instantiated and connected together; some have a modifier on the binding equation, some don't.
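For readers without the attachment, here is a minimal sketch of what the two basic models might look like. The actual definitions are in the attached TestCaching.mo; the internal variable names, the time constant and the use of the MSL OnePort interface are guesses for illustration only.

  model FirstOrder "First-order linear system driving the source voltage"
    parameter Real T = 0.1 "Time constant";
    input Real u = 0 "Forcing signal, bound to zero by default";
    output Real y(start = 0, fixed = true) "State, used as the source voltage";
  equation
    T*der(y) + y = u;
  end FirstOrder;

  model ResistorSource "Resistor with a voltage-controlled source in series"
    extends Modelica.Electrical.Analog.Interfaces.OnePort; // provides pins p, n and variables v, i
    parameter Real R = 1 "Resistance";
    input Real u = 0 "Forcing signal, forwarded to the first-order sub-model";
    FirstOrder firstOrder(u = u) "Determines the source voltage";
  equation
    v = R*i + firstOrder.y; // Ohm's law plus the controlled source in series
  end ResistorSource;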

SystemSmall shows a simple example, while GenerateSystemLarge automatically generates the Modelica source code for systems of arbitrary size. With the default parameters, the large system has 40000 equations, and the front-end takes 43 seconds to flatten it on my PC. If you want to go up to the scale we need, you can multiply the size by a factor of 10-20; though, to use Per's words, life's too short to study that case with the current compiler :)

We need to show that there are strategies that can be implemented in OMC in order to shorten this processing time drastically.

Regarding the front-end, I understand from Adrian that it would be possible to use some caching strategy, to avoid re-doing all the lookup again and again for each instance. Basically, the first time the compiler encounters this declaration

  TestCaching.ResistorSource rs_1(R = 1, u = sin(time));

it will do all the lookup and flattening for the ResistorSource class with a time-varying binding equation on u and a constant binding equation on R. This structure could be labelled as a specific type. Then, the next time the compiler encounters another declaration with exactly the same type (except for the numerical values of parameters!), it could avoid re-doing the flattening and instantiation, and just use the cached results.
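For example, with a hypothetical second instance rs_2 (not part of the attached package), the two declarations share the same modifier structure and could reuse the same cached flat type:

  TestCaching.ResistorSource rs_1(R = 1, u = sin(time));
  TestCaching.ResistorSource rs_2(R = 2, u = sin(time)); // same structure: constant binding on R, time-varying binding on u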

Would it be possible to come up with a prototype in the front-end that can use this strategy to process the SystemLarge model? It would then be very interesting to compare its performance with the currently available one, and understand if this kind of strategy pays off.

A second, further stage could involve the back-end. Assuming we use a native DAE solver (IDA/KINSOL), we could somehow avoid actually flattening all the individual instances in the front-end, and instead pass to the back-end the collected types and just pointers to the various instances. Code to compute the DAE residuals could be generated only once for each type, and then just be called many times, once for each instance. This could save a tremendous amount of time and space, which is currently spent generating the same code N times and then compiling it to executable form.

Note that if we use a DAE integrator and a priori knowledge about the system, we could skip most of the current back-end stages: matching and sorting, index reduction, etc.

I understand this second step is a lot more far-fetched, but would it be possible to come up with a prototype that could run on this demo example, to gauge the performance improvements?

Attachments (1)

TestCaching.mo (2.5 KB) - added by Francesco Casella 9 years ago.


Change History (8)

by Francesco Casella, 9 years ago

Attachment: TestCaching.mo added

in reply to: description; comment:1 by Per Östlund, 9 years ago

Replying to casella:

Regarding the front-end, I understand from Adrian that it would be possible to use some caching strategy, to avoid re-doing all the lookup again and again for each instance. Basically, the first time the compiler encounters this declaration

  TestCaching.ResistorSource rs_1(R = 1, u = sin(time));

it will do all the lookup and flattening for the ResistorSource class with a time-varying binding equation on u and a constant binding equation on R. This structure could be labelled as a specific type. Then, the next time the compiler encounters another declaration with exactly the same type (except for the numerical values of parameters!), it could avoid re-doing the flattening and instantiation, and just use the cached results.

I've thought about such things many times, but it would be very hard to get it right. You'd need to apply the modifiers on the cached instance, which would involve replacing e.g. any occurrence of R with 1. But the current front end is very eager to constant evaluate anything it can (and it's very hard to change that), and you can't replace R if it's already been replaced with a value. And if you have redeclares it becomes even harder. So in the general case it's pretty much impossible, but it might be possible to make something that works within the constraints of your particular models.
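To make the problem concrete, here is a simplified, made-up illustration (not actual compiler output) of why eager constant evaluation blocks the reuse of a cached instance:

  // Flat equations of a cached ResistorSource instance, parameter kept symbolic:
  rs.v = rs.R*rs.i + rs.firstOrder.y;
  // After the front end eagerly evaluates the modifier R = 1, the cache only holds:
  rs.v = rs.i + rs.firstOrder.y;
  // A later instance with the modifier R = 2 has nothing left to substitute into.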

comment:2 by Francesco Casella, 9 years ago

I understand that this cannot work in general once you start having redeclares and complex, nested structures, but I think we can exploit the fact that these models, though large, have a simple and regular structure. Other models share this property, e.g. in the field of building thermal simulation, another case where old-fashioned software still rules: Modelica is currently being considered as a replacement, but slow code generation is scaring people off. This is not a niche market: the DoE estimated that the total turnaround related to EnergyPlus over its life span was around one billion dollars (!). Imagine what it would mean for us if Modelica technology were chosen to be the foundation of the next-generation EnergyPlus.

We could even think of introducing some annotations, e.g. to avoid constant evaluation or to trigger appropriate caching. I don't see anything wrong with this, as the annotations would not really change the semantics, but just give hints to the compiler to get better performance. Of course, if you want to handle behemoth-sized models, you might need to pay some price.

The alternative is something extremely ugly, which is currently being considered by many: use OMC to generate code to compute the residuals of individual classes, and then write ad-hoc C code that calls this code many times, allocates memory, stitches all this information together, assembles the Jacobians and residuals, and calls a DAE solver. This means that, all of a sudden, we're back to the early '90s (or worse), with people writing procedural code to solve their very specific problem, instead of trying to do this once and for all in general-purpose OO modelling tools.

Rather than going that way, I think it would be much better to embed some problem-specific optimizations in the compiler, which could be useful for different classes of problems anyway, and take advantage of the high-level modelling approach.

comment:3 by Francesco Casella, 8 years ago

Adrian/Per: will the upcoming new front-end address (at least partially) this issue?

in reply to: comment:3; comment:4 by Per Östlund, 8 years ago

Replying to casella:

Adrian/Per: will the upcoming new front-end address (at least partially) this issue?

Yes, partially at least. The new instantiation tries to do things only once where possible: it partially instantiates each class only once and reuses that result when doing the full instantiation. So there's some caching built into the instantiation itself.

It does not, however, fully instantiate classes and then try to apply modifiers to them, since that would be far too complicated at this stage.

comment:5 by Francesco Casella, 8 years ago

The test models in the ScalableTestSuite.Electrical.DistributionSystemAC package can be used to test this concept on a somewhat realistic use case.

Models marked with the __OpenModelica_lateFlatten = true annotation are eventually instantiated many times, with changes only to parameter values. There are several things that can be done with them (a sketch of such a marked model follows the list below).

  1. Identify all the instances of each of those classes and only actually instantiate each class once.
  2. Run symbolic optimizations (e.g. alias elimination, CSE) on this common virtual instance, before actually instantiating it many times.
  3. Assuming the model is index 1 and a DAE solver is used, one could actually skip the phase where this model is instantiated multiple times (which eventually results in C code with thousands of repetitions). Instead, one could generate the C code for the virtual instance only once, and then call it N times in the runtime to compute the residuals of the DAEs of all its actual instances. This would basically end up with back-end and code-generation phases that scale as O(1), which would be great for very large index-1 models.
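For reference, this is roughly how such a marked model could look. Only the annotation name comes from the ScalableTestSuite models mentioned above; the class name and its body are made up, and the exact placement of the annotation should be checked against those models:

  model LoadModel "Component instantiated many times with different parameter values"
    parameter Real P = 1 "Rated power";
    Real v(start = 1, fixed = true) "Internal state";
  equation
    der(v) = P - v;
    annotation(__OpenModelica_lateFlatten = true);
  end LoadModel;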

comment:6 by Francesco Casella, 8 years ago

Dirk Zimmer just pointed me to a paper he presented at the Modelica Conference in Como on this topic.

Although most of the contribution is devoted to causalized systems and to automatic partitioning (which both go beyond what I am proposing here), I think it is an interesting and absolutely relevant read, worth considering before we actually start this work.

in reply to: comment:1; comment:7 by Francesco Casella, 6 years ago

Replying to perost:

I've thought about such things many times, but it would be very hard to get it right. You'd need to apply the modifiers on the cached instance, which would involve replacing e.g. any occurrence of R with 1. But the current front end is very eager to constant evaluate anything it can (and it's very hard to change that)

I understand the new front end no longer follows this approach.

and you can't replace R if it's already been replaced with a value. And if you have redeclares it becomes even harder.

We may limit the approach to models without redeclares. This already gives a lot of headroom for practical applications.
