Opened 8 years ago
Last modified 3 years ago
#4270 reopened defect
Sorting of Eqns and Variable in Backend
Reported by: | Willi Braun | Owned by: | Willi Braun |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | Backend | Version: | |
Keywords: | Cc: | francois.beaude@…, Francesco Casella, Bernhard Bachmann, vruge, Lennart Ochel, Patrick Täuber |
Description
In the attached script the simulation result of the model A and model B is different due to a different order of components in model definition. This may be caused by the modules removeSimpleEquation or Tearing (or maybe something else ?), since this modules based on heuristics that have a different output on different input.
Also connected to #3397.
We already have a modules that can fix the issue in the attached model --preOptModules+=sortEqnsVars
. This pre-optimization module currently sorts the equations and variables by there amount of appearance in the adjacency matrix. This might be still not a fully deterministic behavior, so we probably need further criteria.
Should this modules become enabled by default?
Attachments (4)
Change History (21)
by , 8 years ago
comment:1 by , 8 years ago
Type: | defect → discussion |
---|
comment:2 by , 8 years ago
Type: | discussion → defect |
---|
This is also a defect! (see the attached test case)
follow-up: 7 comment:3 by , 8 years ago
Are you 100% sure that models A and B are exactly the same, except for the ordering?
That said, although there might be some inherent non-determinism in the success of the nonlinear solvers depending on the ordering, it seems reasonable to me to avoid a random behaviour of the tool from one version to the next by enforcing some fixed (though arbitrary) sorting of variables.
I would bear in mind two things:
- sorting takes time, so please make sure you use state-of-the-art sorting algorithms with O(N log(N)) complexity and efficient implementation
- I suspect that for large models, sorting can also heavily impact the simulation performance because of more or less cache misses when evaluating the residuals and the Jacobians. It would be good to study this phenomenon and introduce some good heuristics for the ordering, that maximize the exploitation of modern CPU pipelining and caching by the generated code
follow-up: 6 comment:4 by , 8 years ago
wbraun wrote:
Should this modules become enabled by default?
No, the module sortEqnsVars
shouldn't be enabled by default. Instead you should investigate the heuristics that depend on ordering, e.g. tearing. Based on this analysis it might be reasonable to apply a sorting to certain strongly connected components, but not to the entire system.
comment:5 by , 8 years ago
Cc: | added |
---|
comment:6 by , 8 years ago
Replying to lochel:
wbraun wrote:
Should this modules become enabled by default?
No, the module
sortEqnsVars
shouldn't be enabled by default. Instead you should investigate the heuristics that depend on ordering, e.g. tearing. Based on this analysis it might be reasonable to apply a sorting to certain strongly connected components
There a lot of issues to adapt the heuristic see e.g. lochel_Don't_apply_removeSimpleEquations_to_clocked partitions. Where it was not possible to extends the heuristic.
How do you plan to make the modul more powerful and keep maintainability?
but not to the entire system.
This is not correct. If the order has impact on removeSimpleEquation then it has possible impact of the loop-equation, too.
Francesco is right. We can get performance-issues and we should possible add limit for number of equations.
Additional deterministic behavior can give some speed up in developer process :)
comment:7 by , 8 years ago
Replying to casella:
Are you 100% sure that models A and B are exactly the same, except for the ordering?
Yes, I'm. The output of instantiateModel is beside the ordering 100% identical.
Replying to casella:
It would be good to study this phenomenon and introduce some good heuristics for the ordering, that maximize the exploitation of modern CPU pipelining and caching by the generated code.
For this kind of sorting it is needed to take the BLT sorting also into account, and this has an other purpose as the one suggested above, but you are right this would be also worth to investigate further.
follow-up: 9 comment:8 by , 8 years ago
I had a look into the models and found out that the different results are related to initialization (and not to any other heuristical module, in this case).
The underdetermined initialization problem leads to different equations for the two models, because of the different order:
iPSL_ATotal:
Warning: Assuming fixed start value for the following 6 variables: gover.imLeadLag.TF.x_scaled[1]:VARIABLE(protected = true ) "Scaled vector x" type: Real [1] MS.theta:VARIABLE(start = MS.init_theta ) type: Real MS.lambdaq2:VARIABLE(start = MS.init_lambdaq2 ) type: Real MS.lambdaq1:VARIABLE(start = MS.init_lambdaq1 ) type: Real MS.lambdad:VARIABLE(start = MS.init_lambdad ) type: Real MS.lambdaf:VARIABLE(start = MS.init_lambdaf ) type: Real
iPSL_BTotal (this model leads to the expected results):
Warning: Assuming fixed start value for the following 6 variables: MS.omega:VARIABLE(start = MS.init_omega ) type: Real MS.theta:VARIABLE(start = MS.init_theta ) type: Real MS.lambdaq2:VARIABLE(start = MS.init_lambdaq2 ) type: Real MS.lambdaq1:VARIABLE(start = MS.init_lambdaq1 ) type: Real MS.lambdad:VARIABLE(start = MS.init_lambdad ) type: Real MS.lambdaf:VARIABLE(start = MS.init_lambdaf ) type: Real
The problem is that there is no given start value for gover.imLeadLag.TF.x_scaled[1], so 0 is chosen by default.
Using the initial results from model B to simulate model A leads to the same results for the whole simulation.
Selecting a better start value for gover.imLeadLag.TF.x_scaled[1] (e.g. 0.1) also leads to the expected results.
So shouldn't there be a criteria to preferably choose variables with a given start value to create initial equations?
comment:9 by , 8 years ago
Replying to ptaeuber:
I had a look into the models and found out that the different results are related to initialization (and not to any other heuristical module, in this case).
Shame on me, I should have seen this from the very beginning :)
All models with not fully specified initial conditions (although legal in Modelica) are time bombs. I have encountered too many in my experience and wasted too much time before understanding that the root cause of the problem was not enough inititial equations.
I'll open a ticket on this topic.
comment:10 by , 8 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Patrick, thanks for the good investigation. I will improve the warnings according to Francesco's suggestion.
comment:11 by , 8 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
Not so fast, to consider this ticket as invalid is really a hasty conclusion!
The original test case show just an other heuristic in the backend, on that I also haven't thought in advanced. And in my option it's still a bug, if the user just swap a couple of components in the model definition and the system, starts or stops, simulating
Furthermore, consider the attached test-case "testMSL.mos". It contains two models, where RevolutePlanarLoopConstraintB
is actually ModelicaTest.MultiBody.Joints.RevolutePlanarLoopConstraint
and in RevolutePlanarLoopConstraintA
I've just swap some component definitions.
While RevolutePlanarLoopConstraintA
is start working, but ModelicaTest.MultiBody.Joints.RevolutePlanarLoopConstraint
fails in our coverage tests.
This case it is related to the tearing module as far as I see, or maybe to the stateSelection, which is an further source of non-deterministic behavior.
by , 8 years ago
Attachment: | testMSL.mos added |
---|
ModelicaTest.MultiBody.Joints.RevolutePlanarLoopConstraint
comment:12 by , 7 years ago
Milestone: | 1.12.0 → 1.13.0 |
---|
Milestone moved to 1.13.0 due to 1.12.0 already being released.
comment:14 by , 5 years ago
Milestone: | 1.14.0 → 1.16.0 |
---|
Releasing 1.14.0 which is stable and has many improvements w.r.t. 1.13.2. This issue is rescheduled to 1.16.0
comment:16 by , 4 years ago
Milestone: | 1.17.0 → 1.18.0 |
---|
Retargeted to 1.18.0 because of 1.17.0 timed release.
test sript