Opened 8 years ago
Closed 7 years ago
#3816 closed defect (fixed)
Matching and sorting scales as O(N^3) in models with large numbers of discrete variables
Reported by: | casella | Owned by: | ptaeuber |
---|---|---|---|
Priority: | critical | Milestone: | 1.12.0 |
Component: | Backend | Version: | |
Keywords: | Cc: |
Description
Consider the models BreakerNetwork_N_XX in the ScalableTestSuite test results. These are the times spent by prepare postOptimizeDAE
N | time |
---|---|
160 | 0.514 |
320 | 4.7 |
640 | 49 |
1280 | 436 |
which is O(N3).
Note that in other similar large models, but lacking discrete variables, the scaling is roughly linear, see, e.g., DistributionSystemModelica_N_XX_M_XX
NxM | time |
---|---|
196 | 0.138 |
400 | 0.652 |
784 | 0.414 |
1600 | 0.883 |
Change History (13)
comment:1 Changed 8 years ago by casella
- Component changed from *unknown* to Backend
- Milestone changed from Future to 1.10.0
- Owner changed from somebody to lochel
- Priority changed from high to critical
comment:2 follow-up: ↓ 3 Changed 8 years ago by hkiel
comment:3 in reply to: ↑ 2 Changed 8 years ago by casella
Replying to hkiel:
The culprit is mapSortEqnsDAE (Tarjan's algorithm)
I understand that the complexity of Tarjan's algorithm is linear with the number of nodes and edges in the graph, so this should scale as O(N). I guess there's something very inefficient in the algorithm's implementation.
comment:4 Changed 8 years ago by casella
In the meantime the performance reported by execstat on Hudson has improved by a constant factor around 2.5 w.r.t. the values reported when I opened this ticket, but it is still O(N3).
comment:5 Changed 7 years ago by sjoelund.se
- Milestone changed from 1.10.0 to 1.11.0
Ticket retargeted after milestone closed
comment:6 in reply to: ↑ description Changed 7 years ago by casella
Status update, with results from libraries.openmodelica.org
N | time |
---|---|
160 | 0.4 |
320 | 3.6 |
640 | 27.9 |
1280 | 237.3 |
which is still O(N3).
comment:7 Changed 7 years ago by sjoelund.se
- Milestone changed from 1.11.0 to 1.12.0
Milestone moved to 1.12.0 due to 1.11.0 already being released.
comment:8 Changed 7 years ago by casella
- Owner changed from lochel to ptaeuber
- Status changed from new to assigned
@ptaeuber, could you have a look at this? I noticed that (according to @hkiel's comment), this large amount of time is also spent by matching and sorting.
Maybe there is something wrong (or grossly naive) in how we deal with the sorting of hybrid systems? This phenomenon only takes place when there are many events and discrete variables involved.
Thanks!
comment:9 Changed 7 years ago by casella
- Summary changed from Prepare PostOptimizeDAE scales as O(N^3) in models with large numbers of discrete variables to Matching and sorting scales as O(N^3) in models with large numbers of discrete variables
BTW, the BreakerNetwork_N_XX models show similar times also for tearingSystem. I guess the root cause is the same, please have a look at that also.
I am changing the summary to reflect more precisely what the problem is.
comment:10 Changed 7 years ago by ptaeuber
- Status changed from assigned to accepted
comment:11 Changed 7 years ago by hkiel
Most of the time is spent in Differentiate.differentiateExp() for big models:
N | time tearing | time tearing w/o LC | ||
---|---|---|---|---|
(init) | (sim) | (init) | (sim) | |
320 | 3.86 | 4.04 | 1.44 | 1.41 |
640 | 33.7 | 33.3 | 7.97 | 7.27 |
1280 | 281.6 | 275.9 | 46.85 | 42.3 |
Most of the time is spent for loop checking in the differentiation. So with removed loop check (w/o LC) times are very much lower. See #4454. I would go for removing the check completely as proposed in PR1698.
comment:12 Changed 7 years ago by ptaeuber
Since removing the loop check much less time is spent for prepare postOptimizeDAE (which is actually the matching and sorting of the initial system) as well:
N | time (prepare postOptimizeDAE) |
320 | 0.226 |
640 | 0.6898 |
1280 | 2.188 |
comment:13 Changed 7 years ago by casella
- Resolution set to fixed
- Status changed from accepted to closed
Excellent result, thanks for your efforts! A two orders of magnitude improvement on the largest model sounds really impressive :)
I guess I can now close this ticket for good.
The culprit is mapSortEqnsDAE (Tarjan's algorithm)
It would be worth looking at that, because I see it called 27 times during simulation where three calls have considerable running times (one is in the prepare postOptimizeDAE phase)