Opened 3 years ago

Last modified 2 years ago

#6196 assigned defect

Undetected high-index DAE or incorrectly handled overdetermined initialization in Buildings model (was: Bogus linear tearing in Buildings model)

Reported by: casella Owned by: Karim.Abdelhak
Priority: blocker Milestone: 1.19.0
Component: Backend Version:
Keywords: Cc: Karim.Abdelhak, mwetter

Description (last modified by casella)

Please check Buildings.Electrical.AC.ThreePhasesUnbalanced.Validation.IEEETests.Test4NodesFeeder.BalancedStepDown.YD. The model fails during lambda0 initialization with

Failed to solve linear system of equations (no. 569) at time 0.000000, system is singular for U[2, 2].
Matrix U
        -0 
Output vector x
[ 1]                    0
The default linear solver fails, the fallback solver with total pivoting is started at time 0.000000. That might raise performance issues, for more information use -lv LOG_LS.

The debugger shows that system 569 is torn in this way:

linear (torn), unknowns: 3, iteration variables: 1 (transformer.conv2.terminal_p.i[1])
(torn) transformer.conv3.terminal_p.i[1] := transformer.conv2.terminal_p.i[1] - line2.i3[1]
(torn) transformer.conv1.terminal_p.i[1] := transformer.conv2.terminal_p.i[1] + line2.i2[1]
(residual) line2.i1[1] + transformer.conv1.terminal_p.i[1] - transformer.conv3.terminal_p.i[1] = 0

It is immediate to see that the term transformer.conv2.terminal_p.i[1] appears both in transformer.conv3.terminal_p.i[1] and in transformer.conv1.terminal_p.i[1], so when those two are subtracted to compute the residual, its sensitivity w.r.t. its unknown is zero.

I don't know what criterion is used to perform linear tearing, but it's obviously not a good one. Linear systems should be solved with proper pivoting; solving some equations according only to structural information but disregarding the coefficient values can be very dangerous.

Until this issue is solved, I would recommend to disable linear tearing by default. This will sure enough solve a lot of problems in the testsuite. BTW, now that we have sparse solvers to deal with large linear systems, the usefulness of linear tearing is probably quite limited.

Change History (25)

comment:1 Changed 3 years ago by casella

  • Description modified (diff)

comment:2 Changed 3 years ago by casella

  • Milestone changed from 1.17.0 to 1.18.0

comment:3 follow-up: Changed 3 years ago by AnHeuermann

Until this issue is solved, I would recommend to disable linear tearing by default. This will sure enough solve a lot of problems in the testsuite. BTW, now that we have sparse solvers to deal with large linear systems, the usefulness of linear tearing is probably quite limited.

I totally agree with you. I would even go a step further and don't spend time to improve linear tearing, but instead invest the time in sparse and parallelized (?) linear solvers to work better in the C and/or Cpp runtime.

Also linear tearing was disabled for a long time due to performance reasons. I'm not sure if anybody did a recent benchmark to compare linear tearing + dense linear solver against no linear tearing + sparse linear solver.
Maybe a nice task for a Bachelor thesis or student project?

comment:4 in reply to: ↑ 3 Changed 3 years ago by casella

Replying to AnHeuermann:

Until this issue is solved, I would recommend to disable linear tearing by default. This will sure enough solve a lot of problems in the testsuite. BTW, now that we have sparse solvers to deal with large linear systems, the usefulness of linear tearing is probably quite limited.

I totally agree with you.

Can you make the required PR so we can have some test results before feature freeze? Mabye wait a few days before merging it, since we are currently trying out moving to NF by default, I'd try not to mix up the results.

I would even go a step further and don't spend time to improve linear tearing, but instead invest the time in sparse and parallelized (?) linear solvers to work better in the C and/or Cpp runtime.

Possibly. I guess using parallel solvers only makes sense for really big systems. How big I'm not sure, maybe you really need tens or hundreds of thousands variables to really see a difference.

Also linear tearing was disabled for a long time due to performance reasons.

I'm not sure what you mean here. We have --maxSizeLinearTearing = 200 by default, with the limit set because of performance. Do you mean it was set to zero and later increased to 200?

Anyway, I would just set --maxSizeLinearTearing = 0 by default. Next week please, to avoid confusion with test results.

I'm not sure if anybody did a recent benchmark to compare linear tearing + dense linear solver against no linear tearing + sparse linear solver.

I'm not aware of any.

Maybe a nice task for a Bachelor thesis or student project?

Why not? We have more pressing priorities, but this is an interesting subject anyway. The only problem I see is to have enough good test cases. One could use some test cases from the ScalableTestSuite library, but maybe they are somewhat artificial.

comment:5 follow-up: Changed 3 years ago by AnHeuermann

Anyway, I would just set --maxSizeLinearTearing = 0 by default. Next week please, to avoid confusion with test results.

I think it was disabled slightly different. I would need to search for the exact commit. But yeah, disabling linear tearing for default simulation is no problem.
I'll create a PR in the next days.

comment:6 in reply to: ↑ 5 Changed 3 years ago by casella

Replying to AnHeuermann:

I think it was disabled slightly different.

Yes, by using --disableLinearTearing. But that is said to be deprecated, so we should set the maximum size to zero instead :)

I'll create a PR in the next days.

Sounds good. You can actually create it right away, only put a comment to Adrian to wait pushing it until a moment when we are not playing around with the default NF or other major commits.

comment:7 Changed 3 years ago by casella

A fix is pending in PR 6969, waiting to be merged onto master.

comment:8 Changed 3 years ago by AnHeuermann

  • Cc Karim.Abdelhak added; AnHeuermann removed
  • Owner changed from Karim.Abdelhak to AnHeuermann
  • Status changed from new to assigned

comment:9 Changed 3 years ago by casella

  • Milestone changed from 1.18.0 to 1.17.0

Fine to have this in 1.17.0, if possible.

comment:10 Changed 3 years ago by casella

  • Milestone changed from 1.17.0 to 1.18.0

Rescheduled to 1.18.0

comment:11 Changed 3 years ago by casella

  • Description modified (diff)

comment:12 Changed 3 years ago by casella

@AnHeuermann, maybe for the time being we can try to make strict tearing the default choice, so it avoids the divisions by zero that we are experiencing.

You can try a PR and we see what happens

Last edited 3 years ago by casella (previous) (diff)

comment:13 Changed 3 years ago by casella

  • Cc mwetter added
  • Resolution set to fixed
  • Status changed from assigned to closed

comment:14 Changed 3 years ago by casella

  • Resolution fixed deleted
  • Status changed from closed to reopened

Sorry, I just wanted to add @mwetter in cc:, the issue is still open

comment:15 Changed 3 years ago by casella

  • Owner changed from AnHeuermann to Karim.Abdelhak
  • Status changed from reopened to assigned

Following our discussion at the devmeeting today, I double-checked the documentation of strict tearing, and it says

(Robust tearing rules by consideration of the partial derivative. Allows to divide by parameters that are not equal to or close to zero.)

There are indeed some cases, where equations are solved by dividing by parameters that have a default value of zero at compile time. See, e.g., #6232

However, the case reported here is different. In this case, there is no division by zero _in the torn equations_. What happens instead is that _the residual equations become singular_ because of a bad choice of tearing variables. I guess this may be more difficult to detect.

Since the singularity is structural, it could be detected at compile-time and invalidate the proposed tearing that lead to it, triggering a second (or a third) try.

What do you think?

Last edited 3 years ago by casella (previous) (diff)

comment:16 Changed 3 years ago by casella

@Karim, I looked again at this ticket. The original model that was reported with this issue, Buildings.Electrical.AC.ThreePhasesUnbalanced.Validation.IEEETests.Test4NodesFeeder.BalancedStepDown.YD, now fails during the backend with this error:

Warning: The linear system: 
1 : line2.v1_n[1] - line2.v1_p[1] = line2.Z11[1] * line2.i1[1] - line2.Z11[2] * line2.i1[2] + line2.Z12[1] * line2.i2[1] - line2.Z12[2] * line2.i2[2] + line2.Z13[1] * line2.i3[1] - line2.Z13[2] * line2.i3[2]
2 : loadRL.load1.v[1] = line2.v1_p[1] - line2.v2_p[1]
3 : line2.v2_n[1] - line2.v2_p[1] = line2.Z21[1] * line2.i1[1] - line2.Z21[2] * line2.i1[2] + line2.Z22[1] * line2.i2[1] - line2.Z22[2] * line2.i2[2] + line2.Z23[1] * line2.i3[1] - line2.Z23[2] * line2.i3[2]
4 : transformer.conv1.V2[1] = line2.v1_n[1] - line2.v2_n[1]
[
  1.0 , 0.0 , 0.0 , -1.0 ;
  0.0 , 0.0 , 1.0 , -1.0 ;
  0.0 , 1.0 , -1.0 , 0.0 ;
  -1.0 , 1.0 , 0.0 , 0.0
]
  *
[
  line2.v1_n[1] ;
  line2.v2_n[1] ;
  line2.v2_p[1] ;
  line2.v1_p[1]
]
  =
[
  line2.Z11[1] * line2.i1[1] + line2.Z13[1] * line2.i3[1] + line2.Z12[1] * line2.i2[1] - line2.Z11[2] * line2.i1[2] - line2.Z13[2] * line2.i3[2] - line2.Z12[2] * line2.i2[2] ;
  -loadRL.load1.v[1] ;
  line2.Z21[1] * line2.i1[1] + line2.Z23[1] * line2.i3[1] + line2.Z22[1] * line2.i2[1] - line2.Z21[2] * line2.i1[2] - line2.Z23[2] * line2.i3[2] - line2.Z22[2] * line2.i2[2] ;
  -transformer.conv1.V2[1]
]
 might be structurally or numerically singular for variable line2.v1_p[1] since 
 U(4,4) = 0.0. It might be hard to solve. Compilation continues anyway.

Indeed, if you sum the four columns of the A matrix, you get zero, i.e. the system is really singular. I guess this happens because the high-index nature of the system is not catched early on by resolveLoops and ASCC. What do you think?

Last edited 3 years ago by casella (previous) (diff)

comment:17 Changed 3 years ago by casella

Maybe the problem was not bogus tearing, but actually singular systems because of missing index reduction?

comment:18 Changed 3 years ago by casella

BTW, I'm not sure why the behaviour changed, since I understand no PR has been merged so far regarding this issue.

comment:19 Changed 3 years ago by casella

  • Milestone 1.18.0 deleted

Ticket retargeted after milestone closed

comment:20 Changed 3 years ago by casella

  • Milestone set to 1.19.0

1.18.0 blocker tickets moved to 1.19.0

comment:21 Changed 3 years ago by casella

  • Summary changed from Bogus linear tearing in Buildings model to Undetected high-index DAE in Buildings model (was: Bogus linear tearing in Buildings model)

comment:22 Changed 3 years ago by casella

@Karim, this looks like a case where neither the ASSC nor the reshuffleLoops algorithm can figure out the hidden constraint between the three-phase currents, hence a singularity when trying to compute the voltages ensues.

comment:23 Changed 3 years ago by casella

@Karim reports there are two possible causes of this issue: either the singularity is not caught by Pantelides/ASSC/Reshuffle loops algorithms, or the initialization problem is still singular for some reason.

In any case, this looks like an individual fringe case, which is best solved in the new backend - it is probably not worth the effort to try to fix this in the old backend, which is going to be retired in one-two years from now.

comment:24 Changed 3 years ago by casella

  • Summary changed from Undetected high-index DAE in Buildings model (was: Bogus linear tearing in Buildings model) to Undetected high-index DAE or incorrectly handled overdetermined initialization in Buildings model (was: Bogus linear tearing in Buildings model)

comment:25 Changed 2 years ago by casella

The discussion continues in the corresponding GitHub ticket.

Last edited 2 years ago by casella (previous) (diff)
Note: See TracTickets for help on using tickets.