Opened 4 years ago

Last modified 3 years ago

#6196 assigned defect

Undetected high-index DAE or incorrectly handled overdetermined initialization in Buildings model (was: Bogus linear tearing in Buildings model)

Reported by: Francesco Casella Owned by: Karim Adbdelhak
Priority: blocker Milestone: 1.19.0
Component: Backend Version:
Keywords: Cc: Karim Adbdelhak, Michael Wetter

Description (last modified by Francesco Casella)

Please check Buildings.Electrical.AC.ThreePhasesUnbalanced.Validation.IEEETests.Test4NodesFeeder.BalancedStepDown.YD. The model fails during lambda0 initialization with

Failed to solve linear system of equations (no. 569) at time 0.000000, system is singular for U[2, 2].
Matrix U
        -0 
Output vector x
[ 1]                    0
The default linear solver fails, the fallback solver with total pivoting is started at time 0.000000. That might raise performance issues, for more information use -lv LOG_LS.

The debugger shows that system 569 is torn in this way:

linear (torn), unknowns: 3, iteration variables: 1 (transformer.conv2.terminal_p.i[1])
(torn) transformer.conv3.terminal_p.i[1] := transformer.conv2.terminal_p.i[1] - line2.i3[1]
(torn) transformer.conv1.terminal_p.i[1] := transformer.conv2.terminal_p.i[1] + line2.i2[1]
(residual) line2.i1[1] + transformer.conv1.terminal_p.i[1] - transformer.conv3.terminal_p.i[1] = 0

It is immediate to see that the term transformer.conv2.terminal_p.i[1] appears both in transformer.conv3.terminal_p.i[1] and in transformer.conv1.terminal_p.i[1], so when those two are subtracted to compute the residual, its sensitivity w.r.t. its unknown is zero.

I don't know what criterion is used to perform linear tearing, but it's obviously not a good one. Linear systems should be solved with proper pivoting; solving some equations according only to structural information but disregarding the coefficient values can be very dangerous.

Until this issue is solved, I would recommend to disable linear tearing by default. This will sure enough solve a lot of problems in the testsuite. BTW, now that we have sparse solvers to deal with large linear systems, the usefulness of linear tearing is probably quite limited.

Change History (25)

comment:1 by Francesco Casella, 4 years ago

Description: modified (diff)

comment:2 by Francesco Casella, 4 years ago

Milestone: 1.17.01.18.0

comment:3 by Andreas Heuermann, 4 years ago

Until this issue is solved, I would recommend to disable linear tearing by default. This will sure enough solve a lot of problems in the testsuite. BTW, now that we have sparse solvers to deal with large linear systems, the usefulness of linear tearing is probably quite limited.

I totally agree with you. I would even go a step further and don't spend time to improve linear tearing, but instead invest the time in sparse and parallelized (?) linear solvers to work better in the C and/or Cpp runtime.

Also linear tearing was disabled for a long time due to performance reasons. I'm not sure if anybody did a recent benchmark to compare linear tearing + dense linear solver against no linear tearing + sparse linear solver.
Maybe a nice task for a Bachelor thesis or student project?

in reply to:  3 comment:4 by Francesco Casella, 4 years ago

Replying to AnHeuermann:

Until this issue is solved, I would recommend to disable linear tearing by default. This will sure enough solve a lot of problems in the testsuite. BTW, now that we have sparse solvers to deal with large linear systems, the usefulness of linear tearing is probably quite limited.

I totally agree with you.

Can you make the required PR so we can have some test results before feature freeze? Mabye wait a few days before merging it, since we are currently trying out moving to NF by default, I'd try not to mix up the results.

I would even go a step further and don't spend time to improve linear tearing, but instead invest the time in sparse and parallelized (?) linear solvers to work better in the C and/or Cpp runtime.

Possibly. I guess using parallel solvers only makes sense for really big systems. How big I'm not sure, maybe you really need tens or hundreds of thousands variables to really see a difference.

Also linear tearing was disabled for a long time due to performance reasons.

I'm not sure what you mean here. We have --maxSizeLinearTearing = 200 by default, with the limit set because of performance. Do you mean it was set to zero and later increased to 200?

Anyway, I would just set --maxSizeLinearTearing = 0 by default. Next week please, to avoid confusion with test results.

I'm not sure if anybody did a recent benchmark to compare linear tearing + dense linear solver against no linear tearing + sparse linear solver.

I'm not aware of any.

Maybe a nice task for a Bachelor thesis or student project?

Why not? We have more pressing priorities, but this is an interesting subject anyway. The only problem I see is to have enough good test cases. One could use some test cases from the ScalableTestSuite library, but maybe they are somewhat artificial.

comment:5 by Andreas Heuermann, 4 years ago

Anyway, I would just set --maxSizeLinearTearing = 0 by default. Next week please, to avoid confusion with test results.

I think it was disabled slightly different. I would need to search for the exact commit. But yeah, disabling linear tearing for default simulation is no problem.
I'll create a PR in the next days.

in reply to:  5 comment:6 by Francesco Casella, 4 years ago

Replying to AnHeuermann:

I think it was disabled slightly different.

Yes, by using --disableLinearTearing. But that is said to be deprecated, so we should set the maximum size to zero instead :)

I'll create a PR in the next days.

Sounds good. You can actually create it right away, only put a comment to Adrian to wait pushing it until a moment when we are not playing around with the default NF or other major commits.

comment:7 by Francesco Casella, 4 years ago

A fix is pending in PR 6969, waiting to be merged onto master.

comment:8 by Andreas Heuermann, 4 years ago

Cc: Karim Adbdelhak added; Andreas Heuermann removed
Owner: changed from Karim Adbdelhak to Andreas Heuermann
Status: newassigned

comment:9 by Francesco Casella, 4 years ago

Milestone: 1.18.01.17.0

Fine to have this in 1.17.0, if possible.

comment:10 by Francesco Casella, 4 years ago

Milestone: 1.17.01.18.0

Rescheduled to 1.18.0

comment:11 by Francesco Casella, 4 years ago

Description: modified (diff)

comment:12 by Francesco Casella, 4 years ago

@AnHeuermann, maybe for the time being we can try to make strict tearing the default choice, so it avoids the divisions by zero that we are experiencing.

You can try a PR and we see what happens

Last edited 4 years ago by Francesco Casella (previous) (diff)

comment:13 by Francesco Casella, 4 years ago

Cc: Michael Wetter added
Resolution: fixed
Status: assignedclosed

comment:14 by Francesco Casella, 4 years ago

Resolution: fixed
Status: closedreopened

Sorry, I just wanted to add @mwetter in cc:, the issue is still open

comment:15 by Francesco Casella, 3 years ago

Owner: changed from Andreas Heuermann to Karim Adbdelhak
Status: reopenedassigned

Following our discussion at the devmeeting today, I double-checked the documentation of strict tearing, and it says

(Robust tearing rules by consideration of the partial derivative. Allows to divide by parameters that are not equal to or close to zero.)

There are indeed some cases, where equations are solved by dividing by parameters that have a default value of zero at compile time. See, e.g., #6232

However, the case reported here is different. In this case, there is no division by zero _in the torn equations_. What happens instead is that _the residual equations become singular_ because of a bad choice of tearing variables. I guess this may be more difficult to detect.

Since the singularity is structural, it could be detected at compile-time and invalidate the proposed tearing that lead to it, triggering a second (or a third) try.

What do you think?

Last edited 3 years ago by Francesco Casella (previous) (diff)

comment:16 by Francesco Casella, 3 years ago

@Karim, I looked again at this ticket. The original model that was reported with this issue, Buildings.Electrical.AC.ThreePhasesUnbalanced.Validation.IEEETests.Test4NodesFeeder.BalancedStepDown.YD, now fails during the backend with this error:

Warning: The linear system: 
1 : line2.v1_n[1] - line2.v1_p[1] = line2.Z11[1] * line2.i1[1] - line2.Z11[2] * line2.i1[2] + line2.Z12[1] * line2.i2[1] - line2.Z12[2] * line2.i2[2] + line2.Z13[1] * line2.i3[1] - line2.Z13[2] * line2.i3[2]
2 : loadRL.load1.v[1] = line2.v1_p[1] - line2.v2_p[1]
3 : line2.v2_n[1] - line2.v2_p[1] = line2.Z21[1] * line2.i1[1] - line2.Z21[2] * line2.i1[2] + line2.Z22[1] * line2.i2[1] - line2.Z22[2] * line2.i2[2] + line2.Z23[1] * line2.i3[1] - line2.Z23[2] * line2.i3[2]
4 : transformer.conv1.V2[1] = line2.v1_n[1] - line2.v2_n[1]
[
  1.0 , 0.0 , 0.0 , -1.0 ;
  0.0 , 0.0 , 1.0 , -1.0 ;
  0.0 , 1.0 , -1.0 , 0.0 ;
  -1.0 , 1.0 , 0.0 , 0.0
]
  *
[
  line2.v1_n[1] ;
  line2.v2_n[1] ;
  line2.v2_p[1] ;
  line2.v1_p[1]
]
  =
[
  line2.Z11[1] * line2.i1[1] + line2.Z13[1] * line2.i3[1] + line2.Z12[1] * line2.i2[1] - line2.Z11[2] * line2.i1[2] - line2.Z13[2] * line2.i3[2] - line2.Z12[2] * line2.i2[2] ;
  -loadRL.load1.v[1] ;
  line2.Z21[1] * line2.i1[1] + line2.Z23[1] * line2.i3[1] + line2.Z22[1] * line2.i2[1] - line2.Z21[2] * line2.i1[2] - line2.Z23[2] * line2.i3[2] - line2.Z22[2] * line2.i2[2] ;
  -transformer.conv1.V2[1]
]
 might be structurally or numerically singular for variable line2.v1_p[1] since 
 U(4,4) = 0.0. It might be hard to solve. Compilation continues anyway.

Indeed, if you sum the four columns of the A matrix, you get zero, i.e. the system is really singular. I guess this happens because the high-index nature of the system is not catched early on by resolveLoops and ASCC. What do you think?

Last edited 3 years ago by Francesco Casella (previous) (diff)

comment:17 by Francesco Casella, 3 years ago

Maybe the problem was not bogus tearing, but actually singular systems because of missing index reduction?

comment:18 by Francesco Casella, 3 years ago

BTW, I'm not sure why the behaviour changed, since I understand no PR has been merged so far regarding this issue.

comment:19 by Francesco Casella, 3 years ago

Milestone: 1.18.0

Ticket retargeted after milestone closed

comment:20 by Francesco Casella, 3 years ago

Milestone: 1.19.0

1.18.0 blocker tickets moved to 1.19.0

comment:21 by Francesco Casella, 3 years ago

Summary: Bogus linear tearing in Buildings modelUndetected high-index DAE in Buildings model (was: Bogus linear tearing in Buildings model)

comment:22 by Francesco Casella, 3 years ago

@Karim, this looks like a case where neither the ASSC nor the reshuffleLoops algorithm can figure out the hidden constraint between the three-phase currents, hence a singularity when trying to compute the voltages ensues.

comment:23 by Francesco Casella, 3 years ago

@Karim reports there are two possible causes of this issue: either the singularity is not caught by Pantelides/ASSC/Reshuffle loops algorithms, or the initialization problem is still singular for some reason.

In any case, this looks like an individual fringe case, which is best solved in the new backend - it is probably not worth the effort to try to fix this in the old backend, which is going to be retired in one-two years from now.

comment:24 by Francesco Casella, 3 years ago

Summary: Undetected high-index DAE in Buildings model (was: Bogus linear tearing in Buildings model)Undetected high-index DAE or incorrectly handled overdetermined initialization in Buildings model (was: Bogus linear tearing in Buildings model)

comment:25 by Francesco Casella, 3 years ago

The discussion continues in the corresponding GitHub ticket.

Last edited 3 years ago by Francesco Casella (previous) (diff)
Note: See TracTickets for help on using tickets.