Opened 7 years ago

Last modified 3 years ago

#4845 assigned defect

Tearing of linear systems produces singular system out of moderately-sized, well-posed models

Reported by: Francesco Casella
Owned by: Andreas Heuermann
Priority: high
Milestone:
Component: Backend
Version:
Keywords:
Cc: Karim Adbdelhak

Description

Please consider the ScalableTestSuite.Mechanical.HarmonicOscillator.ScaledExperiments.HarmonicOscillatorNetwork_N_XX test models. They require solving a linear system of N equations to compute the accelerations of the N point masses involved. For N > 40, the solver fails repeatedly with error messages like:

Failed to solve linear system of equations (no. 322) at time 0.000000. 
Residual norm is 14.0094302893603.
The default linear solver fails, the fallback solver with total pivoting
is started at time 0.000000. That might raise performance issues, for
more information use -lv LOG_LS

In fact, the outcome is particularly bad, because the solver does not abort, but keeps on trying forever. Even if the Cancel simulation button is pressed, the .exe file keeps on running in the background unless it is killed with the Process Manager, which is quite bad.

The debugger reveals that the linear system of size N is very effectively torn, ending up with just one tearing variable. Unfortunately, the resulting torn system is ill-conditioned for even moderate values of N.

From what I understand, the problem is that the (k+1)-th torn variable is given by 3 times the k-th one, plus other terms. As a consequence, the N-th torn variable ultimately depends on 3^N times the tearing variable, which is obviously not going to be numerically well-posed for N much larger than 20.

The system per se is well-posed and is solved without problems, even for much larger sizes, if tearing is switched off and, for large values of N, a sparse solver is possibly used.
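The contrast can be reproduced on a toy problem. The Python sketch below is only an illustration under stated assumptions: it does not use the actual HarmonicOscillatorNetwork equations, but a hypothetical tridiagonal system -v[k-1] + 3*v[k] - v[k+1] = b[k], chosen because tearing it with a single tearing variable gives the same "3 times the previous one, plus other terms" pattern. All function names are made up for this example. It solves the system once through the torn recursion and once directly with tridiagonal elimination, and compares the residuals.

```python
def residual_norm(v, b):
    """Max-norm residual of -v[k-1] + 3*v[k] - v[k+1] = b[k] (zero boundary values)."""
    n = len(v)
    worst = 0.0
    for k in range(n):
        left = v[k - 1] if k > 0 else 0.0
        right = v[k + 1] if k < n - 1 else 0.0
        worst = max(worst, abs(-left + 3.0 * v[k] - right - b[k]))
    return worst


def propagate(x, b):
    """Torn equations: from the tearing variable x = v[0], compute all other v[k]."""
    v = [x]
    prev = 0.0
    for k in range(len(b) - 1):
        nxt = 3.0 * v[k] - prev - b[k]  # equation k solved for v[k+1]
        prev = v[k]
        v.append(nxt)
    return v


def solve_torn(b):
    """Solve the scalar residual equation (the last equation) for x, then propagate."""
    def last_residual(x):
        v = propagate(x, b)
        left = v[-2] if len(v) > 1 else 0.0
        return -left + 3.0 * v[-1] - b[-1]

    r0 = last_residual(0.0)
    slope = last_residual(1.0) - r0  # the residual is affine in x
    return propagate(-r0 / slope, b)


def solve_direct(b):
    """Tridiagonal elimination (Thomas algorithm) on the untorn system."""
    n = len(b)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -1.0 / 3.0
    d[0] = b[0] / 3.0
    for k in range(1, n):
        denom = 3.0 + c[k - 1]
        c[k] = -1.0 / denom
        d[k] = (b[k] + d[k - 1]) / denom
    v = [0.0] * n
    v[-1] = d[-1]
    for k in range(n - 2, -1, -1):
        v[k] = d[k] - c[k] * v[k + 1]
    return v


if __name__ == "__main__":
    for n in (10, 20, 40, 60):
        b = [1.0] * n
        print(f"N = {n:2d}: torn residual = {residual_norm(solve_torn(b), b):.3e}, "
              f"direct residual = {residual_norm(solve_direct(b), b):.3e}")
```

In this toy case the matrix itself is diagonally dominant and well-conditioned for any N, so the direct solve stays accurate, while the torn recursion amplifies rounding errors in the tearing variable by a factor that grows exponentially with N.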

Clearly, tearing is not a good idea to solve this class of systems. I think we need some mechanism to identify them and prevent (or limit) the use of tearing in such cases. The current behaviour, i.e., getting an ill-posed system and failing badly, is not acceptable.

BTW, note that I did not build these test cases specifically to cause this outcome.

Change History (12)

comment:1 by Lennart Ochel, 7 years ago

Owner: Lennart Ochel removed
Status: new → assigned

comment:2 by Francesco Casella, 7 years ago

Cc: Lennart Ochel added

@lochel why did you remove yourself as owner?

comment:3 by Lennart Ochel, 7 years ago

I don't have time to work on it myself and since Patrick left the dev team, I don't know to whom this should be assigned.
So I removed myself to indicate that I cannot take care of this issue.

comment:4 by Francesco Casella, 7 years ago

OK, it looked strange that the ticket was assigned to no-one, but now I understand you can remove yourself while leaving the ticket unassigned.

comment:5 by Francesco Casella, 6 years ago

Milestone: 1.13.0 → 1.14.0

Rescheduled to 1.14.0 after 1.13.0 release

comment:6 by Francesco Casella, 5 years ago

Milestone: 1.14.0 → 1.16.0

Releasing 1.14.0 which is stable and has many improvements w.r.t. 1.13.2. This issue is rescheduled to 1.16.0

comment:7 by Francesco Casella, 5 years ago

Cc: Karim Adbdelhak added; Willi Braun, Patrick Täuber, Bernhard Bachmann, Lennart Ochel removed
Owner: set to Andreas Heuermann

Interesting borderline case for tearing, discovered with the ScalableTestSuite library.

Tearing is ok from a structural point of view, but numerically it becomes ill-posed for N > 20.

Definitely not urgent, but anyway well worth having a look. Similar situations may arise in real life large user models, and will fail spectacularly.

comment:8 by Karim Adbdelhak, 5 years ago (in reply to: comment:7)

Replying to casella:

> Interesting borderline case for tearing, discovered with the ScalableTestSuite library.
>
> Tearing is ok from a structural point of view, but numerically it becomes ill-posed for N > 20.
>
> Definitely not urgent, but anyway well worth having a look. Similar situations may arise in real life large user models, and will fail spectacularly.

What makes it ill-posed and why at N > 20? Is the Jacobian close to singularity?

comment:9 by Francesco Casella, 5 years ago (in reply to: description)

It's explained in the description of the ticket:

> From what I understand, the problem is that the (k+1)-th torn variable is given by 3 times the k-th one, plus other terms. As a consequence, the N-th torn variable ultimately depends on 3^N times the tearing variable, which is obviously not going to be numerically well-posed for N much larger than 20.

x: tearing variable
vj: torn variables

v1 = 3*x;
v2 = 3*v1;
v3 = 3*v2;
...
vN = 3*v{N-1};

The sensitivity of the last torn variable to small variations of the tearing variable is 3^N, which is really too much when N > 20: since 3^20 ≈ 3.5e9, a variation of 1e-10 of the tearing variable already gives a variation of about 0.35 of the torn variable at N = 20, and more for larger N. Hence, getting f(x) close enough to zero becomes very difficult for the solver, because the effects of machine precision are greatly amplified.
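As a rough check of the arithmetic, the following Python sketch evaluates the simplified chain above (each torn variable equal to 3 times the previous one, with the "other terms" left out) and shows how a perturbation of the tearing variable grows with N, i.e. as 3 to the power N. The function name is made up for this illustration and the chain is a simplification of the real model, not the actual torn system.

```python
def variation_of_last_torn_variable(dx, n):
    """Change in vN caused by a change dx in the tearing variable x.

    The chain v1 = 3*x, v2 = 3*v1, ..., vN = 3*v{N-1} is linear, so in
    exact arithmetic the change is dx * 3**n.
    """
    dv = dx
    for _ in range(n):
        dv = 3.0 * dv  # one link of the chain
    return dv


if __name__ == "__main__":
    dx = 1e-10  # perturbation of the tearing variable
    for n in (10, 20, 30, 40):
        print(f"N = {n:2d}: 3^N = {3.0 ** n:.3e}, "
              f"change in vN = {variation_of_last_torn_variable(dx, n):.3e}")
```

Since 3^20 = 3486784401, a perturbation of 1e-10 is already of order 0.1 to 1 at the end of a chain of length 20, and the amplification keeps growing by a factor of 3 with each additional link.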

Last edited 5 years ago by Francesco Casella

comment:10 by Francesco Casella, 4 years ago

Milestone: 1.16.0 → 1.17.0

Retargeted to 1.17.0 after 1.16.0 release

comment:11 by Francesco Casella, 4 years ago

Milestone: 1.17.0 → 1.18.0

Retargeted to 1.18.0 because of 1.17.0 timed release.

comment:12 by Francesco Casella, 3 years ago

Milestone: 1.18.0

Ticket retargeted after milestone closed
