Opened 7 years ago
Last modified 3 years ago
#4845 assigned defect
Tearing of linear systems produces singular system out of moderately-sized, well-posed models
Reported by: | Francesco Casella | Owned by: | Andreas Heuermann |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | Backend | Version: | |
Keywords: | Cc: | Karim Abdelhak |
Description
Please consider the ScalableTestSuite.Mechanical.HarmonicOscillator.ScaledExperiments.HarmonicOscillatorNetwork_N_XX test models. They require solving a linear system of N equations to compute the accelerations of the N point masses involved. For N > 40, the solver fails repeatedly with error messages like:
Failed to solve linear system of equations (no. 322) at time 0.000000. Residual norm is 14.0094302893603. The default linear solver fails, the fallback solver with total pivoting is started at time 0.000000. That might raise performance issues, for more information use -lv LOG_LS
In fact, the outcome is particularly bad, because the solver does not abort, but keeps on trying forever. Even if the Cancel simulation button is pressed, the .exe file keeps on running in the background unless it is killed with the Process Manager, which is quite bad.
The debugger reveals that the linear system of size N is very effectively torn, ending up with just one tearing variable. Unfortunately, the resulting torn system is ill-conditioned for even moderate values of N.
From what I understand, the problem is that the (k+1)-th torn variable is given by 3 times the k-th one, plus other terms. As a consequence, the N-th torn variable ultimately depends on 3^N times the tearing variable, which is obviously not going to be numerically well-posed for N much larger than 20.
The system per se is well-posed and solved without problems also for much larger sizes if tearing is switched off, and possibly a sparse solver is used for large values of N.
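The effect can be reproduced with a minimal Python sketch. It does not use the actual HarmonicOscillatorNetwork equations: the tridiagonal system tridiag(-1, 10/3, -1) with a right-hand side of ones is a hypothetical stand-in, chosen only because its forward recurrence grows by roughly a factor of 3 per step while the matrix itself is diagonally dominant and well-conditioned. The helper names (solve_direct, propagate, solve_torn, max_residual, compare) are made up for this illustration.

```python
A = 10.0 / 3.0  # assumed diagonal entry; recurrence roots are 3 and 1/3


def solve_direct(n, a, b):
    """Solve tridiag(-1, a, -1) * v = b without tearing (Thomas algorithm)."""
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -1.0 / a
    d[0] = b[0] / a
    for k in range(1, n):
        denom = a + c[k - 1]
        c[k] = -1.0 / denom
        d[k] = (b[k] + d[k - 1]) / denom
    v = [0.0] * n
    v[-1] = d[-1]
    for k in range(n - 2, -1, -1):
        v[k] = d[k] - c[k] * v[k + 1]
    return v


def propagate(x, n, a, b):
    """Torn evaluation: x = v[0] is the single tearing variable.

    Equations 0..n-2 are solved explicitly for v[1]..v[n-1];
    the last equation is returned as the residual f(x).
    """
    v = [x, a * x - b[0]]
    for k in range(1, n - 1):
        v.append(a * v[k] - v[k - 1] - b[k])
    residual = -v[n - 2] + a * v[n - 1] - b[n - 1]
    return v, residual


def solve_torn(n, a, b):
    """Solve the scalar torn system f(x) = 0; f is linear in x."""
    _, r0 = propagate(0.0, n, a, b)
    _, r1 = propagate(1.0, n, a, b)
    x = -r0 / (r1 - r0)
    v, _ = propagate(x, n, a, b)
    return v


def max_residual(v, a, b):
    """Largest absolute residual of the original (untorn) equations."""
    n = len(v)
    worst = 0.0
    for k in range(n):
        left = v[k - 1] if k > 0 else 0.0
        right = v[k + 1] if k < n - 1 else 0.0
        worst = max(worst, abs(-left + a * v[k] - right - b[k]))
    return worst


def compare(n, a=A):
    """Largest difference between the torn and the direct solution."""
    b = [1.0] * n
    direct = solve_direct(n, a, b)
    torn = solve_torn(n, a, b)
    return max(abs(t - d) for t, d in zip(torn, direct))


for n in (10, 20, 40, 60):
    b = [1.0] * n
    print(f"N = {n:2d}: torn vs direct = {compare(n):.3e}, "
          f"direct residual = {max_residual(solve_direct(n, A, b), A, b):.3e}")
```

Under these assumptions, the torn solution should agree with the direct one for small N and drift away by many orders of magnitude for N around 60, while the direct solve keeps its residual near machine precision.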
Clearly, tearing is not a good idea to solve this class of systems. I think we need some mechanism to identify them and prevent (or limit) the use of tearing in such cases. The current behaviour, i.e., getting an ill-posed system and failing badly, is not acceptable.
BTW, note that I did not build these test cases specifically to cause this outcome.
Change History (12)
comment:1 by , 7 years ago
Owner: | removed |
---|---|
Status: | new → assigned |
comment:2 by , 7 years ago
Cc: | added |
---|
comment:3 by , 7 years ago
I don't have time to work on it myself and since Patrick left the dev team, I don't know to whom this should be assigned.
So I removed myself to indicate that I cannot take care of this issue.
comment:4 by , 7 years ago
OK, it looked strange that the ticket was assigned to no-one, but now I understand you can remove yourself while leaving the ticket unassigned.
comment:6 by , 5 years ago
Milestone: | 1.14.0 → 1.16.0 |
---|
Releasing 1.14.0 which is stable and has many improvements w.r.t. 1.13.2. This issue is rescheduled to 1.16.0
follow-up: 8 comment:7 by , 5 years ago
Cc: | added; removed |
---|---|
Owner: | set to |
Interesting borderline case for tearing, discovered with the ScalableTestSuite library.
Tearing is ok from a structural point of view, but numerically it becomes ill-posed for N > 20.
Definitely not urgent, but anyway well worth having a look. Similar situations may arise in real life large user models, and will fail spectacularly.
comment:8 by , 5 years ago
Replying to casella:
> Interesting borderline case for tearing, discovered with the ScalableTestSuite library.
> Tearing is ok from a structural point of view, but numerically it becomes ill-posed for N > 20.
> Definitely not urgent, but anyway well worth having a look. Similar situations may arise in real life large user models, and will fail spectacularly.
What makes it ill-posed, and why at N > 20? Is the Jacobian close to singular?
comment:9 by , 5 years ago
It's explained in the description of the ticket
> From what I understand, the problem is that the (k+1)-th torn variable is given by 3 times the k-th one, plus other terms. As a consequence, the N-th torn variable ultimately depends on 3^N times the tearing variable, which is obviously not going to be numerically well-posed for N much larger than 20.
x: tearing variable
vj: torn variables
v1 = 3*x;
v2 = 3*v1;
v3 = 3*v2;
...
vN = 3*v{N-1};
The sensitivity of the last torn variable to small variations of the tearing variable is 3^N, which is really too much when N > 20: already at N = 20, a variation of 1e-10 of the tearing variable gives a variation of about 0.35 of the torn variable. Hence, getting f(x) close enough to zero becomes very difficult for the solver, because the effects of machine precision are greatly amplified.
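The arithmetic can be checked with a few lines of Python. This is only a sketch of the idealized chain v1 = 3*x, ..., vN = 3*v{N-1} written above, with the "other terms" of the real equations left out; torn_chain and sensitivity are hypothetical helper names.

```python
def torn_chain(x, n):
    """Propagate v1 = 3*x, v2 = 3*v1, ..., vN = 3*v{N-1} and return vN."""
    v = x
    for _ in range(n):
        v = 3.0 * v
    return v


def sensitivity(n, dx=1e-10):
    """Change of vN when the tearing variable moves from 1.0 to 1.0 + dx."""
    return torn_chain(1.0 + dx, n) - torn_chain(1.0, n)


for n in (10, 20, 30, 40):
    print(f"N = {n:2d}: 3^N = {3.0 ** n:.3e}, "
          f"change in vN for dx = 1e-10: {sensitivity(n):.3e}")
```

With double precision the relative rounding error of the tearing variable is about 1e-16, so the same amplification factor decides how many correct digits are left in vN.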
comment:11 by , 4 years ago
Milestone: | 1.17.0 → 1.18.0 |
---|
Retargeted to 1.18.0 because of 1.17.0 timed release.
@lochel why did you remove yourself as owner?