id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
4395	Importance of residual scaling for the proper operation of the Kinsol solver	Francesco Casella	Willi Braun	"The Kinsol solver provides means to improve the numerical robustness by allowing to set scaling factors for the unknowns and for the residuals. In the current OMC implementation, the former is used, but the latter is not. 

I have prepared some simple test cases that demonstrate how this can actually be a problem, since the use of {{{SIunits}}}, which is the standard in Modelica, can often lead to such situations. For example, a steam power plant model can contain variables whose order of magnitude is 1e-6 (the turbine admittance, which is a manipulated variable for the power/frequency controller), and others whose order of magnitude is 1e9 (the turbine power, which is a controlled variable).

The test model (attached) represents the series connection of N nonlinear springs, with linearly distributed stiffness coefficients to add some asymmetry, just in case. The endpoint displacement is fixed to 1.

The model is more or less nonlinear depending on {{{d_0}}}. If {{{d_0 >> d_tot}}} the model is only slightly nonlinear around the solution, while it becomes severly nonlinear if {{{d_0 << d_tot}}}. The default is {{{d_0 = 10*d_tot}}}, so the model is only moderately nonlinear: the coefficients in the Jacobian change by at most 10% between the initial guess (which is trivially zero) and the solution.

I have set tearing off to get some more equations to the solver, since the most interesting use case for kinsol is large-scale systems without tearing. The unknowns are the individual relative displacements of the springs {{{s[N]}}}, the absolute displacements of each spring endpoint {{{d[N+1]}}}, and the force acting on the springs (one element of vector {{{F}}}, the others are aliases).

It is also possible to set {{{k_max = k_min}}} to make the analysis simpler, because then basically all elements of vector {{{s[N]}}} become equal and {{{d=linspace(0,1,N+1)}}}, so one only needs to track {{{s[1]}}} and {{{F[1]}}} to understand what happens, as everything else trivially depends on that. On the other hand, by doing this one may get some behaviour which crucially depends of the high (and somewhat artificial) level of simmetry of the problem.

The base class {{{Test}}} is such that the scaling of variables and residuals should be decent without the need of setting the nominal attributes.

{{{Test}}} has only 10 spring elements and the solver converges quickly, as expected.

{{{Test2}}} has 100 spring elements; the convergence at the first attempt is somewhat difficult. This is strange; as the nonlinearity is very mild, I would have expected a faster convergence rate. But that is based on my understanding of Newton's method with just scalar unknown; maybe the high number of dimensions adds more sensitivity to nonlinear behaviour? Or is it because Kinsol is a lot more than a damped quasi-Newton method?

{{{Test3}}} has 1000 spring elements; now the solver fails on the first attempt, and quickly converges on the second. It could definitely be interesting to understand what happens between the first and second attempt, but note that you can get the same behaviour with the smaller {{{Test4}}}.

{{{Test4}}} has 100 spring elements and {{{d_0 >> d_tot}}}, so it is only marginally nonlinear; the elements Jacobian change only on the 4th decimal digit. And yet, the first attempt does not converge, and only the second does. 

{{{Test5}}} has {{{d_0 = 1e7*d_tot}}}, so it is really practically linear, the elements of the Jacobian should change on their 7th digit (maybe we should double-check that, just in case). And yet, kinsol does not converge on the first attempt, as in {{{Test4}}}, which is now really weird.

In {{{Test6}}}, I change the stiffness parameters to a very large values, thus introducing a scaling problem on one unknown ({{{F[99]}}}) and on the set of equations
{{{
 s[i]*(1 + abs(s[i])/s_0)*k[i]*N = F[i];
}}}
whose residual have now order of magnitude 1e8. It is apparent that the solver has a lot more difficulty to converge now. But what is really interesting is that introducing the variable scaling (but not the residual scaling) in {{{Test7}}} via nominal attributes actually makes the problem worse, as the solver completely fails then (!)

My conclusion is thus that residual scaling is absolutely essential to handle models written in SI units with kinsol, as they can have very badly scaled residuals. Variable scaling is not enough. 

I'd encourage to implement some residual scaling mechanism, so that we can see what happens with this simple test case before venturing into the real-life, large-scale models. One option is to obtain the scaling factor for each residual row as the scalar product between the vector of scaling factors for the unknowns and the vector of the absolute values of the corresponding row of the Jacobian.

One should also set very small values for {{{k_min}}} and {{{k_max}}} and explore what happens if one unknown and N residuals are << 1 instead of being >> 1

"	enhancement	closed	high	1.13.0	Run-time		fixed