﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
4879	clang optimizations for large models	Francesco Casella	Martin Sjölund	"A few weeks ago the default optimization level for clang compilation was changed from {{{-O0}}} to {{{-Os}}}. This has a positive impact on simulation time, because the code is optimized for size and better fits the CPU cache, reducing the cache miss overhead. On the other hand, it increases the C-code compilation time.

Most of the models of the [https://libraries.openmodelica.org/branches/master/Buildings_latest/Buildings_latest.html MSL] take only a few seconds for C compilation even with {{{-Os}}}, and the few that have fairly long simulation times, e.g., {{{EngineV6}}} or {{{BranchingDynamicPipes}}}, benefit a lot from it.

On the other hand, the impact on many larger models, such as those in the {{{ScalableTestSuite}}}, is definitely unfavourable. In that library, the average effect is that the simulation time is reduced by 20-50%, but the C compilation time is increased 4X, and the compilation time is usually one or more orders of magnitude larger than the simulation time.
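
The trade-off can be made concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that the cost per model is C compilation time plus simulation time, that {{{-Os}}} multiplies the compilation time by 4, and that it cuts the simulation time by 50% (the upper end of the range above). The function names and sample timings are hypothetical.

```python
def total_time(compile_s, sim_s, compile_factor=1.0, sim_reduction=0.0):
    # Total wall-clock cost: C compilation plus simulation, in seconds.
    return compile_s * compile_factor + sim_s * (1.0 - sim_reduction)


def os_pays_off(compile_s, sim_s, compile_factor=4.0, sim_reduction=0.5):
    # -Os wins only if the simulation time saved exceeds the extra compile time.
    baseline = total_time(compile_s, sim_s)
    optimized = total_time(compile_s, sim_s, compile_factor, sim_reduction)
    return optimized < baseline


# Hypothetical model whose compilation is 10x longer than its simulation.
print(os_pays_off(compile_s=100.0, sim_s=10.0))
# Hypothetical model whose simulation is 10x longer than its compilation.
print(os_pays_off(compile_s=2.0, sim_s=20.0))
```

Under these assumptions, {{{-Os}}} only pays off when the simulation takes more than 6 times as long as the {{{-O0}}} compilation, since 4c + 0.5s < c + s is equivalent to s > 6c.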

For example, if you compare the performance [https://libraries.openmodelica.org/libraries/history/ScalableTestSuite_Experimental/ScalableTestSuite_Experimental-2018-03-01.html before] and [https://libraries.openmodelica.org/libraries/history/ScalableTestSuite_Experimental/ScalableTestSuite_Experimental-2018-04-10.html after] the change, 
the C compilation times of the largest models, {{{DistributionSystemModelica_N_112_M_112}}} and {{{DistributionSystemModelicaActiveLoads_N_80_M_80}}}, jumped from 130 and 80 s to 590 and 260 s respectively, while the reduction in simulation times was only a few seconds. That is clearly sub-optimal.
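
As a quick check of those figures, the sketch below computes the slowdown factor and the extra seconds per model. It assumes that the quoted numbers are C compilation times in seconds; the helper name {{{slowdown}}} is made up for this example.

```python
# Times in seconds, as quoted above for the two largest models.
before = {
    'DistributionSystemModelica_N_112_M_112': 130,
    'DistributionSystemModelicaActiveLoads_N_80_M_80': 80,
}
after = {
    'DistributionSystemModelica_N_112_M_112': 590,
    'DistributionSystemModelicaActiveLoads_N_80_M_80': 260,
}


def slowdown(before_s, after_s):
    # Returns (ratio after/before, extra seconds).
    return after_s / before_s, after_s - before_s


for name in before:
    ratio, extra = slowdown(before[name], after[name])
    print(f'{name}: {ratio:.2f}x, +{extra} s')
```

The increase is roughly 4.5X and 3.25X, i.e. 460 s and 180 s of extra time, against a saving of only a few seconds in simulation.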

In general, the best choice depends on the specific use case.

I guess the optimal solution would be to apply {{{-Os}}} to all parts that are executed during simulation, while applying {{{-O0}}} to all initialization-related code, which usually takes a negligible fraction of the simulation time.
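
A minimal sketch of that idea, not a description of how the OpenModelica build currently works: pick the optimization flag per generated C file. The name fragments used to recognise initialization-related code ({{{_init}}}, {{{_inz}}}) and the example file names are hypothetical placeholders.

```python
import os

# Hypothetical name fragments marking initialization-related C files.
INIT_MARKERS = ('_init', '_inz')


def opt_flag(c_file, init_markers=INIT_MARKERS):
    # -O0 for initialization code (fast to compile), -Os for everything else.
    stem = os.path.splitext(os.path.basename(c_file))[0]
    if any(marker in stem for marker in init_markers):
        return '-O0'
    return '-Os'


def compile_commands(c_files, compiler='clang'):
    # Build one compile command per file; nothing is executed here.
    return [[compiler, opt_flag(f), '-c', f] for f in c_files]


for cmd in compile_commands(['Model_init.c', 'Model_ode.c']):
    print(' '.join(cmd))
```

Keeping the decision in a single function makes it easy to try other splits, e.g. also compiling rarely executed code with {{{-O0}}}.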

It is also advisable to optimize the generated C code, e.g., by avoiding unnecessary macro expansions (see #4871) and by moving all structural information (incidence matrices and similar) from the C sources to XML files.

For the time being, I would suggest also running the ScalableTestSuite with {{{-O0}}}, so we can actually check all the other performance figures of big models without incurring timeouts. As C-compilation time takes the lion's share and is reduced dramatically with {{{-O0}}}, the additional burden on the testsuite is probably going to be limited."	enhancement	closed	high	1.13.0	Code Generation		fixed		Willi Braun
