Opened 8 years ago
Last modified 3 years ago
#3963 new enhancement
Split C source code in more files for more efficient parallel compilation
Reported by: | Francesco Casella | Owned by: | Lennart Ochel |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | Code Generation | Version: | |
Keywords: | | Cc: | |
Description
The MSL and ScalableTestSuite coverage logs
https://test.openmodelica.org/libraries/ModelicaTest_trunk/BuildModelRecursive.html
https://test.openmodelica.org/libraries/ScalableTestSuite_Experimental/BuildModelRecursive.html
reveal that the C compile time is always a significant fraction of the overall build time, and in many cases the bottleneck of the entire process.
Recently, the large header file that prevented the splitting of the C source files into much smaller chunks was removed by Martin S., but the number of C files is still the same. We should avoid very large files and split them into a larger number of smaller ones, e.g. by putting an upper bound on the number of functions contained in each .c file.
I guess the gain/effort ratio of this improvement would be extremely high, even if a comparatively crude splitting strategy were undertaken. Any volunteers?
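To make the idea concrete, here is a minimal sketch (in Python, not part of OMC) of what such a crude splitting strategy could look like: cut a generated C file at top-level function boundaries and emit chunks with a bounded number of functions each, so that `make -j` can compile them in parallel. The brace-counting heuristic, file names, and the shared header are illustrative assumptions, not how the code generator actually works.

```python
# A minimal sketch (not OMC code) of the crude splitting strategy described
# above: cut a generated C file at top-level function boundaries and write
# chunks containing at most MAX_FUNCS functions each, so that `make -j` can
# compile them concurrently. The brace-counting heuristic, file names, and
# the shared header are illustrative assumptions, not how omc actually works.
import re
import sys

MAX_FUNCS = 200  # upper bound on functions per emitted .c file (arbitrary)

def split_c_source(path: str) -> None:
    with open(path) as f:
        lines = f.readlines()

    chunks, current, funcs, depth = [], [], 0, 0
    for line in lines:
        current.append(line)
        depth += line.count("{") - line.count("}")
        # When brace depth returns to zero, assume one top-level function
        # definition just ended (ignores braces in strings/comments).
        if depth == 0 and "}" in line:
            funcs += 1
            if funcs >= MAX_FUNCS:
                chunks.append(current)
                current, funcs = [], 0
    if current:
        chunks.append(current)

    base = re.sub(r"\.c$", "", path)
    for i, chunk in enumerate(chunks, start=1):
        out = f"{base}_part{i}.c"
        with open(out, "w") as f:
            # Each chunk includes a (hypothetical) shared header; a real
            # splitter would also have to replicate the file's preamble or
            # move its declarations into that header.
            f.write(f'#include "{base}.h"\n')
            f.writelines(chunk)
        print(f"wrote {out} ({len(chunk)} lines)")

if __name__ == "__main__":
    split_c_source(sys.argv[1])
```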
Thanks!
Change History (15)
comment:1 by , 8 years ago
This is actually a big problem in some cases. As an example, I have a moderately sized model with only 432 states, but 9k variables, 12k alias variables, and 12k parameters. The 06inz.c and 08bnd.c files are ~100k in size. On Linux they compile fine, but they use a couple of GB of RAM each. On Windows 64-bit, gcc fails with a segmentation fault when the memory used exceeds ~4 GB. This may be a different issue (gcc segfaulting), but it could be remedied by reducing the number of functions in each of these files.
I upvote this improvement.
comment:2 by , 8 years ago
You can use the 64bit version of OpenModelica from the nightly-builds to access more memory:
https://build.openmodelica.org/omc/builds/windows/nightly-builds/
In the 1.9.6 release we only ship a 32-bit version of gcc, whose executables can only access 4 GB of memory.
comment:3 by , 8 years ago
Sorry, I didn't mean to hijack this ticket. My point was more that the compilers shouldn't need to use 4GB of memory to compile a file, regardless of whether the compiler can handle it or not.
FYI, I am using the nightly 64-bit version. It was clearly a compiler issue, since an "internal compiler error: segmentation fault" was returned. I even tried changing the stack size of the cc1.exe program. It may also be a coincidence that it happened at 4 GB.
comment:4 by , 8 years ago
@crupp: no worries, if this issue happened with 64bit GCC then we learned something valuable. Can you confirm that this is the case?
Of course we should try to split the C files. We would need to use some heuristics, as compilation time will go down but link time will go up, so it is an optimization problem.
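As a back-of-the-envelope illustration of that optimization problem (all constants below are assumptions, not measurements): with more .c files the compile step parallelizes better, but the link step grows with the number of object files, so there is a sweet spot for the chunk count.

```python
# Back-of-the-envelope model of the compile/link trade-off; every constant
# below is an assumption, not a measurement.
import math

CORES = 8
TOTAL_FUNCS = 20000        # functions to be compiled (assumed)
COMPILE_PER_FUNC = 0.002   # seconds per function in a small file (assumed)
LINK_BASE, LINK_PER_FILE = 5.0, 0.05  # seconds (assumed)

def build_time(num_files: int) -> float:
    funcs_per_file = TOTAL_FUNCS / num_files
    # Assume per-file compile time grows slightly faster than linearly with
    # file size, since huge translation units are disproportionately slow.
    compile_one = COMPILE_PER_FUNC * funcs_per_file ** 1.2
    parallel_compile = math.ceil(num_files / CORES) * compile_one
    link = LINK_BASE + LINK_PER_FILE * num_files
    return parallel_compile + link

for n in (1, 4, 16, 64, 256, 1024):
    print(f"{n:5d} files -> {build_time(n):7.1f} s")
```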
comment:6 by , 8 years ago
Some work has been done on this: 06inz is now split into multiple parts based on the flag --equationsPerFile, whose default is 2000 equations per file. But more files need to be split in order to actually improve the scalability of the compilation time.
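For reference, a hedged example of how one might request a smaller bound than the 2000-equation default when invoking omc from a script; the flag name is the one mentioned in this comment, while the invocation pattern and file name are assumptions.

```python
# Hypothetical invocation: ask omc for at most 500 equations per generated C
# file instead of the 2000 default; "MyModel.mos" is a made-up script name.
import subprocess

subprocess.run(["omc", "--equationsPerFile=500", "MyModel.mos"], check=True)
```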
comment:8 by , 8 years ago
Milestone: | 1.11.0 → 1.12.0
---|---
Milestone moved to 1.12.0 due to 1.11.0 already being released.
comment:9 by , 7 years ago
Milestone: | 1.12.0 → 1.13.0
---|---
Milestone moved to 1.13.0 due to 1.12.0 already being released.
comment:12 by , 5 years ago
Milestone: | 1.14.0 → 1.16.0
---|---
Releasing 1.14.0, which is stable and has many improvements w.r.t. 1.13.2. This issue is rescheduled to 1.16.0.
comment:14 by , 4 years ago
Milestone: | 1.17.0 → 1.18.0
---|---
Retargeted to 1.18.0 because of the 1.17.0 timed release.