Opened 3 years ago

Last modified 3 years ago

#6350 new defect

OMEdit cannot handle simulations with more than 10000 variables comfortably

Reported by: casella Owned by: adeas31
Priority: critical Milestone: 2.0.0
Component: OMEdit Version: v1.17.0-dev
Keywords: Cc: Karim.Abdelhak, adrpo, lochel, phannebohm, simone1.bosotti@…

Description

I have recently tried to simulate some models with a few tens of thousands of equations, see, e.g., #6271. OMC is not super-fast in this size range, but kind of acceptable, maybe a few minutes for code generation and a few seconds for simulation. When that is finished, the pain begins: switching from the Modelling to the Plotting window sometimes takes forever, for no obvious reason.

To reproduce the problem, please try the attached simple test package. When you run M_10000, visualization is a breeze. But if you then run M_15000, and possibly M_20000, switching to the plotting window may take minutes, with no indication of how long it will take.

This should be investigated and fixed for 1.18.0. If we plan on supporting large models with frontend and backend, OMEdit should be up to the task.

Attachments (1)

TestManyVariables.mo (1.1 KB) - added by casella 3 years ago.


Change History (16)

Changed 3 years ago by casella

comment:1 Changed 3 years ago by anonymous

There is a long wait when switching to the plotting perspective for the first time.
Probably it takes too much time to create the list x[N] when N is large.
Maybe allocation should be made in chunks instead of per individual variable?

comment:2 Changed 3 years ago by casella

What I also noticed is that the effects are strongly nonlinear. Up to a certain size, everything is nearly instantaneous; above a certain threshold (which is of the order of tens of thousands of variables), things rapidly deteriorate to the point of being unusable.

comment:3 follow-up: Changed 3 years ago by adeas31

  • Cc Karim.Abdelhak added

Made some improvements in PR#7298. It is rather quick in adding variables to the tree. However, quite a lot of time is spent reading the model_init.xml and model_info.json files. The reason is that we store the variables in a rather bad format. So for an array we do something like,

<ScalarVariable
    name = "x[1]"
    valueReference = "1000"
    variability = "continuous" isDiscrete = "false"
    causality = "local" isValueChangeable = "false"
    alias = "noAlias"
    classIndex = "0" classType = "rAlg"
    isProtected = "false" hideResult = "false"
    fileName = "C:/Users/adeas31/Downloads/TestManyVariables.mo" startLine = "4" startColumn = "5" endLine = "4" endColumn = "14" fileWritable = "true">
    <Real start="0.0" fixed="false" useNominal="false" />
  </ScalarVariable>
  <ScalarVariable
    name = "x[2]"
    valueReference = "1001"
    variability = "continuous" isDiscrete = "false"
    causality = "local" isValueChangeable = "false"
    alias = "noAlias"
    classIndex = "1" classType = "rAlg"
    isProtected = "false" hideResult = "false"
    fileName = "C:/Users/adeas31/Downloads/TestManyVariables.mo" startLine = "4" startColumn = "5" endLine = "4" endColumn = "14" fileWritable = "true">
    <Real start="0.0" fixed="false" useNominal="false" />
  </ScalarVariable>

I propose we store the array variables like this,

First alternative

  <ScalarVariable
    name = "x"
    start = "1"
    end = "15000"
    valueReference = "1000"
    variability = "continuous" isDiscrete = "false"
    causality = "local" isValueChangeable = "false"
    alias = "noAlias"
    classIndex = "0" classType = "rAlg"
    isProtected = "false" hideResult = "false"
    fileName = "C:/Users/adeas31/Downloads/TestManyVariables.mo" startLine = "4" startColumn = "5" endLine = "4" endColumn = "14" fileWritable = "true">
    <Real start="0.0" fixed="false" useNominal="false" />
  </ScalarVariable>

Second alternative

<ScalarVariable
    name = "x[1-15000]"
    valueReference = "1000"
    variability = "continuous" isDiscrete = "false"
    causality = "local" isValueChangeable = "false"
    alias = "noAlias"
    classIndex = "0" classType = "rAlg"
    isProtected = "false" hideResult = "false"
    fileName = "C:/Users/adeas31/Downloads/TestManyVariables.mo" startLine = "4" startColumn = "5" endLine = "4" endColumn = "14" fileWritable = "true">
    <Real start="0.0" fixed="false" useNominal="false" />
  </ScalarVariable>

This will make reading of files really quick. To speed up things a bit more we also need to make this file store variables in a hierarchy. That allows us to fix other tickets like #5309.
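As a rough illustration of how a reader could consume the second alternative, here is a minimal Python sketch. It is not how OMEdit or the runtime parse the file. The wrapper element, the shortened attribute list, the second variable `y`, and the rule that element i of a range gets valueReference base + (i - first) are all assumptions made for this example; the last one is only inferred from the per-element listing above, where x[1] is 1000 and x[2] is 1001.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical compact file following the "second alternative" above.
# The wrapper element and the variable "y" are made up for this sketch.
COMPACT_XML = """
<ModelVariables>
  <ScalarVariable name="x[1-15000]" valueReference="1000" classType="rAlg">
    <Real start="0.0" fixed="false" useNominal="false" />
  </ScalarVariable>
  <ScalarVariable name="y" valueReference="16000" classType="rAlg">
    <Real start="0.0" fixed="false" useNominal="false" />
  </ScalarVariable>
</ModelVariables>
"""

# Matches names of the form base[first-last], e.g. x[1-15000].
RANGE_NAME = re.compile(r"^(?P<base>.+)\[(?P<first>\d+)-(?P<last>\d+)\]$")


def expand_variables(xml_text):
    """Yield (name, valueReference) for every scalar, expanding range entries.

    Assumption: element i of a range has valueReference base + (i - first).
    """
    root = ET.fromstring(xml_text)
    for var in root.iter("ScalarVariable"):
        name = var.get("name")
        ref = int(var.get("valueReference"))
        match = RANGE_NAME.match(name)
        if match is None:
            # Plain scalar entry: pass it through unchanged.
            yield name, ref
            continue
        first, last = int(match["first"]), int(match["last"])
        for i in range(first, last + 1):
            yield f"{match['base']}[{i}]", ref + (i - first)


variables = list(expand_variables(COMPACT_XML))
```

With this layout the parser visits two XML elements instead of 15001, and the per-element names are generated in memory only when they are needed.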

comment:4 Changed 3 years ago by Karim.Abdelhak

Shouldn't we have different entries like <ArrayVariable> instead?

comment:5 Changed 3 years ago by Karim.Abdelhak

I think the idea of hierarchy and array variables in our XML files is great, but I don't know how much work it is to actually change everything that has to interpret them (C, C++, FMU, OMEdit, ...). After a quick look at the current implementation I am pretty certain that the generation of those files can be done a little quicker; the problem is that we discard the hierarchy information after flattening in the frontend. I would need to recollect it or somehow keep the class information available. But as far as I can see this would be the lesser problem. Can you assess the effort needed to change all the interpreters of those files?
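To make the "recollect it" option concrete, the following Python sketch rebuilds a tree from flattened names. It is only an illustration, not the compiler's data structure: the variable names are hypothetical, and it assumes components are separated by '.' with no '.' inside subscripts or quoted identifiers.

```python
def build_tree(flat_names):
    """Group flattened names such as 'a.b.x' into nested dictionaries.

    Assumption: '.' only ever separates components; names with dots inside
    subscripts or quoted identifiers would need a real tokenizer.
    """
    tree = {}
    for name in flat_names:
        node = tree
        for part in name.split("."):
            # Create the child on first sight, then descend into it.
            node = node.setdefault(part, {})
    return tree


# Hypothetical flattened variable names.
tree = build_tree([
    "pipe.p",
    "pipe.port_a.m_flow",
    "pipe.port_b.m_flow",
    "tank.level",
])
```

Leaves are empty dictionaries here; a real implementation would attach the variable attributes at that point.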

comment:6 Changed 3 years ago by adeas31

  • Cc adrpo lochel phannebohm added

Adding more people to this discussion.

I am not certain about all the areas where this file is used. But here is a list,

adeas31@win01283 MINGW64 /C/OpenModelica
# grep -rn --include=\*.{h,c,cpp} --exclude-dir=build . -e _init.xml
./OMCompiler/SimulationRuntime/c/simulation/simulation_input_xml.c:33: * this file reads the model input from Model_init.xml
./OMCompiler/SimulationRuntime/c/simulation/simulation_input_xml.c:448:      if (0 > GC_asprintf(&filename, "%s/%s_init.xml", omc_flagValue[FLAG_INPUT_PATH], modelData->modelFilePrefix)) {
./OMCompiler/SimulationRuntime/c/simulation/simulation_input_xml.c:454:      if (0 > GC_asprintf(&filename, "%s_init.xml", modelData->modelFilePrefix)) {
./OMCompiler/SimulationRuntime/c/util/simulation_options.c:207:  /* FLAG_INPUT_PATH */                   "value specifies a path for reading the input files i.e., model_init.xml and model_info.json",
./OMCompiler/SimulationRuntime/c/util/simulation_options.c:404:  "  Value specifies a path for reading the input files i.e., model_init.xml and model_info.json",
./OMCompiler/SimulationRuntime/OMSI/base/src/omsi_initialization.c:185:    sprintf(initXMLFilename, "%s/%s_init.xml", omsi_resource_location, modelName);
./OMCompiler/SimulationRuntime/OMSI/base/src/omsi_input_xml.c:37: * \brief Process modelName_init.xml file
./OMCompiler/SimulationRuntime/OMSI/base/src/omsi_input_xml.c:39: * Functions to process informations from \<modelName\>_init.xml file in
./OMCompiler/SimulationRuntime/OMSI/base/src/omsi_input_xml.c:56: * \brief Processes modelName_init.xml file to get additional model infos.
./OMCompiler/SimulationRuntime/OMSI/base/src/omsi_input_xml.c:62: * \param filename          Absolute path to modelName_init.xml file
./OMCompiler/SimulationRuntime/OMSI/base/src/omsi_input_xml.c:63: * \param fmuGUID           Globally unique identifier to check that modelName_init.xml
./OMCompiler/SimulationRuntime/OMSI/base/src/omsi_input_xml.c:247:     * ToDo: Solution 3: Edit generation of ..._init.xml and add aliasVariableValueReference
./OMCompiler/SimulationRuntime/OMSI/include/omsi.h:521:    omsi_int        id;                     /* unique value reference from *_init.xml */
./OMCompiler/SimulationRuntime/OMSICpp/omsi/src/fmi2/detail/omsi_fmi2_wrapper.cpp:598:fs::path model_name_path(_model->getModelName() + ("_init.xml"));
./OMEdit/OMEditLIB/Plotting/VariablesWidget.cpp:522: * Parses the model_init.xml file and returns the scalar variables information.
./OMEdit/OMEditLIB/Plotting/VariablesWidget.cpp:581:  /* open the model_init.xml file for reading */
./OMEdit/OMEditLIB/Plotting/VariablesWidget.cpp:585:    initFileName = QString("%1_init.xml").arg(simulationOptions.getOutputFileName());
./OMEdit/OMEditLIB/Plotting/VariablesWidget.cpp:588:    initFileName = QString("%1_init.xml").arg(text);
./OMEdit/OMEditLIB/Plotting/VariablesWidget.cpp:1627:  /* Update the _init.xml file with new values. */
./OMEdit/OMEditLIB/Plotting/VariablesWidget.cpp:1628:  /* open the model_init.xml file for writing */
./OMEdit/OMEditLIB/Plotting/VariablesWidget.cpp:1629:  QString initFileName = QString(simulationOptions.getOutputFileName()).append("_init.xml");
./OMEdit/OMEditLIB/Simulation/SimulationOutputWidget.cpp:383:    /* className_init.xml tab */
./OMEdit/OMEditLIB/Simulation/SimulationOutputWidget.cpp:384:    addGeneratedFileTab(QString("%1/%2%3").arg(workingDirectory, outputFile).arg("_init.xml"));
./OMOptim/OMOptim/Core/OpenModelica/ModPlusOMCtrl.cpp:54:    _initFileXml = _ModelPlus->modelName()+"_init.xml";

comment:7 in reply to: ↑ 3 ; follow-up: Changed 3 years ago by ceraolo

This will make reading of files really quick. To speed up things a bit more we also need to make this file store variables in a hierarchy. That allows us to fix other tickets like #5309.

Maybe also #5177?

comment:8 in reply to: ↑ 7 Changed 3 years ago by casella

Replying to ceraolo:

Maybe also #5177?

Yes, the idea is to also take care of that. And, possibly, be consistent with the forthcoming implementation of FMI 3.0, which can handle arrays natively.

comment:9 follow-up: Changed 3 years ago by adeas31

@casella I did some more debugging and it was model_info.json that was slow instead of model_init.xml. Basically the custom Qt library (http://qjson.sourceforge.net/) that we use for json parsing is not very efficient for big data.

I made some more optimizations with PR#7315 and PR#7317; the latter one is waiting for approval from Martin. With these optimizations the biggest model, M_40000, takes around 3-4 seconds, compared to a few minutes before.

comment:10 in reply to: ↑ 9 Changed 3 years ago by casella

  • Cc simone1.bosotti@… added

Replying to adeas31:

@casella I did some more debugging and it was model_info.json that was slow instead of model_init.xml. Basically the custom Qt library (http://qjson.sourceforge.net/) that we use for json parsing is not very efficient for big data.

In fact, I was spending some time with my students today, trying to debug a 12,000 equations model of a gas network, and it still took a substantial amount of time to load the output data.

I made some more optimizations with PR#7315 and PR#7317; the latter one is waiting for approval from Martin. With these optimizations the biggest model, M_40000, takes around 3-4 seconds, compared to a few minutes before.

Excellent!

@simone, when you see that PR#7317 has been merged in (purple icon), please download the nightly on the following day and try it out.

comment:11 Changed 3 years ago by casella

@adeas31, I guess we have a related issue with the Transformational Debugger.

I ran the ScalableTestGrids.Models.Type1_N_2_M_4 model (12,000 equations) with the latest nightly, and it took 20 seconds to display the debugger windows. I then ran ScalableTestGrids.Models.Type1_N_3_M_4, which has 28,000 equations, and it took over two minutes.

Of course it is more likely that large models fail, compared to small ones, so being able to manage models over 10,000 equations in the debugger is essential.

Do your PRs also address this automatically, or is it a separate issue?

comment:12 Changed 3 years ago by adeas31

Yes, it was the same problem with the Transformational Debugger.

I have fixed it in 2d67bab/OpenModelica.

comment:13 Changed 3 years ago by casella

Great, thanks!

@simone, would you mind trying this out with a nightly build next week and reporting back?

comment:14 Changed 3 years ago by Simone Bosotti <simone1.bosotti@…>

I tried my model with about 12000 variables, and the speed has been substantially increased.
Time to load the results after the simulation:

  • old builds: from 10 to 15 seconds
  • new builds: from 5 to 10 seconds

Time to load the transformational debugger:

  • old builds: not measured with accuracy, but more than 3 minutes
  • new builds: less than 20 seconds

The last test has been performed on OpenModelica v1.18.0-dev-199-g8309b36462 (64-bit)
Windows 10, Intel i7, 16 GB RAM.

comment:15 Changed 3 years ago by casella

  • Milestone changed from 1.18.0 to 2.0.0

Thanks @simone for reporting!

@adeas31, I think what was achieved so far is quite good already, though there is probably some margin for further improvements. I am rescheduling this issue for 2.0.0, since I guess we have a lot of higher priority improvements to worry about at the moment. Feel free to make improvements for 1.18.0 if you have something in the pipeline already.

For testing, I recommend using the ScalableTestGrids library, which has plenty of models of substantial size, with samples of doubling size to test scalability. First and foremost, we have to make sure that the time spent to load results does not scale in a superlinear or unpredictable way with the model size. Maybe that is already the case after your latest improvements. Then, you may want to check if there is any bottleneck that can be removed to further improve the performance without too much effort.

Thanks!
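One possible way to quantify the scalability check described above is to fit the slope of log(time) against log(size). The Python sketch below is only a suggestion: the function name is made up, and the sample data are the debugger load times reported earlier in this ticket, where "over two minutes" is taken as 120 s and is therefore a lower bound.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 suggests linear scaling; a slope well above 1 suggests
    superlinear growth.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Debugger load times from this ticket before the fix:
# 12,000 equations in about 20 s, 28,000 equations in at least 120 s.
exponent = scaling_exponent([12_000, 28_000], [20.0, 120.0])
```

For these two points the exponent comes out slightly above 2, which matches the strongly nonlinear behaviour described at the start of the ticket. Running the same fit on the doubling-size ScalableTestGrids models, with more than two measurements, would give a more reliable estimate.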
