Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#3588 closed defect (fixed)

diffModelicaFileListings consumes huge time and memory

Reported by: Adeel Asghar Owned by: Martin Sjölund
Priority: blocker Milestone: 1.9.4
Component: Interactive Environment Version:
Keywords: Cc: Adrian Pop, massimo ceraolo, Francesco Casella

Description

When diffModelicaFileListings is used with huge strings (which is common case when you working for example in MSL) it uses alot of CPU and memory. Most of the time crashing out with Too many heap sections error.

When I run the attached script on my machine then OMC is consuming 20% of CPU and almost more than 2GB of memory.

Attachments (1)

script.mos (668 bytes ) - added by Adeel Asghar 9 years ago.

Download all attachments as: .zip

Change History (14)

by Adeel Asghar, 9 years ago

Attachment: script.mos added

comment:1 by Adeel Asghar, 9 years ago

Cc: massimo ceraolo Francesco Casella added

comment:2 by Martin Sjölund, 9 years ago

Well, diff algorithms scale O(ND). There is a huge number of diffs in there, and Myer's diff algorithm will take upwards of 2e10 iterations :)

Perhaps it would be better to implement a simple Modelica parser on top of the lexer (possibly just scanning for semicolons and ignoring comments to make a grouping), in order to only perform diffs on elements that were changed (similar to a line diff)?

comment:3 by Martin Sjölund, 9 years ago

The thing that consumes the memory is the myer's paths. So we need to do something about the diff algorithm before a release :)

comment:4 by Adrian Pop, 9 years ago

As I understand it, the changes are rather localized to a class so we should only apply the diff algorithm to that part of the file or buffer. Shouldn't this be possible?

comment:5 by Martin Sjölund, 9 years ago

We try to do this, but I guess in this case the unparsing does not match the Dymola one very well, and so we fail and resort to a general diff. But yes, it would be possible if we add some grouping of tokens into elements or similar, and do a diff on those to find out which parts of the file changed.

comment:6 by Francesco Casella, 9 years ago

Sorry for the naive question, is this related to the new commits by Adeel? Or was it already there before?

comment:7 by Martin Sjölund, 9 years ago

This is new functionality

comment:8 by Martin Sjölund, 9 years ago

My new approach handles this a lot faster:

Notification: Performance of diffModelicaFileListings scan string 1: time 0.02594/0.02594
Notification: Performance of diffModelicaFileListings parse string 1: time 0.06621/0.09221
Notification: Performance of diffModelicaFileListings scan string 2: time 0.03426/0.1265
Notification: Performance of diffModelicaFileListings parse string 2: time 0.2994/0.426
Notification: Performance of treeDiff: time 0.001079/0.4271

But it does not yet consider whitespace changes and indentation in as nice a way as the previous approach. But at least the basic algorithm runs a lot faster. Note that it will run slower once the whitespace issue is fixed (because it will go deeper into the trees to find out where the actual diff resides; right now it just thinks the diffs are too many and outputs suboptimal results).

It is based on an LL(k) hand-writen recursive-descent parser (written in MetaModelica). I had intended to use ANTLR, but the unmaintained C version was not powerful enough.

The treeDiff algorithm can be improved as well (does a lot of substring calls to compare tokens and print the output to Print.mo, which currently dominates execution time; adding support to do a string compare with offset as well as Print.mo taking an offset+length). This is more apparent for larger models (then treeDiff takes longer than parsing even though the number of diffs is only equal to 2; this is simply a scaling issue).

comment:9 by Martin Sjölund, 9 years ago

The diff looks almost good now (updateComponent moves the component, which it should not do). And there is a whitespace issue for the within clause...

--- old.mo  2015-12-29 09:55:28.218333811 +0100
+++ merged.mo 2015-12-29 09:55:29.442376401 +0100
@@ -1,5 +1,4 @@
-within Modelica.Electrical;
-package PowerConverters "Rectifiers, Inverters and DC/DC converters"
+within Modelica.Electrical;package PowerConverters "Rectifiers, Inverters and DC/DC converters"
   extends Modelica.Icons.Package;
   package UsersGuide "User's Guide"
     extends Modelica.Icons.Information;
@@ -3138,7 +3137,7 @@
 
       package MultiPhaseTwoLevel "Multi phase two level inverter example"
         extends Modelica.Icons.ExamplesPackage;
-        model MultiPhaseTwoLevel_R "Multi phase DC to AC converter with R load"
+                model MultiPhaseTwoLevel_R "Multi phase DC to AC converter with R load"
           import Modelica_Electrical_PowerConverters =
             Modelica.Electrical.PowerConverters;
           extends Modelica.Icons.Example;
@@ -3188,13 +3187,6 @@
             offset=fill(0.5, m),
             freqHz=fill(f1, m)) annotation (Placement(transformation(extent={{-30,
                     -64},{-50,-44}})));
-          Modelica.Blocks.Math.Harmonic fundamentalWaveCurrent[m](
-            each k=1,
-            each x0Cos=0,
-            each x0Sin=0,
-            each f=f1) annotation (Placement(transformation(
-                extent={{-10,-10},{10,10}},
-                origin={90,-50})));
           Modelica.Blocks.Math.Harmonic fundamentalWaveVoltage[m](
             each k=1,
             each x0Cos=0,
@@ -3213,6 +3205,11 @@
                 extent={{-10,-10},{10,10}},
                 rotation=270,
                 origin={40,-90})));
+          Modelica.Blocks.Math.Harmonic fundamentalWaveCurrent[m](
+            each k=1,
+            each x0Cos=0,
+            each x0Sin=0,
+            each f=f1) annotation (Placement(visible = true, transformation(origin = {90, -48}, extent = {{-10, -10}, {10, 10}}, rotation = 0)));
         equation
           connect(constantVoltage_p.n, constantVoltage_n.p) annotation (Line(
               points={{-70,40},{-70,20}},

Total time to process:

Notification: Performance of diffModelicaFileListings scan string 1: time 0.2323/0.2323, GC stats:
  heapsize_full: 295669760
  free_bytes_full: 77459456
  unmapped_bytes: 0
  bytes_allocd_since_gc: 8094992
  allocd_bytes_before_gc: 390124848
  non_gc_bytes: 0
  gc_no: 14
  markers_m1: 7
  bytes_reclaimed_since_gc: 83400400
  reclaimed_bytes_before_gc: 98056592
Notification: Performance of diffModelicaFileListings parse string 1: time 0.04121/0.2735, GC stats:
  heapsize_full: 295669760
  free_bytes_full: 54882304
  unmapped_bytes: 0
  bytes_allocd_since_gc: 35319632
  allocd_bytes_before_gc: 390124848
  non_gc_bytes: 0
  gc_no: 14
  markers_m1: 7
  bytes_reclaimed_since_gc: 88051184
  reclaimed_bytes_before_gc: 98056592
Notification: Performance of diffModelicaFileListings scan string 2: time 0.03661/0.3102, GC stats:
  heapsize_full: 295669760
  free_bytes_full: 41791488
  unmapped_bytes: 0
  bytes_allocd_since_gc: 48267808
  allocd_bytes_before_gc: 390124848
  non_gc_bytes: 0
  gc_no: 14
  markers_m1: 7
  bytes_reclaimed_since_gc: 88060032
  reclaimed_bytes_before_gc: 98056592
Notification: Performance of diffModelicaFileListings parse string 2: time 0.04889/0.3592, GC stats:
  heapsize_full: 295669760
  free_bytes_full: 13393920
  unmapped_bytes: 0
  bytes_allocd_since_gc: 76694832
  allocd_bytes_before_gc: 390124848
  non_gc_bytes: 0
  gc_no: 14
  markers_m1: 7
  bytes_reclaimed_since_gc: 88093776
  reclaimed_bytes_before_gc: 98056592
Notification: Performance of treeDiff: time 0.7631/1.122, GC stats:
  heapsize_full: 295669760
  free_bytes_full: 11444224
  unmapped_bytes: 0
  bytes_allocd_since_gc: 78706992
  allocd_bytes_before_gc: 390124848
  non_gc_bytes: 0
  gc_no: 14
  markers_m1: 7
  bytes_reclaimed_since_gc: 88176848
  reclaimed_bytes_before_gc: 98056592

So all garbage collection happens during parsing of MSL for the script. Still, requires 13 MB memory for lexing, 26 MB for parsing. The tree diff only required around 2 MB :)

comment:10 by Martin Sjölund, 9 years ago

I fixed updateComponent, so it no longer moves components when updated...

Latest stats:

Notification: Performance of diffModelicaFileListings scan string 1: time 0.02745/0.02745
Notification: Performance of diffModelicaFileListings parse string 1: time 0.07575/0.1033
Notification: Performance of diffModelicaFileListings scan string 2: time 0.281/0.3844
Notification: Performance of diffModelicaFileListings parse string 2: time 0.03475/0.4192
Notification: Performance of treeDiff: time 0.6675/1.087

Diff is now acceptable:

  • .mo

    old new  
    31923192            each k=1,
    31933193            each x0Cos=0,
    31943194            each x0Sin=0,
    3195             each f=f1) annotation (Placement(transformation(
    3196                 extent={{-10,-10},{10,10}},
    3197                 origin={90,-50})));
     3195            each f=f1) annotation (Placement(visible = true, transformation(origin = {90, -48}, extent = {{-666, -10}, {10, 10}}, rotation = 0)));
    31983196          Modelica.Blocks.Math.Harmonic fundamentalWaveVoltage[m](
    31993197            each k=1,
    32003198            each x0Cos=0,

comment:11 by Martin Sjölund, 9 years ago

Resolution: fixed
Status: newclosed
Summary: diffModelicaFileListings consumes huge memorydiffModelicaFileListings consumes huge time and memory

comment:12 by Martin Sjölund, 9 years ago

Milestone: 1.9.41.9.4-1.9.x

Milestone renamed

comment:13 by Martin Sjölund, 9 years ago

Milestone: 1.9.4-1.9.x1.9.4

Milestone renamed

Note: See TracTickets for help on using tickets.