Opened 7 years ago

Closed 7 years ago

#4574 closed defect (fixed)

Reduce storage space for variables that are not time-varying

Reported by: Martin Sjölund
Owned by: Lennart Ochel
Priority: high
Milestone: 1.13.0
Component: Backend
Version:
Keywords:
Cc:

Description

Currently, variables that are not time-varying are detected and the equation is executed only once. However, the result is put into the data_2 matrix in the mat result-file, which is then unnecessarily large.
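
To give a rough sense of the overhead described above, here is a back-of-the-envelope sketch in Python. It assumes 8-byte doubles, one value per stored time point for each variable in data_2, and two values (start and stop) per variable in the Dymola-style data_1. The function names and the variable and step counts are made up for illustration and are not taken from the ticket.

```python
BYTES_PER_DOUBLE = 8  # assumption: results are stored as 64-bit doubles


def data_2_bytes(n_vars: int, n_steps: int) -> int:
    """Storage for n_vars variables written at every one of n_steps time points."""
    return n_vars * n_steps * BYTES_PER_DOUBLE


def data_1_bytes(n_vars: int, n_columns: int = 2) -> int:
    """Storage for n_vars constant variables in data_1 (n_columns values each)."""
    return n_vars * n_columns * BYTES_PER_DOUBLE


# Hypothetical model: 10,000 non-time-varying variables, 5,000 output points.
n_const, n_steps = 10_000, 5_000
print(data_2_bytes(n_const, n_steps))  # 400000000 bytes if stored in data_2
print(data_1_bytes(n_const))           # 160000 bytes if stored in data_1
```

Under these assumptions the cost of keeping constant variables in data_2 grows with the number of output points, while the cost in data_1 does not.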

There is a test case in ticket:4569#comment:4

Change History (13)

comment:1 by Martin Sjölund, 7 years ago

Status: new → assigned

comment:2 by massimo ceraolo, 7 years ago

BTW, a small question.
I understand that data_1 is for quantities that do not vary with time, so a single number per variable is sufficient. So why two rows?

comment:3 by Martin Sjölund, 7 years ago

That has to do with copying the old Dymola mat-format where the start/stop-Time was stored for the "time" variable in the data_1 matrix. I agree it is pretty stupid and we can probably remove the second column (OM works fine loading those result-files already).

comment:4 by Martin Sjölund, 7 years ago

We now don't store start/stop-time in the data_1 matrix and store it in only 1 column.

comment:5 by massimo ceraolo, 7 years ago

Ah, good to know.

I learned the structure of the file from Dymola's txt output, which has the same structure as the binary format, with a few obvious modifications (and rows and columns swapped).

The explanation of data_1 was not clear there.
Now everything makes sense: Dymola's data_1 has two columns (rows in txt), but they are not both really needed, and thus OM made the decision you described in comment 4.
Thank you for the explanation.

in reply to:  4 ; comment:6 by Lennart Ochel, 7 years ago

Replying to sjoelund.se:

That has to do with copying the old Dymola mat-format where the start/stop-Time was stored for the "time" variable in the data_1 matrix. I agree it is pretty stupid and we can probably remove the second column (OM works fine loading those result-files already).

Replying to sjoelund.se:

We now don't store start/stop-time in the data_1 matrix and store it in only 1 column.

This may cause compatibility issues with tools that import mat files for the benefit of an insignificant reduction of the file size. Is it really worth it?


comment:7 by Lennart Ochel, 7 years ago

Status: assigned → accepted

in reply to:  6 comment:8 by Lennart Ochel, 7 years ago

Replying to lochel:

This may cause compatibility issues with tools that import mat files for the benefit of an insignificant reduction of the file size. Is it really worth it?

e.g. Dymola 2017 FD01

comment:9 by Francesco Casella, 7 years ago

Even though the .mat format is a de-facto standard introduced by a commercial tool, I think it is of the utmost importance that, in the absence of a good standardized alternative, we keep this format readable by other tools and do not introduce omc-specific changes.

That said, I just ran this model

model M
  parameter Real p = 3;
  Real x = sin(p*time);
end M;

with v1.13.0-dev-155-g68350e9, and I was able to open the .mat file with Dymola 2017 FD01 despite the fact that data_1 has one row only. So I guess the compatibility issue isn't really there.

Once more, too bad that this is really not standardized and that the MA was not able to come up with a decent proposal for a standardized file output format in all these years. Maybe we should try harder.

in reply to:  9 comment:10 by Lennart Ochel, 7 years ago

Replying to casella:

That said, I just ran this model

model M
  parameter Real p = 3;
  Real x = sin(p*time);
end M;

If you open the result file using Dymola 2017 FD01, then the stopTime is interpreted as 3 which means that all signals are scaled to a timeframe of [0, 3].
Other tools are probably not able to read the changed format at all (e.g. OMSimulator).

in reply to:  9 comment:11 by massimo ceraolo, 7 years ago

Once more, too bad that this is really not standardized and that the MA was not able to come up with a decent proposal for a standardized file output format in all these years. Maybe we should try harder.

A small contribution to this subject.
The current de-facto standard is not bad, except that it is not written in a document.
Good things:

  • the matlab v4 format is very simple, fast, and does not require linking to external libraries to read and write
  • arrays contain a header specifying the name (a string), the number of rows and columns, and other relevant info. This should in principle even allow adding specific arrays (with specific names) without breaking readability by other tools or backward compatibility.
  • there is a large base of simulation output already written with this format.

Therefore, maybe just making this format "official" (possibly with limited backwards-compatible enhancements) would be OK.
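
To illustrate the points above about simplicity and self-describing array headers, here is a minimal Python sketch that writes and reads matrices using only the standard library. It is based on my understanding of the MATLAB Level 4 MAT-file layout (five 32-bit integer header fields, then the NUL-terminated name, then column-major data) and handles only real, little-endian, double-precision matrices. The helper names, the `my_extra` array and the sample values are hypothetical; this is not the code used by OpenModelica or Dymola.

```python
import io
import struct


def write_v4_matrix(f, name, rows, cols, column_major_values):
    """Write one full, real, double-precision matrix in little-endian v4 layout."""
    raw_name = name.encode("ascii") + b"\0"  # the name is stored NUL-terminated
    # Header fields: type (0 = little-endian, double, full numeric),
    # mrows, ncols, imagf (0 = no imaginary part), namlen.
    f.write(struct.pack("<5i", 0, rows, cols, 0, len(raw_name)))
    f.write(raw_name)
    f.write(struct.pack(f"<{rows * cols}d", *column_major_values))


def read_v4_matrix(f):
    """Read one matrix; returns (name, rows, cols, values) in column-major order."""
    mopt, rows, cols, imagf, namlen = struct.unpack("<5i", f.read(20))
    if mopt != 0 or imagf != 0:
        raise ValueError("this sketch only handles real little-endian doubles")
    name = f.read(namlen).rstrip(b"\0").decode("ascii")
    values = struct.unpack(f"<{rows * cols}d", f.read(8 * rows * cols))
    return name, rows, cols, values


# Round trip through an in-memory buffer: a known array plus an extra named one.
buf = io.BytesIO()
write_v4_matrix(buf, "data_1", 2, 2, [0.0, 3.0, 1.0, 3.0])
write_v4_matrix(buf, "my_extra", 1, 1, [42.0])
buf.seek(0)
print(read_v4_matrix(buf))
print(read_v4_matrix(buf))
```

Because each array carries its own name and dimensions, a reader can in principle skip arrays it does not recognise, which is what would make extra named arrays a backward-compatible extension.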

BTW, Dymola adds units of measure to the "description" array, while OM currently does not. Ticket #4582 is considering modifying OM so that it can also provide unit info. That would surely be a backward-compatible change, since it would mimic Dymola's approach.

comment:12 by Lennart Ochel, 7 years ago

Actually, OMCompiler#1952 is using the old data_1 format. We could make the new slim data_1 format optional if required. However, I think that is not needed, because it doesn’t make much of a difference. I propose to discuss this in the next OSMC meeting on Monday.

in reply to:  12 comment:13 by Lennart Ochel, 7 years ago

Resolution: fixed
Status: accepted → closed

Replying to lochel:

Actually, OMCompiler#1952 is using the old data_1 format. We could make the new slim data_1 format optional if required. However, I think that is not needed, because it doesn’t make much of a difference. I propose to discuss this in the next OSMC meeting on Monday.

Conclusion of the discussion on Monday: We keep the old (and not well designed) format for full backwards compatibility.
