Opened 4 years ago

Closed 4 years ago

#6202 closed defect (fixed)

Make the NF the default also from the command line

Reported by: Francesco Casella Owned by: Adrian Pop
Priority: blocker Milestone: 1.17.0
Component: New Instantiation Version:
Keywords: Cc: Per Östlund, Martin Sjölund, Adeel Asghar

Description

It's time to make the NF the default choice for compilation also from scripts and command line, without the need of setting "-d=newInst".

We should first try this out on master and see what is the outcome on the library testsuite, to assess if we can do this in 1.17.0 before feature freeze next week, see plan in #6145

Change History (27)

comment:1 by Adrian Pop, 4 years ago

Cc: Per Östlund added

@perost has already made a try:
https://github.com/OpenModelica/OpenModelica/pull/6902
452 failures on first try:
https://test.openmodelica.org/jenkins/blue/organizations/jenkins/OpenModelica/detail/PR-6902/1/tests

@perost is trying now again with some new changes.

comment:2 by Per Östlund, 4 years ago

I actually forgot to enable the newInst flag the first time, so the failures where only due to adding the cflags line to the tests which changed the line numbers in error messages. Turning on the flag caused only one more test to fail though (because it previously had an empty cflags entry, which I didn't account for). I will fix the other tests (it's only 226 tests since we test with both GCC and Clang) and push in the change for the testsuite without enabling the flag so at least that's done.

Turning the flag on revealed some other issues though. We currently run the compliance testsuite with both frontend, but turning the flag on runs it with the NF both times. There also seems to be some issues with the cross-build-fmu job, not sure if we want to fix that to run with the NF or just continue to use the OF.

in reply to:  2 comment:3 by Francesco Casella, 4 years ago

Replying to perost:

I actually forgot to enable the newInst flag the first time, so the failures where only due to adding the cflags line to the tests which changed the line numbers in error messages. Turning on the flag caused only one more test to fail though (because it previously had an empty cflags entry, which I didn't account for). I will fix the other tests (it's only 226 tests since we test with both GCC and Clang) and push in the change for the testsuite without enabling the flag so at least that's done.

Good!

Turning the flag on revealed some other issues though. We currently run the compliance testsuite with both frontend, but turning the flag on runs it with the NF both times.

Should we just stop running the compliance testsuite with the old frontend? It doesn't make much sense to keep on testing it, since we haven't been maintaning it for a long time. Doing so will only show that we have obsolete stuff around that has lots of issues.

There also seems to be some issues with the cross-build-fmu job, not sure if we want to fix that to run with the NF or just continue to use the OF.

I don't see any reason why we should run it with the OF, unless I am missing something I'd definitely fix it to run with the NF as well.

comment:4 by Per Östlund, 4 years ago

The testsuite is now updated so that all test cases explicitly say which frontend to use, so that we can turn on the NF by default without affecting the testsuite. We might want to transition some parts of the testsuite to use the NF later but that's now a separate discussion.

comment:5 by Francesco Casella, 4 years ago

I agree that having fine control of which FE is used in the testsuite can be a nice feature. But yet, if we want to change the default FE, we should of course test if using the NF instead of the OF has negative consequences. And we should do it everywhere, as long as we get separate reports for the different parts of the testsuite and there is no possibility for confusion.

So, I guess you should now make one commit where all tests in the testsuite switch to the NF, right away, so we can see what the status is.

What do you think?

in reply to:  5 comment:6 by Per Östlund, 4 years ago

Replying to casella:

So, I guess you should now make one commit where all tests in the testsuite switch to the NF, right away, so we can see what the status is.

What do you think?

All the old tests now explicitly use the OF, so that might be a bit complicated. But the whole point of setting the frontend explicitly in all the old tests was so that we can switch to the NF now and then migrate the old tests to use the NF at our own pace.

We already have a lot of tests for the NF, and also the library coverage tests, so migrating the old tests to use the NF is not really a high priority right now.

comment:7 by Francesco Casella, 4 years ago

I'm fine with keeping 'old tests' on the OF for a while. I just want to make sure we won't be shipping OMC 1.17.0 with default NF if we haven't tested that all its bits and pieces work fine. Not with old stuff, but with contemporary stuff.

comment:8 by Francesco Casella, 4 years ago

Cc: Martin Sjölund Adeel Asghar added

OK, I'll try to summarize and propose a plan.

We have a core testuite that is run on each PR, and several library testsuites that try out open source libraries with different code generation combinations. @perost, in comment:4 you write that

The testsuite is now updated so that all test cases explicitly say which frontend to use, so that we can turn on the NF by default without affecting the testsuite.

I guess you were referring to the core testsuite. In that case, I really think we should switch all those tests to the NF, either at once or in batches, it really makes no sense to continue testing the old one, which is obsolete, and not the new one, which is the one we are currently developing and maintaining (and possibly breaking if something goes wrong with a commit). I imagine this will cause a ton of false negatives because of spurious effects like equations changing the order of appearence in the flat model, so we'll need to spend some time fixing the reference test results, but we have to go through that at some point anyway.

Regarding the library testsuite, we run several jobs with specific settings

  • master: default settings
  • newInst: sets -d=newInst
  • daeMode: sets --daeMode=true
  • master-fmi: default settings, builds and runs FMUs
  • newInst-daeMode: sets both -d=newInst and -d=daeMode

The choice of C vs. C++ runtime and of optimizations is made individually for each test inside those jobs, i.e. some libraries are also tested with C++ or with, say -O0 instead of -O3. Maybe we should eventually have jobs that do that for the entire set of libraries, but we can do this later.

I haven't made any systematic comparisons of master vs. newInst recently on the entire library suite, but my impression on several important libraries like MSL, ModelicaTest, Buildings, PowerSystems, is that the NF has now a lot more success cases and a few (if any!) failures that work with the OF. I think we got rid of most, if not all, those cases a long time ago. Maybe we should check that once and for all.

I understand this can simply be done by making the NF the default, running those jobs once, and looking at the regression reports, e.g., this one, that will tell us if there is something failing on that occasion. Note that here we just compare the outcome of the process (success vs. fail), there are no detailed checks on the textual output that can go wrong as in the core testuite, which is good because we don't have to worry about false negatives. We should just make sure that we have a library testuite regression report that only contains the commit changing the default NF, we have to agree on a moratorioum of merging PRs for one day while we do that. @perost, you could take care of that.

Once that is done, @perost can start gradually moving the core testsuite tests to the NF (there's no hurry to do that, and you can do that in batches) and we should should change the jobs to all use the default NF, except one with the OF for reference:

  • master: default settings (i.e. NF turned on by default)
  • nonewInst: sets -d=nonewInst (so we can still check if something works with the OF, but not with the NF)
  • daeMode: sets --daeMode=true (with NF turned on by default)
  • master-fmi: default settings, builds and runs FMUs (with NF turned on by default)

The newInst and newInst-daeMode jobs should be ditched, since the "new" frontend is now the default one, there no need to refer to it explicitly anymore.

@sjoelund.se, I think you can take care of this part as soon as @perost has made the switch to default NF.

Last, but not least, @adeas31 should update OMEdit to be consistent with the new situation. If there are only a few or no models that run with the OF but not with the NF, we may remove the OF fallback option, which is currently a bit annoying since it now mostly gives a false impression that there is some problem with OMC, while the problem is actually your Modelica code being wrong. We can make a decision on that once we see the results of the library suite regression test.

Comments?

comment:9 by Martin Sjölund, 4 years ago

Which machine should be used? We can only use 1 machine if we want all the results in the same report.

comment:10 by Per Östlund, 4 years ago

My plan was to try and maintain the status quo for now, i.e. just making sure that all the jobs that currently uses the OF continue to do so after the switch. The testsuite has already been fixed so that it doesn't break when switching frontend, so I've just been waiting for someone to add -d=-newInst to the library and compliance testing jobs for the OF so we can switch without anything changing.

Once the switch is done we can start looking at renaming jobs and updating the testsuite to use the NF more, but I think it's better to do one thing at a time.

in reply to:  9 comment:11 by Francesco Casella, 4 years ago

Replying to sjoelund.se:

Which machine should be used? We can only use 1 machine if we want all the results in the same report.

That's up to you. Even if we get a few reports it's ok

in reply to:  10 ; comment:12 by Francesco Casella, 4 years ago

Replying to perost:

My plan was to try and maintain the status quo for now, i.e. just making sure that all the jobs that currently uses the OF continue to do so after the switch. The testsuite has already been fixed so that it doesn't break when switching frontend, so I've just been waiting for someone to add -d=-newInst to the library and compliance testing jobs for the OF so we can switch without anything changing.

@perost, I cannot follow you, it seems to me that you are asking for two contradictory things. On one hand you want that all jobs that use the OF continue to do so, on the other heand you want to add -d=-newInst to the jobs for the OF. I'm a bit lost.

I see two options. One is that you switch the default frontend to NF and we just run the same jobs and get regression test reports that tell us if something gets broken with that. The other one is that @sjoelund add -d=-newInst to the master, daeMode, and master-fmi jobs, that will have the same effect.

I don't really see the point to switch the default FE while forcing all test jobs to keep doing the same. I want to see what happens when we switch, and for that the jobs that used the OF must use the NF, so we get a regression test.

Do I miss something?

Once the switch is done we can start looking at renaming jobs and updating the testsuite to use the NF more, but I think it's better to do one thing at a time.

My proposal is to go through these steps

  1. change the default FE and get regression test reports)
  2. update the test jobs to test the OF only on master and remove obsolete jobs
  3. gradually move all the core testsuite tests to use the default FE (i.e., the NF)

in reply to:  12 ; comment:13 by Per Östlund, 4 years ago

Replying to casella:

@perost, I cannot follow you, it seems to me that you are asking for two contradictory things. On one hand you want that all jobs that use the OF continue to do so, on the other heand you want to add -d=-newInst to the jobs for the OF. I'm a bit lost.

-d=-newInst == -d=nonewInst :)

My proposal is to go through these steps

  1. change the default FE and get regression test reports)
  2. update the test jobs to test the OF only on master and remove obsolete jobs
  3. gradually move all the core testsuite tests to use the default FE (i.e., the NF)

That's fine, but to be able to merge the PR we need to at least fix the OF compliance testsuite job to continue using the OF. It's currently blocking the merge since it thinks there are failing test cases.

in reply to:  13 ; comment:14 by Francesco Casella, 4 years ago

Replying to perost:

-d=-newInst == -d=nonewInst :)

Aha. I don't think I did not understand what you didn't want to not do ;)

My proposal is to go through these steps

  1. change the default FE and get regression test reports)
  2. update the test jobs to test the OF only on master and remove obsolete jobs
  3. gradually move all the core testsuite tests to use the default FE (i.e., the NF)

That's fine, but to be able to merge the PR we need to at least fix the OF compliance testsuite job to continue using the OF. It's currently blocking the merge since it thinks there are failing test cases.

@sjoelund.se, could you please add "-d=nonewInst" to that job?

in reply to:  14 ; comment:15 by Martin Sjölund, 4 years ago

Replying to casella:

@sjoelund.se, could you please add "-d=nonewInst" to that job?

I'm not sure what you two want to do... We cannot rename master. And I would propose that if we change to NF by default then the current "master" job should simply be "master". So just update it, get one large report with diffs.

Then remove the newInst job and create an OF job in its place. Change daeMode to use -d=-newInst. And... that's it.

comment:16 by Per Östlund, 4 years ago

The new frontend has now been enabled by default without changing any of the Jenkins jobs, so if anything breaks or catches fire we'll just have to deal with it.

Last edited 4 years ago by Per Östlund (previous) (diff)

in reply to:  15 comment:17 by Francesco Casella, 4 years ago

Replying to sjoelund.se:

Replying to casella:

@sjoelund.se, could you please add "-d=nonewInst" to that job?

I'm not sure what you two want to do... We cannot rename master. And I would propose that if we change to NF by default then the current "master" job should simply be "master". So just update it, get one large report with diffs.

One thing is the library testsuite, I'm fine with getting the diffs (= regression test results), that's the point. The other thing is the core testuite, that should have been ran with "-d=nonewInst", to avoid the diffs that prevent merging the PR. But I guess @perost managed already.

Then remove the newInst job and create an OF job in its place. Change daeMode to use -d=-newInst.

We don't need daeMode with OF any more. Who cares about that. Everything with NF, only master with OF (=nonewInst) for reference. See comment:8.

in reply to:  16 ; comment:18 by Francesco Casella, 4 years ago

Replying to perost:

The new frontend has now been enabled by default without changing any of the Jenkins jobs, so if anything breaks or catches fire we'll just have to deal with it.

@perost, did you start the library testing jobs yourself? We don't want to wait for a few days that they are started automatically

in reply to:  18 ; comment:19 by Per Östlund, 4 years ago

Replying to casella:

Replying to perost:

The new frontend has now been enabled by default without changing any of the Jenkins jobs, so if anything breaks or catches fire we'll just have to deal with it.

@perost, did you start the library testing jobs yourself? We don't want to wait for a few days that they are started automatically

No, I don't know if I even have permission to do that. I assumed the daily job would take care of it as usual and give us a report tonight or tomorrow morning.

in reply to:  19 comment:20 by Francesco Casella, 4 years ago

Replying to perost:

Replying to casella:

@perost, did you start the library testing jobs yourself? We don't want to wait for a few days that they are started automatically

No, I don't know if I even have permission to do that.

I used to do that on Hudson, because I understood how it worked. I'm not really sure I do with Jenkins, I'll see what I can do.

comment:21 by Francesco Casella, 4 years ago

comment:22 by Francesco Casella, 4 years ago

OK, we have the first results here: 906 improvements and 690 regressions. The overall balance is positive, 216 more models doing better, but the number of regressions is quite large.

Good to know, at least we know where we stand.

Many regressions are repeated several times, because they are present in different versions of the same library.

Some regressions them in older libraries are actually ok, because they are caused by invalid code that has been fixed in more recent versions. For example, ModelicaTest(3.2.1).Media.TestOnly.DryAirNasa fails because of forbidden lookup in partial Medium. The ModelicaTest 3.2.3 library only has 7 regressions, while 3.2.2 has 18 regressions and 3.2.1 has 25. As long as there are newer, backwards compatible libraries available, I think we shouldn't care.

One regression is real and very commonplace, and it is about division by zero at initialization, see e.g. IBPSA.Fluid.Examples.FlowSystem.Simplified1 or ModelicaTest.Fluid.Dissipation.Verifications.HeatTransfer.HeatExchanger.kc_flatTube_KC. I guess these should be investigated, and solving that issue should fix a lot of failures.

I'll try to categorize the most common issues here, so @perost can fix them.

@perost, some later commits broke further models, see https://libraries.openmodelica.org/branches/history/master/2020-12-15%2023:23:53..2020-12-17%2000:28:56.html, in particular they broke PNLib almost completely.

Maybe at some point we should switch back and forth once more, so we can see an updated regression test, before we decide what we want to put in 1.17.0.

in reply to:  22 ; comment:23 by Per Östlund, 4 years ago

I fixed some of the issues this week, though many of the regressions are due to old libraries that aren't standard compliant. But now I'm going on vacation, so nothing will happen for the next three weeks or so.

Replying to casella:

One regression is real and very commonplace, and it is about division by zero at initialization, see e.g. IBPSA.Fluid.Examples.FlowSystem.Simplified1 or ModelicaTest.Fluid.Dissipation.Verifications.HeatTransfer.HeatExchanger.kc_flatTube_KC. I guess these should be investigated, and solving that issue should fix a lot of failures.

I opened a ticket about it yesterday, see #6293. While the proper solution might take some time to implement it shouldn't be much work to implement the partial solution that the OF uses.

I'll try to categorize the most common issues here, so @perost can fix them.

@perost, some later commits broke further models, see https://libraries.openmodelica.org/branches/history/master/2020-12-15%2023:23:53..2020-12-17%2000:28:56.html, in particular they broke PNLib almost completely.

Already fixed, the coverage test just hasn't had time to run again.

in reply to:  23 ; comment:24 by Francesco Casella, 4 years ago

Replying to perost:

I fixed some of the issues this week, though many of the regressions are due to old libraries that aren't standard compliant.

Yeah. I'll see if I can sort them out.

But now I'm going on vacation, so nothing will happen for the next three weeks or so.

I already gave that for granted, thanks for your reply before you left :)

I opened a ticket about it yesterday, see #6293. While the proper solution might take some time to implement it shouldn't be much work to implement the partial solution that the OF uses.

Aha, I missed the connection, sorry. Sounds good.

Already fixed, the coverage test just hasn't had time to run again.

Very good. Enjoy your holidays!

in reply to:  24 comment:25 by Francesco Casella, 4 years ago

Replying to casella:

Already fixed, the coverage test just hasn't had time to run again.

Most of those regressions were fixed, but more popped up, see report.

I just opened #6302 on that.

comment:26 by Francesco Casella, 4 years ago

This report shows the effect of switching from OF to NF as of Jan 20 2021. Most regressions appear to be related to old libraries with invalid code (e.g. missing 'each'), though there are some residual issues, e.g. #6203, which is also part of the LBL DFD.

comment:27 by Francesco Casella, 4 years ago

Resolution: fixed
Status: newclosed

The testing infrastructure has also been updated, master now refers to NF and oldInst refers to the OF. We can go ahead with the decision.

Note: See TracTickets for help on using tickets.