HomeMesosphereNo notifications. 4 unresolved issues.

Exhaustive constraint migration
ClosedAll Users

Authored by ichernetsky on Aug 11 2017, 1:36 AM.

Details

Summary

The existing migration code doesn't take into account all the possible ways
constraints created using prior Marathon versions, can be invalid. This diff
fixes it.

Loosening CLUSTER constraint because it was improperly tightened;
a value-less cluster constraint causes all tasks to be placed on the same
host as the first task.

Test Plan

sbt test

Diff Detail

Repository
rMARATHON marathon
Branch
ic/1.3-constraint-migration
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 3281
Build 6260: Marathon (revised)Jenkins
Build 6259: arc lint + arc unit
ichernetsky created this revision.Aug 11 2017, 1:36 AM
jdef requested changes to this revision.Aug 11 2017, 3:17 PM
jdef added inline comments.
src/main/scala/mesosphere/marathon/state/Migration.scala
628

why the mixture of info/warn levels? shouldn't we just pick a level and be consistent?

631

latest docs say that CLUSTER doesn't require a value. code seems to agree: https://github.com/mesosphere/marathon/blob/releases/1.3/src/main/scala/mesosphere/mesos/Constraints.scala#L112

673

CLUSTER does not require a value

src/test/scala/mesosphere/marathon/state/MigrationTo_1_3_5Test.scala
53

CLUSTER does not require value

This revision now requires changes to proceed.Aug 11 2017, 3:17 PM
ichernetsky added inline comments.Aug 11 2017, 5:26 PM
src/main/scala/mesosphere/marathon/state/Migration.scala
628

I chose infofor constraints which need to be slightly adjusted (like removing an unused value). And I think warn is more appropriate when constraints get dropped, because they are invalid.

631

It depends on where to look for CLUSTER validation. In 1.3 it is considered to be invalid according to https://github.com/mesosphere/marathon/blob/releases/1.3/src/main/scala/mesosphere/marathon/state/AppDefinition.scala#L717 But I think this rule can be relaxed, so that a value is not required, as it is currently on 1.4 and master branches. What do you think?

673

Please refer to my response above.

src/test/scala/mesosphere/marathon/state/MigrationTo_1_3_5Test.scala
53

Please refer to my response above.

ichernetsky requested review of this revision.Aug 11 2017, 5:26 PM
jdef added inline comments.Aug 11 2017, 5:34 PM
src/main/scala/mesosphere/marathon/state/Migration.scala
628

let's pick one level and stick with it. I propose that any change to a user object be logged as a warning

631
ichernetsky updated this revision to Diff 3998.Aug 11 2017, 6:07 PM
  • Address @jdef's comments
ichernetsky updated this revision to Diff 3999.Aug 11 2017, 6:12 PM
  • CLUSTER constraint doesn't require a value
ichernetsky marked 4 inline comments as done.Aug 11 2017, 6:13 PM
kensipe requested changes to this revision.Aug 11 2017, 6:33 PM
kensipe added inline comments.
src/main/scala/mesosphere/marathon/state/Migration.scala
611

why are we turning format and scalastyle on and back off back to back? perhaps you are wanting to keep it as a function declaration but these macros are file level... we should remove lines 607-611

This revision now requires changes to proceed.Aug 11 2017, 6:33 PM
jdef added a comment.Aug 11 2017, 7:17 PM

also wondering .. for people that have already migrated to 1.4 ... when they migrate to 1.5 they won't go through the constraint sanitation here (because this patch is written against the 1.3.x branch).
should the changes in this patch for constraint sanitation in migration be forward-ported to 1.4.x in order to facilitate a smoother upgrade to 1.5? (if so, it implies that we want users to be on the latest, patched 1.4.x release before upgrading to 1.5 in order to pick up the constraint sanitization)

ichernetsky added inline comments.Aug 11 2017, 7:26 PM
src/main/scala/mesosphere/marathon/state/Migration.scala
611

I explicitly disable them where I need them disabled. They don't need to be disabled in between the methods. Especially, I would like to avoid a situation when a new function is added in between them later on without making sure that the effect of this format/scalatyle directives is limited to places in the code where they should be.

@jdef, as far as I can tell, upgrades skipping any version are not supported. This is what I heard from Matthias. @meichstedt, could you please clarify it here?

jdef added a comment.Aug 11 2017, 8:38 PM

I'm specifically talking about the case where a user has already upgraded to 1.4.x, prior to this change landing in 1.3.x; they won't see the sanitation this patch introduces because it targets 1.3, right?

That's correct. Users of 1.1 are supposed to upgrade to 1.3 first, before upgrading to 1.4 Users of 1.4 do not need this sanitation, because the constraint validation in it is much better.

zen-dog requested changes to this revision.Aug 15 2017, 1:38 PM

We shouldn't relax the CLUSTER constraint in this diff. Otherwise lgtm

src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

I don't think we should mix migration with relaxation of the CLUSTER constraint. Especially since 1.4 still require CLUSTER to have a value.
Do it in an extra diff if you want.

ichernetsky added inline comments.Aug 15 2017, 5:39 PM
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

Good point. I can adjust the 1.4 in this regard, but on the other hand, I won't be always the case that a user upgrades the latest 1.3 version to the latest 1.4. Therefore I should bring the CLUSTER migration logic that was there initially. What do you think, @jdef?

zen-dog added inline comments.Aug 15 2017, 6:20 PM
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

I remember our upgrade policy being currently to upgrade from 1.3.x to 1.4x to 1.5.. where x is always the latest one. I think relaxing CLUSTER constraints should be done in 1.4 and backported to 1.3 in a separate effort.

ichernetsky requested review of this revision.Aug 15 2017, 7:52 PM
ichernetsky marked an inline comment as done.
ichernetsky added inline comments.
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

Lgtm provided you revert the changes that are done in D986.

src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

Cool. Revert this change here then?

src/test/scala/mesosphere/marathon/api/validation/RunSpecValidatorTest.scala
575

... and here?

ichernetsky marked an inline comment as done.Aug 16 2017, 8:00 PM
ichernetsky added inline comments.
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

I am not sure why, if the changes in D986 are for 1.4, and these changes are for 1.3.

zen-dog added inline comments.Aug 17 2017, 10:30 AM
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

Because this diff conflates 2 independent problems (migration and CLUSTER constraint) and the second one is not even mentioned in the diff description. I know it's more work this way but one diff should solve one problem.

ichernetsky added inline comments.Aug 21 2017, 8:26 PM
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

I absolutely disagree that they are independent, because if not relaxing the constraint on CLUSTER,there should be additional clauses in the corresponding function to do CLUSTER constraint migration. Which means, that even if another diff to relax the constraint is created, the migration function would have to be adjusted too to take into account the constraint relaxation. And this proves that they are not independent. In addition to that it will require re-testing of the current diff, and testing a new diff, => at least 2 hours of additional work for no obvious to me reason.

zen-dog accepted this revision.Aug 22 2017, 12:05 PM
zen-dog added inline comments.
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

I see. Ok then.

jdef resigned from this revision.Aug 23 2017, 9:30 PM
This revision is now accepted and ready to land.Aug 23 2017, 10:32 PM
timcharper added inline comments.
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

It isn't clear to me at all why we are relaxing this constraint; why not just solve it with a migration?

src/main/scala/mesosphere/marathon/state/Migration.scala
594

How do we currently interpret "*"?

611

I question the value of the cyclomatic complexity checker; it is too sensitive for functions with large pattern matching statements

timcharper accepted this revision.Aug 23 2017, 11:01 PM
timcharper added inline comments.
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716
/Users/tim/src/m8e/marathon-1.4/src/main/scala/mesosphere/mesos/Constraints.scala

  97 |         case Operator.CLUSTER =>
  98 |           // Hostname must match or be empty
  99 |           (value.isEmpty || value == offer.getHostname) &&
 100 |             // All running tasks must have the same hostname as the one in the offer
 101 |             allPlaced.forall(_.hostname == offer.getHostname)
 102 |         case _ => false

Oh geez it actually did have a meaning!!! 😨

ichernetsky edited the summary of this revision. (Show Details)Aug 23 2017, 11:05 PM
ichernetsky added inline comments.Aug 23 2017, 11:09 PM
src/main/scala/mesosphere/marathon/state/AppDefinition.scala
716

Yeah, it works this way in all Marathon versions starting from 1.3 onwards

src/main/scala/mesosphere/marathon/state/Migration.scala
594

As an improper value. LIKE and UNLIKE operators must have regexes as their values.

ichernetsky closed this revision.Aug 23 2017, 11:25 PM