HomeMesosphereNo notifications. 4 unresolved issues.

Consider Reserved instances for validating constraints.
ClosedAll Users

Authored by meichstedt on Aug 18 2017, 7:10 PM.

Details

Summary

When matching offers for resident tasks, Marathon would not consider instances in state Reserved when validating constraints. This potentially resulted in multiple reservations on the same agent, even if a hostname: unique constraint was applied to the run spec. The problem boils down to the fact that the ResourceMatcher filtered for instance.isLaunched and only considered those instances. This was accidentally introduced via the fact that previously, instances which weren't launched had no runSpecVersion and therefore could not be compared to the last config change.

Compare 3bf03f86, which needed to access _.launched to compare the version. Deriving the need to check for isLaunched later was a wrong assumption.

While at it, StrictLogging has been introduced and unnecessary if(xyzEnabled) statements have been removed.

Added two unit tests for offer matching with unique constraints

Test Plan

sbt test

Diff Detail

Repository
rMARATHON marathon
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
meichstedt created this revision.Aug 18 2017, 7:10 PM
jenkins requested changes to this revision.Aug 18 2017, 7:14 PM
This revision now requires changes to proceed.Aug 18 2017, 7:14 PM
meichstedt edited the summary of this revision. (Show Details)Aug 18 2017, 7:17 PM
kensipe accepted this revision.Aug 18 2017, 7:21 PM

wow... if we take away the logger updates and the variable name change... It looks like this is a 1 liner. nice work!
Waiting for tests... code lgtm

meichstedt updated this revision to Diff 4073.Aug 18 2017, 7:24 PM

clarifying test names

jenkins requested changes to this revision.Aug 18 2017, 7:25 PM
This revision now requires changes to proceed.Aug 18 2017, 7:25 PM
✗ Build of 4072 failed jenkins-public-marathon-phabricator-813.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

meichstedt updated this revision to Diff 4074.Aug 18 2017, 7:26 PM

clarifying test names

jenkins requested changes to this revision.Aug 18 2017, 7:30 PM
This revision now requires changes to proceed.Aug 18 2017, 7:30 PM
✗ Build of 4074 failed jenkins-public-marathon-phabricator-815.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

✗ Build of 4073 failed jenkins-public-marathon-phabricator-814.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

✗ Build of 4074 failed jenkins-public-marathon-phabricator-817.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

mesosphere.marathon.integration.ResidentTaskIntegrationTest.ResidentTaskIntegrationTest should resident task can be deployed along with constraints

might actually be a regression, need to investigate.

✗ Build of 4074 failed jenkins-public-marathon-phabricator-818.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

meichstedt updated this revision to Diff 4076.Aug 18 2017, 10:36 PM

fixed an embarrassing bug that has been there for ages.

jenkins requested changes to this revision.Aug 18 2017, 10:44 PM
This revision now requires changes to proceed.Aug 18 2017, 10:44 PM
✗ Build of 4074 failed jenkins-public-marathon-phabricator-819.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

✗ Build of 4076 failed jenkins-public-marathon-phabricator-820.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

jenkins accepted this revision.Aug 19 2017, 1:27 AM
This revision is now accepted and ready to land.Aug 19 2017, 1:27 AM
This revision is now accepted and ready to land.Aug 19 2017, 1:27 AM
✔ Build of 4076 completed jenkins-public-marathon-phabricator-821.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

\\ ٩( ᐛ )و //

timcharper accepted this revision.Aug 21 2017, 10:32 AM
timcharper added inline comments.Aug 21 2017, 10:34 AM
src/main/scala/mesosphere/mesos/ResourceMatcher.scala
179

wondering why did we have the .isLaunched filter here before?

This revision was automatically updated to reflect the committed changes.

I've confirmed locally that this fixes the bug