
Do not kill terminated tasks of reserved instances.
Abandoned · All Users

Authored by jeschkies on Jul 14 2017, 3:39 PM.

Details

Summary

The cleanup for ResidentTaskIntegrationTest would sometimes fail
because Marathon did not free all reserved resources. It seems the
code ended up in a loop trying to delete an already failed task.
Resident tasks can be in a failed state when they are restarted too fast.
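
To illustrate the intent, here is a minimal sketch of skipping kill requests for tasks that are already terminal. It is not the code from this diff: the `Condition`, `Task`, and `Instance` types and both helper functions are hypothetical simplifications, not Marathon's actual model.

```scala
object ResidentCleanupSketch {
  // Hypothetical, simplified task conditions; not Marathon's actual model.
  sealed trait Condition { def isTerminal: Boolean }
  case object Running extends Condition { val isTerminal = false }
  case object Staging extends Condition { val isTerminal = false }
  case object Failed extends Condition { val isTerminal = true }
  case object Finished extends Condition { val isTerminal = true }

  final case class Task(id: String, condition: Condition)
  final case class Instance(id: String, reserved: Boolean, tasks: Seq[Task])

  /** Tasks that still need a kill request: only those not already terminal. */
  def tasksToKill(instance: Instance): Seq[Task] =
    instance.tasks.filterNot(_.condition.isTerminal)

  /** A reserved instance with no tasks left to kill can move on to releasing its reservation. */
  def readyToUnreserve(instance: Instance): Boolean =
    instance.reserved && tasksToKill(instance).isEmpty
}
```

Under these assumptions, an instance whose only task is `Failed` yields an empty kill list, so the cleanup has nothing to retry and can proceed to unreserving.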

This change also removes the KillStreamWatcher since KillServiceActor is
already watching the event bus for state changes.
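
As a rough sketch of why a single event-bus subscriber can be enough, the tracker below completes a future once every watched task has been reported terminal. The `TaskStateChanged` event and the `KillTracker` class are hypothetical; they do not reproduce KillServiceActor or Marathon's real event classes.

```scala
import scala.concurrent.{Future, Promise}

object KillTrackerSketch {
  // Hypothetical event type; Marathon's real event classes differ.
  final case class TaskStateChanged(taskId: String, terminal: Boolean)

  /** Tracks a set of task IDs and completes `done` once each one has been reported terminal. */
  final class KillTracker(initial: Set[String]) {
    private var pending: Set[String] = initial
    private val promise = Promise[Unit]()
    // Nothing to wait for if the set starts out empty.
    if (pending.isEmpty) promise.trySuccess(())

    def done: Future[Unit] = promise.future

    // Called for every event seen on the bus; events for unknown tasks are ignored.
    def onEvent(event: TaskStateChanged): Unit = synchronized {
      if (event.terminal && pending.contains(event.taskId)) {
        pending -= event.taskId
        if (pending.isEmpty) promise.trySuccess(())
      }
    }
  }
}
```

Because the same subscriber already sees every state change, a second watcher over the same events would only duplicate this bookkeeping.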

Test Plan

Pipeline. A loop run is coming up.

Diff Detail

Repository
rMARATHON marathon
Branch
karsten/resident-task-cleanup
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 3196
Build 6095: Marathon (revised) Jenkins
Build 6094: arc lint + arc unit

Unit Tests: Failed

Time          Test
297,973 ms    mesosphere.marathon.integration.ResidentTaskIntegrationTest::ResidentTaskIntegrationTest should Scale Down
              org.scalatest.exceptions.TestFailedException: 10 was not equal to 5
306,671 ms    mesosphere.marathon.integration.DeleteAppAndBackupIntegrationTest::(not a test)
              org.scalatest.concurrent.Futures$FutureConcept$$anon$1: A timeout occurred waiting for a future to complete. Queried 19692 times, sleeping 15 milliseconds between each query.
305,144 ms    mesosphere.marathon.integration.GroupDeployIntegrationTest::(not a test)
              org.scalatest.concurrent.Futures$FutureConcept$$anon$1: A timeout occurred waiting for a future to complete. Queried 19340 times, sleeping 15 milliseconds between each query.
685,665 ms    mesosphere.marathon.integration.ResidentTaskIntegrationTest::(not a test)
              org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 37 times over 30.20466503 seconds. Last failure message: clean slate in Mesos not satisfied.
460 ms        mesosphere.marathon.integration.AppDeployIntegrationTest::AppDeploy should Docker info is not automagically created
Full test results: 1 Failed · 3 Broken · 90 Passed
jeschkies updated this revision to Diff 3888. Aug 1 2017, 1:06 PM
  • Rebase
jenkins requested changes to this revision. Aug 1 2017, 1:09 PM
This revision now requires changes to proceed. Aug 1 2017, 1:09 PM
✗ Build of 3888 failed jenkins-public-marathon-phabricator-628.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

jeschkies abandoned this revision. Aug 17 2017, 4:50 PM

The issue is most probably in the revive actor for offers that should throw out reservations. See JIRA MARATHON-7338 ResidentTaskIntegrationTest Clean Up Is Stuck.