I made these changes while working on https://phabricator.mesosphere.com/D263 . It’s an attempt to clarify the relationship between Condition and MesosTaskState, and use the type system more to describe this relationship.
Details
Diff Detail
- Repository
- rMARATHON marathon
- Lint
Automatic diff as part of commit; lint not applicable. - Unit
Automatic diff as part of commit; unit tests not applicable.
Nice work! It's much cleaner. Overall we should have a different type for task states and instance conditions.
Also, a few weeks ago I've created a state machine to illustrate the state of an instance https://github.com/mesosphere/marathon/blob/master/task_status_fsm.svg. This might illustrate the terminal states better.
src/main/scala/mesosphere/marathon/core/condition/Condition.scala | ||
---|---|---|
90 | Funny, it should actually be the other around. UnreachableInactive is lost and Unreachable is not. Then again it depends on how we interpret lost and what actions we take. Ideally we don't use the term at all any more since there used to be a TASK_LOST state which is gone with Mesos 1.1 and Marathon 1.4. | |
91–92 | This is also a terminal state. | |
92–116 | This is a terminal state. | |
src/main/scala/mesosphere/marathon/core/task/termination/impl/InstanceChangedPredicates.scala | ||
7–12 ↗ | (On Diff #1039) | Why not use just Terminal? We seem to be inconsistent here. |
src/main/scala/mesosphere/marathon/core/condition/Condition.scala | ||
---|---|---|
18 | I'd rather not have this as a member of condition after a second thought. Having it defined here implies that there's a correct conversion from condition to String. However, as discussed, this should be evaluated based on the given context (publishing an event might be different from serializing via the API e.g.) Even more since this def has type String; it's not clear about the intended usage. D263 introduces a conversion specific to the InstanceChangedEventsGenerator for that use case. I see two remaining usages of toReadableName: Formats (TaskWrites) and PodInstanceConversion. Maybe you can change these to specific conversions as well? WDYT about a Map[Condition->TaskState] or Map[Condition->String]? (It's easier to forget about added condition, but we can work around that by testing via pattern matches.) | |
90 | I agree with @jeschkies: Please don't use the term lost – and I think an extra trait is not needed. The semantics of Unreachable and UnreachableInactive depend on the context as well: when scaling, Unreachable is still considered Active while instances in UnreachableInactive are subject to be replaced. When performing health checks or readiness checks, both will be skipped. | |
src/main/scala/mesosphere/marathon/core/task/termination/impl/InstanceChangedPredicates.scala | ||
7–12 ↗ | (On Diff #1039) | The KillService is used e.g. when stopping an application. If we don't consider Unreachable and Reserved as terminal in this context, a StopApplication deployment will not finish for a long time. @timcharper can you add a comment to this def explaining why we don't use Terminal or Active? |
src/main/scala/mesosphere/marathon/state/TaskFailure.scala | ||
124–138 | Note the TODO in HistoryActor: that currently extracts TaskFailures from both MesosTaskStatusUpdateEvent and InstanceChanged. I'd like to get rid of internal MesosTaskStatusUpdateEvent consumers, but I'm also not a big fan of condition.mesosTaskState. Maybe we need an optional trigger in InstanceChanged like we have in InstanceChangedEventsGenerator? That'd be a bigger change though. |
Build has FAILED
Link to build: https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/510/
See console output for more information: https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/510/console
Build has FAILED
Link to build: https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/511/
See console output for more information: https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/511/console
src/main/scala/mesosphere/marathon/core/task/termination/impl/InstanceChangedPredicates.scala | ||
---|---|---|
13–14 ↗ | (On Diff #1080) | Could you create issues for these TODOs? I'm afraid we forget otherwise. |
src/main/scala/mesosphere/marathon/core/condition/Condition.scala | ||
---|---|---|
18 | I don’t believe we should implement one for Condition -> String since toReadableName has been expected to return a mesos.TaskState string all along. https://github.com/mesosphere/marathon/blob/2cda52a3cc9735a735755910ed9222263c108884/src/main/scala/mesosphere/marathon/state/TaskFailure.scala#L119 |
src/main/scala/mesosphere/marathon/core/condition/Condition.scala | ||
---|---|---|
92–116 | Failure implies Terminal. |
src/main/scala/mesosphere/marathon/core/task/termination/impl/InstanceChangedPredicates.scala | ||
---|---|---|
13–14 ↗ | (On Diff #1080) | I will |
Build has FAILED
Link to build: https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/549/
See console output for more information: https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/549/console
Github issue for Condition suspended TODO recorded: https://github.com/mesosphere/marathon/issues/4789
Build is green https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/552/ for more details.
@timcharper Thanks! One question regarding a comment - otherwise LGTM.
src/main/scala/mesosphere/marathon/core/instance/update/InstanceChangedEventsGenerator.scala | ||
---|---|---|
40 | Is this a todo, left over or clarification? | |
src/main/scala/mesosphere/marathon/state/TaskFailure.scala | ||
124–138 | @meichstedt not part of this PR: Since we emit InstanceChange and MesosStatusUpdateEvent for a failing task, the HistoryActor will store the same error twice, or? |
src/main/scala/mesosphere/marathon/core/condition/Condition.scala | ||
---|---|---|
92–116 | Ah, missed that. |
Thanks!
src/main/scala/mesosphere/marathon/state/TaskFailure.scala | ||
---|---|---|
124–138 | @aquamatthias yes, I'm working on cleaning this up |
src/main/scala/mesosphere/marathon/core/instance/update/InstanceChangedEventsGenerator.scala | ||
---|---|---|
40 | clarification |
I'd rather not have this as a member of condition after a second thought. Having it defined here implies that there's a correct conversion from condition to String. However, as discussed, this should be evaluated based on the given context (publishing an event might be different from serializing via the API e.g.)
Even more since this def has type String; it's not clear about the intended usage.
D263 introduces a conversion specific to the InstanceChangedEventsGenerator for that use case. I see two remaining usages of toReadableName: Formats (TaskWrites) and PodInstanceConversion. Maybe you can change these to specific conversions as well? WDYT about a Map[Condition->TaskState] or Map[Condition->String]? (It's easier to forget about added condition, but we can work around that by testing via pattern matches.)