"Simplify" means here basically two things: 1) reduce the number of states and their transitions, and 2) fail-fast instead of doing complex state-machine logic
Details
- Reviewers
timcharper meichstedt jeschkies zen-dog kensipe unterstein jdef jenkins - Commits
- rMARATHONc85639f12298: Simplify leader election code
rMARATHON8cd913e203f4: Update a few comments per Aleksey's request
rMARATHON0ebe9270cd0f: Simplify leader election code
rMARATHON6b2c45b12adb: Simplify leader election code
rMARATHONa67a7e44339d: Simplify leader election code - JIRA Issues
- JIRA MARATHON-4083 Provide and use crashing strategy interface
JIRA MARATHON-2008 Simplify leadership election code
Manual testing of various failure scenarious + run all kinds of automated testing
Diff Detail
- Repository
- rMARATHON marathon
- Lint
Automatic diff as part of commit; lint not applicable. - Unit
Automatic diff as part of commit; unit tests not applicable.
@zen-dog I think we might have some further simplification potential here; what would you think if we continued to boil this down to it's essence before embarking on effort to fix? @ichernetsky and I are going to talk on Tuesday
Also, what do you think if we land the patch as is? There is substantial improvement already and the history for this one is getting quite long.
I support merging this is and any additional enhancement coming from another PR.
Long review and lgtm
@zen-dog, I added a few words to ElectionService's Scaladoc comment. And since we don't have an (explicit) FSM anymore, I think there is not much to add at this point.
@timcharper, I've restored CuratorElectionServiceTest. Previously most of the state transitions/method invocations were tested using ElectionServiceBase. Now a real ZK is required to do so, which means we need to have integration tests to achieve that, I believe. Partially, the test coverage went down because of that. The other reason is the total number of state transitions got reduced.
Error message:
Stage Compile and Test failed.
(๑′°︿°๑)
Could you please go through https://jira.mesosphere.com/browse/MARATHON_EE-1542 and verify that that condition is not possible after your fix? I know writing a test would probably be very hard so a "thought experiment" would suffice. Thx.
@zen-dog, I will do it, but is it somehow connected to the fact that you kept "reject" to this patch?
Checking for that case is wise. I haven't looked into it too deeply and am not currently sure if there is a way we can make more obvious the case mentioned by Aleksey, or what's been explored in that area. But having an explanation for why it does not affect the post-refactor here is good. Pending that, I approve! Great work!
I have to say that the logic looks much cleaner and I like the fact that it's flat now. I have a couple of questions/nits/better docs request but nothing that would justify a reject.
However before you merge please:
- create a jenkins loop (copy of the existing ones, edit and put your branch) and let it run for at least 24h - I'd like to make sure we're not adding to existing flakiness
after the merge we should soak test this:
- talk to Armand/Viktor so we can soak-test an according snapshot (will probably require to merge it to dcos-ee)
- you should also talk to our chaos-engineering team so they run the chaos-tests with this snapshot - last bug in re-election code was found by them
Thx again for great patch!
src/main/scala/mesosphere/marathon/MarathonSchedulerService.scala | ||
---|---|---|
235 | I don't think this comment still makes sense since we removed the extra abdication parameters. | |
src/main/scala/mesosphere/marathon/core/election/ElectionService.scala | ||
21 | This is a good start for a class doc :thumbup: However I would be also interested in the complete chain of events e.g.:
I know that "the code is self-explanatory" and "why dont you just read the code" but having full chain of events, concurrency model and guaranties explained goes a long way. Sry if it feels like I'm the pain in the neck :) | |
src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala | ||
106 | So if we were offered leadership twice in a short period of time, neither will succeed and marathon will exit? Can't this lead to a scenario where marathon gets too many leadership offers but is not able to act on them successfully? |
Regarding a Jenkins loop job, soaking it and probably getting in touch with the Chaos engineering team — it all makes sense and I will do it. Thanks for the review.
src/main/scala/mesosphere/marathon/MarathonSchedulerService.scala | ||
---|---|---|
235 | I think it still makes sense because it is more about leadership abdication, not the method's parameters. It is more about the fact that driver.foreach is used, because it can be set to None in stopDriver() before this line gets to execute. What do you think? | |
src/main/scala/mesosphere/marathon/core/election/ElectionService.scala | ||
21 | where do ZK/CuratorService events come in? — I don't get what you mean here. | |
src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala | ||
106 | If a leadership is offered more than once, then it is a (I would say, severe) bug in the code. As of now, only MarathonScheculerService invokes this method, and surely it is supposed to do so only once during start-up. Therefore, stopping is a proper action here, I deem. |
Error message:
Stage Compile and Test failed.
(๑′°︿°๑)
Make tests stable again on Amazon instances by getting rid of intercepting calls to shutdown JVM
Error message:
Stage Compile and Test failed.
(๑′°︿°๑)
Thanks for introducing crashing strategy; does it make sense to link the patch to the issue? Great work, Ivan!
src/main/scala/mesosphere/marathon/core/election/impl/ElectionServiceMetrics.scala | ||
---|---|---|
23 | I commented in a ticket that reporting leader metrics for non-ha mode is misleading. But, it matches old behavior and I think this is great for now. Let's move forward. |
Error message:
Stage Compile and Test failed.
(๑′°︿°๑)
Error message:
Stage Compile and Test failed.
(๑′°︿°๑)
You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:
"url": "https://downloads.mesosphere.io/marathon/snapshots/marathon-1.5.0-SNAPSHOT-643-gf8e47a6.tgz", "sha1": "32a6184e86fb67c2956629e1508ff3dc8096e0a0"
\\ ٩( ᐛ )و //
src/main/scala/mesosphere/marathon/core/election/impl/PseudoElectionService.scala | ||
---|---|---|
25–26 | Hm, it seems this class shares a lot of logic with CuratorElectionService. How are they different? Is there a way to share some logic? | |
src/test/scala/mesosphere/marathon/core/election/impl/PseudoElectionServiceTest.scala | ||
33 | It seems we are not testing CuratorElectionService. Is this impression correct? | |
63–64 | Sweet! |
src/main/scala/mesosphere/marathon/core/election/impl/PseudoElectionService.scala | ||
---|---|---|
25–26 | I see a vicious circle right there. I feel like I can write a book about trade-offs when implementing leader election which relies on an external service like ZooKeeper. :) This is exactly what I was asked for from the very beginning: get rid of ElectionServiceBase. My first step was to simplify things, and reduce the number of states and their transitions in ElectionServiceBase, which lead to to renaming of it to ElectionServiceFSM because it started to look like a real FSM. It wasn't enough, and it led to this further simplification at the expense of having this duplicated code. It seems we are headed towards Akka Streams. This code duplication won't be there anymore after another round of refactoring of this code. On the other hand, if one imagines a two-level inheritance tree with one parent and multiple children as a horizontal thing as a apposed to a vertical thing, where the parent make calls into children and vice versa, it can be seen very well as a message-passing with compile-time checks of it. Long story short, it is what it is. The decision was made, it is too late to change it drastically at this point. Though I do see that
we should keep this intermediate approach, move on, and have it refactored to make it leverage Akka Streams later, after 1.10 release. | |
src/test/scala/mesosphere/marathon/core/election/impl/PseudoElectionServiceTest.scala | ||
33 | It is correct. Please refer to my comment to your first one. Most of it was tested using ElectionServiceBase/FSM before, but since most of the code is not shared between CuratorElectionService and PseudoElectionService, it is gone. Now testing of it would require running ZK, which looks more like SI testing at least. |
build.sbt | ||
---|---|---|
22 | seems unrelated to leader election. why is this change here? | |
src/main/scala/mesosphere/marathon/MarathonSchedulerActor.scala | ||
442 | why is this linter rule needed? | |
src/main/scala/mesosphere/marathon/core/base/CrashStrategy.scala | ||
10 | why not a sealed trait here? | |
14 | case object (if we use sealed trait above)? | |
src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala | ||
52 | not that it's related to this change (because this just looks like a rename) but what's the impact of using a separate execution context here, vs. the "global" one that has support for the injected Context stuff Jason wrote? | |
55 | leadershipOffered and acquiringLeadership function more like single-use latches. it might make sense to either (a) document that, or else; (b) wrap with a helper type that only allows a single-use one-way transition: package mesosphere.marathon; package core.async; final class Latch { private[this] val counter = new Semaphore(1) def acquire: Option[Done] = // use of Option return type forces pattern matches to deal w/ all alternatives, likely highly desirable if(counter.tryAcquire()) Some(Done) else None } I like the idea of limiting transitions via a Latch type because it enforces the design as intended | |
145 | nit: extract constant | |
220–231 | looks like the ACLs changed here. why is this part of this PR and not a separate changeset? seems significant enough? | |
233 | nit: extract constants | |
src/main/scala/mesosphere/marathon/core/election/impl/ElectionServiceMetrics.scala | ||
17 | is it possible that startMetrics and stopMetrics could be called concurrently? if so, the atomic boolean guard won't help you avoid a data race. if not, then i'd like to see that assumption documented someplace | |
src/main/scala/mesosphere/marathon/core/election/impl/PseudoElectionService.scala | ||
45 | ditto re: Latch suggestion | |
86 | nit: extract constant |
build.sbt | ||
---|---|---|
22 | This patch has been in review for 3 months. And I address compiler warning every time I merge in origin/master. | |
src/main/scala/mesosphere/marathon/MarathonSchedulerActor.scala | ||
442 | [warn] /Users/ivanchernestsky/dev/marathon/src/main/scala/mesosphere/marathon/MarathonSchedulerActor.scala:442: [UseIfExpression] Assign the result of the if expression to variable ifres$macro$26 directly. [warn] if (knownTaskStatuses.nonEmpty) driver.reconcileTasks(knownTaskStatuses.asJavaCollection) | |
src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala | ||
52 | As far as I understand, it was done to let Curator be blocked in a separate thread, not in the one of https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/core/async/ExecutionContexts.scala#L50 | |
55 | I don't think it'd better to replace those two variables with one Latch (if you meant this) because they used in two different places. It is correct that both of them are one-way (from false to true). If you meant with replacing them with two Latches, I think it is better not to do it at this point (frankly, it is quite late), because it will have to be full re-tested. | |
220–231 | I noticed that the ACLs were incorrect a few days after I started working on the leader election simplification, and at that point I was new to the dev workflow here, and it seemed that I would take just a few more days to land this patch. | |
src/main/scala/mesosphere/marathon/core/election/impl/ElectionServiceMetrics.scala | ||
17 | It is done to avoid calling stopMetrics if startMetrics was not invoked before, or call stopMetrics twice (on both leader abdication and JVM shutdown). |
Error message:
Stage Compile and Test failed.
(๑′°︿°๑)
You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:
"url": "https://downloads.mesosphere.io/marathon/snapshots/marathon-1.5.0-SNAPSHOT-658-g4cf8c34.tgz", "sha1": "a7091bc22ae6be5288d66c5954ecf1a085619d32"
seems unrelated to leader election. why is this change here?