HomeMesosphereNo notifications. 4 unresolved issues.

Simplify leader election code
ClosedAll Users

Authored by ichernetsky on Apr 26 2017, 6:34 PM.

Diff Detail

Repository
rMARATHON marathon
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes

@zen-dog I think we might have some further simplification potential here; what would you think if we continued to boil this down to it's essence before embarking on effort to fix? @ichernetsky and I are going to talk on Tuesday

Also, what do you think if we land the patch as is? There is substantial improvement already and the history for this one is getting quite long.

(oh, except for our test coverage went down significantly :) we should address this)

kensipe accepted this revision.Jun 27 2017, 5:19 PM

I support merging this is and any additional enhancement coming from another PR.
Long review and lgtm

ichernetsky planned changes to this revision.Jun 27 2017, 8:00 PM

@zen-dog, I added a few words to ElectionService's Scaladoc comment. And since we don't have an (explicit) FSM anymore, I think there is not much to add at this point.

@timcharper, I've restored CuratorElectionServiceTest. Previously most of the state transitions/method invocations were tested using ElectionServiceBase. Now a real ZK is required to do so, which means we need to have integration tests to achieve that, I believe. Partially, the test coverage went down because of that. The other reason is the total number of state transitions got reduced.

ichernetsky updated this revision to Diff 3641.Jun 27 2017, 8:00 PM
  • Restore CuratorElectionServiceTest
  • Expand ElectionService's Scaladoc comment
jenkins requested changes to this revision.Jun 27 2017, 8:20 PM
This revision now requires changes to proceed.Jun 27 2017, 8:20 PM
✗ Build of 3641 failed jenkins-public-marathon-phabricator-362.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

Could you please go through https://jira.mesosphere.com/browse/MARATHON_EE-1542 and verify that that condition is not possible after your fix? I know writing a test would probably be very hard so a "thought experiment" would suffice. Thx.

@zen-dog, I will do it, but is it somehow connected to the fact that you kept "reject" to this patch?

timcharper accepted this revision.Jun 29 2017, 10:49 PM

Checking for that case is wise. I haven't looked into it too deeply and am not currently sure if there is a way we can make more obvious the case mentioned by Aleksey, or what's been explored in that area. But having an explanation for why it does not affect the post-refactor here is good. Pending that, I approve! Great work!

I have to say that the logic looks much cleaner and I like the fact that it's flat now. I have a couple of questions/nits/better docs request but nothing that would justify a reject.
However before you merge please:

  • create a jenkins loop (copy of the existing ones, edit and put your branch) and let it run for at least 24h - I'd like to make sure we're not adding to existing flakiness

after the merge we should soak test this:

  • talk to Armand/Viktor so we can soak-test an according snapshot (will probably require to merge it to dcos-ee)
  • you should also talk to our chaos-engineering team so they run the chaos-tests with this snapshot - last bug in re-election code was found by them

Thx again for great patch!

src/main/scala/mesosphere/marathon/MarathonSchedulerService.scala
235

I don't think this comment still makes sense since we removed the extra abdication parameters.

src/main/scala/mesosphere/marathon/core/election/ElectionService.scala
21

This is a good start for a class doc :thumbup: However I would be also interested in the complete chain of events e.g.:

  • -> where do ZK/CuratorService events come in?
  • -> what happens after we're elected? What events are sent through the system?
  • -> what parts of the chain happens synchronously vs. asynchronously and can happen parallel to other events?

I know that "the code is self-explanatory" and "why dont you just read the code" but having full chain of events, concurrency model and guaranties explained goes a long way. Sry if it feels like I'm the pain in the neck :)

src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala
106

So if we were offered leadership twice in a short period of time, neither will succeed and marathon will exit? Can't this lead to a scenario where marathon gets too many leadership offers but is not able to act on them successfully?

zen-dog accepted this revision.Jul 4 2017, 4:14 PM
ichernetsky marked 3 inline comments as done.Jul 6 2017, 8:44 PM

Regarding a Jenkins loop job, soaking it and probably getting in touch with the Chaos engineering team — it all makes sense and I will do it. Thanks for the review.

src/main/scala/mesosphere/marathon/MarathonSchedulerService.scala
235

I think it still makes sense because it is more about leadership abdication, not the method's parameters. It is more about the fact that driver.foreach is used, because it can be set to None in stopDriver() before this line gets to execute. What do you think?

src/main/scala/mesosphere/marathon/core/election/ElectionService.scala
21

where do ZK/CuratorService events come in? — I don't get what you mean here.
what happens after we're elected? — it is already documented in both ElectionService and ElectionCandidate.
what events are sent through the system? — done.
what parts of the chain happens synchronously vs. asynchronously and can happen parallel to other events? — All methods are synchronous which should be expected by default because no future is returned and no callback is accepted in any method.

src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala
106

If a leadership is offered more than once, then it is a (I would say, severe) bug in the code. As of now, only MarathonScheculerService invokes this method, and surely it is supposed to do so only once during start-up. Therefore, stopping is a proper action here, I deem.

ichernetsky updated this revision to Diff 3723.Jul 6 2017, 8:45 PM
ichernetsky marked 3 inline comments as done.
  • Update a few comments per Aleksey's request
jenkins requested changes to this revision.Jul 6 2017, 8:48 PM
This revision now requires changes to proceed.Jul 6 2017, 8:48 PM
✗ Build of 3723 failed jenkins-public-marathon-phabricator-443.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

ichernetsky updated this revision to Diff 3763.Jul 12 2017, 6:28 PM

Make tests stable again on Amazon instances by getting rid of intercepting calls to shutdown JVM

jenkins requested changes to this revision.Jul 12 2017, 6:28 PM
This revision now requires changes to proceed.Jul 12 2017, 6:28 PM
✗ Build of 3763 failed jenkins-public-marathon-phabricator-487.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

Thanks for introducing crashing strategy; does it make sense to link the patch to the issue? Great work, Ivan!

src/main/scala/mesosphere/marathon/core/election/impl/ElectionServiceMetrics.scala
23

I commented in a ticket that reporting leader metrics for non-ha mode is misleading. But, it matches old behavior and I think this is great for now. Let's move forward.

timcharper accepted this revision.Jul 12 2017, 8:21 PM
ichernetsky added 1 JIRA issue(s): MARATHON-4083.Jul 12 2017, 8:22 PM

Linked it to the Jira issue. Thanks for the quick review, Tim.

✗ Build of 3763 failed jenkins-public-marathon-phabricator-490.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

✗ Build of 3763 failed jenkins-public-marathon-phabricator-491.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

jenkins accepted this revision.Jul 12 2017, 10:17 PM
This revision is now accepted and ready to land.Jul 12 2017, 10:17 PM
✔ Build of 3763 completed jenkins-public-marathon-phabricator-492.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

"url": "https://downloads.mesosphere.io/marathon/snapshots/marathon-1.5.0-SNAPSHOT-643-gf8e47a6.tgz",
"sha1": "32a6184e86fb67c2956629e1508ff3dc8096e0a0"

\\ ٩( ᐛ )و //

jeschkies added inline comments.Jul 13 2017, 11:28 AM
src/main/scala/mesosphere/marathon/core/election/impl/PseudoElectionService.scala
25–26

Hm, it seems this class shares a lot of logic with CuratorElectionService. How are they different? Is there a way to share some logic?

src/test/scala/mesosphere/marathon/core/election/impl/PseudoElectionServiceTest.scala
33

It seems we are not testing CuratorElectionService. Is this impression correct?

63–64

Sweet!

ichernetsky added inline comments.Jul 13 2017, 4:34 PM
src/main/scala/mesosphere/marathon/core/election/impl/PseudoElectionService.scala
25–26

I see a vicious circle right there. I feel like I can write a book about trade-offs when implementing leader election which relies on an external service like ZooKeeper. :)

This is exactly what I was asked for from the very beginning: get rid of ElectionServiceBase. My first step was to simplify things, and reduce the number of states and their transitions in ElectionServiceBase, which lead to to renaming of it to ElectionServiceFSM because it started to look like a real FSM. It wasn't enough, and it led to this further simplification at the expense of having this duplicated code. It seems we are headed towards Akka Streams. This code duplication won't be there anymore after another round of refactoring of this code. On the other hand, if one imagines a two-level inheritance tree with one parent and multiple children as a horizontal thing as a apposed to a vertical thing, where the parent make calls into children and vice versa, it can be seen very well as a message-passing with compile-time checks of it.

Long story short, it is what it is. The decision was made, it is too late to change it drastically at this point. Though I do see that

  1. CuratorElectionService is harder to test now, because previously most of it was tested by testing ElectionServiceBase
  2. There is code duplication.

we should keep this intermediate approach, move on, and have it refactored to make it leverage Akka Streams later, after 1.10 release.

src/test/scala/mesosphere/marathon/core/election/impl/PseudoElectionServiceTest.scala
33

It is correct. Please refer to my comment to your first one. Most of it was tested using ElectionServiceBase/FSM before, but since most of the code is not shared between CuratorElectionService and PseudoElectionService, it is gone. Now testing of it would require running ZK, which looks more like SI testing at least.

kensipe accepted this revision.Jul 17 2017, 8:14 PM
jdef added inline comments.Jul 17 2017, 10:18 PM
build.sbt
22

seems unrelated to leader election. why is this change here?

src/main/scala/mesosphere/marathon/MarathonSchedulerActor.scala
442

why is this linter rule needed?

src/main/scala/mesosphere/marathon/core/base/CrashStrategy.scala
10

why not a sealed trait here?

14

case object (if we use sealed trait above)?

src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala
52

not that it's related to this change (because this just looks like a rename) but what's the impact of using a separate execution context here, vs. the "global" one that has support for the injected Context stuff Jason wrote?

https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/core/async/ExecutionContexts.scala

55

leadershipOffered and acquiringLeadership function more like single-use latches. it might make sense to either (a) document that, or else; (b) wrap with a helper type that only allows a single-use one-way transition:

package mesosphere.marathon;
package core.async;

final class Latch {
  private[this] val counter = new Semaphore(1)
  def acquire: Option[Done] = 
    // use of Option return type forces pattern matches to deal w/ all alternatives, likely highly desirable
    if(counter.tryAcquire()) Some(Done) else None 
}

I like the idea of limiting transitions via a Latch type because it enforces the design as intended

145

nit: extract constant

220–231

looks like the ACLs changed here. why is this part of this PR and not a separate changeset? seems significant enough?

233

nit: extract constants

src/main/scala/mesosphere/marathon/core/election/impl/ElectionServiceMetrics.scala
17

is it possible that startMetrics and stopMetrics could be called concurrently? if so, the atomic boolean guard won't help you avoid a data race. if not, then i'd like to see that assumption documented someplace

src/main/scala/mesosphere/marathon/core/election/impl/PseudoElectionService.scala
45

ditto re: Latch suggestion

86

nit: extract constant

ichernetsky planned changes to this revision.Jul 17 2017, 11:20 PM
ichernetsky marked 6 inline comments as done.
ichernetsky added inline comments.
build.sbt
22

This patch has been in review for 3 months. And I address compiler warning every time I merge in origin/master.

src/main/scala/mesosphere/marathon/MarathonSchedulerActor.scala
442
[warn] /Users/ivanchernestsky/dev/marathon/src/main/scala/mesosphere/marathon/MarathonSchedulerActor.scala:442: [UseIfExpression] Assign the result of the if expression to variable ifres$macro$26 directly.
[warn]     if (knownTaskStatuses.nonEmpty) driver.reconcileTasks(knownTaskStatuses.asJavaCollection)
src/main/scala/mesosphere/marathon/core/election/impl/CuratorElectionService.scala
52

As far as I understand, it was done to let Curator be blocked in a separate thread, not in the one of https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/core/async/ExecutionContexts.scala#L50

55

I don't think it'd better to replace those two variables with one Latch (if you meant this) because they used in two different places. It is correct that both of them are one-way (from false to true). If you meant with replacing them with two Latches, I think it is better not to do it at this point (frankly, it is quite late), because it will have to be full re-tested.

220–231

I noticed that the ACLs were incorrect a few days after I started working on the leader election simplification, and at that point I was new to the dev workflow here, and it seemed that I would take just a few more days to land this patch.

src/main/scala/mesosphere/marathon/core/election/impl/ElectionServiceMetrics.scala
17

It is done to avoid calling stopMetrics if startMetrics was not invoked before, or call stopMetrics twice (on both leader abdication and JVM shutdown).

ichernetsky marked an inline comment as done.
  • Address some of @jdef's comments
This revision is now accepted and ready to land.Jul 17 2017, 11:21 PM
jenkins requested changes to this revision.Jul 17 2017, 11:22 PM
This revision now requires changes to proceed.Jul 17 2017, 11:22 PM
  • Wrap ExecutionContent with ContextPropagatingExecutionContextWrapper
jenkins requested changes to this revision.Jul 17 2017, 11:38 PM
This revision now requires changes to proceed.Jul 17 2017, 11:38 PM
✗ Build of 3802 failed jenkins-public-marathon-phabricator-522.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

jenkins accepted this revision.Jul 17 2017, 11:57 PM
This revision is now accepted and ready to land.Jul 17 2017, 11:57 PM
✔ Build of 3803 completed jenkins-public-marathon-phabricator-523.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

"url": "https://downloads.mesosphere.io/marathon/snapshots/marathon-1.5.0-SNAPSHOT-658-g4cf8c34.tgz",
"sha1": "a7091bc22ae6be5288d66c5954ecf1a085619d32"

\\ ٩( ᐛ )و //

This revision was automatically updated to reflect the committed changes.