A backup and restore operation can be triggered via the leader endpoint.
The intent is stored in ZooKeeper (zk) and performed by the next leading master.
A RuntimeConfiguration with related repository is implemented to store configuration collected during runtime.
PersistentStoreBackup now always uses a URI to define the location for backup & restore.
The value provided via the command-line parameter is used only before a migration needs to be applied.
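As a rough illustration of the summary above, the following sketch builds the request URI a client might send as a DELETE to the leader endpoint. The query parameter names `backup` and `restore` are assumed from the review discussion of `LeaderResource`, and the `LeaderRequest` object, its method, and the host in the usage note are hypothetical; this is not code from the diff.

```scala
import java.net.{URI, URLEncoder}
import java.nio.charset.StandardCharsets

// Hypothetical client-side helper; not part of Marathon.
object LeaderRequest {
  // Percent-encode a query value so the nested URI survives as a single parameter.
  private def enc(value: String): String =
    URLEncoder.encode(value, StandardCharsets.UTF_8.name())

  /** Builds the URI for a DELETE on /v2/leader with optional backup/restore locations.
    * The parameter names "backup" and "restore" are an assumption based on the review.
    */
  def deleteLeaderUri(base: String, backup: Option[URI], restore: Option[URI]): URI = {
    val params = Seq("backup" -> backup, "restore" -> restore).collect {
      case (name, Some(uri)) => s"$name=${enc(uri.toString)}"
    }
    val query = if (params.isEmpty) "" else params.mkString("?", "&", "")
    URI.create(s"$base/v2/leader$query")
  }
}
```

For example, `LeaderRequest.deleteLeaderUri("http://localhost:8080", Some(new URI("file:///tmp/backup")), None)` produces a URI whose query carries the percent-encoded backup location; the request itself would then be issued with any HTTP client.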
Details
- Reviewers: jenkins, jasongilanfarr, meichstedt, timcharper
- Commits:
  - rSDKCOREc818be525d7a: Make Backup & Restore operations available via /v2/leader API
  - rMARATHON31674fbe9593: Make Backup & Restore operations available via /v2/leader API
- JIRA Issues:
  - JIRA MARATHON-7046 Endpoint to Restore on next leader election
  - JIRA MARATHON-7044 Endpoint for backup on next leader election.
sbt test
Diff Detail
- Repository: rMARATHON marathon
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Build is green; see https://jenkins.mesosphere.com/service/jenkins/job/public-test-marathon-phabricator/2069/ for more details.
Will take a second pass later today. The raml generator takes in the files to generate from, so putting that type in src/main/raml would be super.
src/main/scala/mesosphere/util/state/RuntimeConfiguration.scala | ||
---|---|---|
8 ↗ | (On Diff #2524) | Can we make this in raml please? I was playing with using src/main/raml at one point so we could make a clear delineation that some stuff is internal types. |
src/main/scala/mesosphere/marathon/storage/store/ZkStoreSerialization.scala | ||
---|---|---|
190 | minor, but there are a bunch of constants for UTF-8, e.g. StandardCharsets.UTF_8 |
√ Build of 2649 completed at https://jenkins.mesosphere.com/service/jenkins/job/public-marathon-phabricator-pipeline/201/
Can you add to the changelog and fix the "UTF-8" to use StandardCharsets.UTF_8? Other than that... I still accept.
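The charset suggestion can be illustrated with a small sketch. `CharsetExample` and its members are hypothetical names, not code from `ZkStoreSerialization.scala`; only `String.getBytes` and `java.nio.charset.StandardCharsets.UTF_8` are standard JDK API.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical example object; not part of the diff under review.
object CharsetExample {
  val text = "marathon"

  // Name-based lookup: the charset is resolved from a string at runtime,
  // so a typo in "UTF-8" only surfaces as an exception when this line runs.
  val byName: Array[Byte] = text.getBytes("UTF-8")

  // Constant-based: the compiler checks the reference and no lookup by name is needed.
  val byConstant: Array[Byte] = text.getBytes(StandardCharsets.UTF_8)

  // Encode and decode with the same constant on both sides.
  def roundTrip(s: String): String =
    new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8)
}
```

Both calls produce the same bytes; the constant simply removes the string literal and the runtime lookup.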
Docs please:
- The documentation in the RAML files is pretty sparse w/ respect to the backup and restore feature when DELETE'ing leader.
- Maybe a more elaborate explanation should be provided in docs/docs/...
And I object to encoding credentials into the backup/restore URI -- we should allow admins to rely on IAM to avoid credential leakage.
src/main/scala/mesosphere/marathon/api/v2/LeaderResource.scala | ||
---|---|---|
48–54 | this would probably be more clear (vs nested withValid and the validators outside the block, several lines away): assumeValid { validate(block)(optional(UriIO.valid)) validate(restore)(optional(UriIO.valid)) result(...) ... } |
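The flat shape suggested in this comment (validate both optional parameters side by side, then build the result) can be sketched without Marathon's validation helpers. Everything below is a hypothetical stand-in: `UriParams`, `validUri`, and `validated` are illustrative names, and the "absolute URI" rule is an assumption, not the actual behavior of `UriIO.valid`.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical sketch; does not use or reproduce Marathon's validation API.
object UriParams {
  // Stand-in rule (assumption): an absent value is fine, a present one must parse as an absolute URI.
  def validUri(name: String, raw: Option[String]): Either[String, Option[URI]] =
    raw match {
      case None => Right(None)
      case Some(s) =>
        Try(new URI(s)).toOption.filter(_.isAbsolute) match {
          case Some(uri) => Right(Some(uri))
          case None      => Left(s"$name is not a valid absolute URI: $s")
        }
    }

  // Validates both parameters at the same level and only then runs the result block,
  // collecting every error instead of stopping at the first one.
  def validated[A](backup: Option[String], restore: Option[String])(
      result: (Option[URI], Option[URI]) => A): Either[Seq[String], A] = {
    val b = validUri("backup", backup)
    val r = validUri("restore", restore)
    (b, r) match {
      case (Right(bu), Right(ru)) => Right(result(bu, ru))
      case _                      => Left(Seq(b, r).collect { case Left(err) => err })
    }
  }
}
```

The point of the sketch is only the structure: the checks sit next to each other directly above the code that consumes their results, rather than being nested or declared several lines away.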
- Describe RuntimeConfiguration in raml and use the generated type.
- Add API documentation for the leader endpoint.
- Incorporate feedback from review.
@jdef
Documentation will be addressed with JIRA MARATHON-2040: Write a gh-pages site for backup & restore.
Providing credentials as part of the URL will be addressed with JIRA MARATHON-7235: S3 Credentials are too restrictive.
√ Build of 2750 completed at https://jenkins.mesosphere.com/service/jenkins/job/public-marathon-phabricator-pipeline/307/
√ Build of 2774 completed at https://jenkins.mesosphere.com/service/jenkins/job/public-marathon-phabricator-pipeline/332/
docs/docs/rest-api/public/api/v2/leader.raml | ||
---|---|---|
27 | I think we should do better here explaining how delete could trigger a re-election. We don't need to duplicate word for word what we have written somewhere else, but this is still too brief in my opinion. |
nice docs changes
docs/docs/rest-api/public/api/v2/types/pragma.raml | ||
---|---|---|
36 | We need some kind of compile-time separation between public and internal types. (a) I wonder if internal types should actually live in a separate package? Grouping them together in the same package as public API types is confusing and will lead to mistakes down the line. (b) What rules can we implement in the RAML generator to enforce such separation? For example, does it make sense to implement a check that public types cannot refer to internal types? Check that API endpoints don't refer to any internal types? At the very least please file a JIRA to track this concern |
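One of the rules raised in this comment, that public types must not refer to internal types, could be checked along the following lines. This is a minimal sketch under stated assumptions: `TypeVisibilityCheck`, `TypeDecl`, and the visibility markers are hypothetical and are not part of the RAML generator, and the check covers direct references only, not transitive ones.

```scala
// Hypothetical model of generated types; not the RAML generator's own data structures.
object TypeVisibilityCheck {
  sealed trait Visibility
  case object Public extends Visibility
  case object Internal extends Visibility

  // A declared type, its visibility, and the names of the types it refers to directly.
  final case class TypeDecl(name: String, visibility: Visibility, refs: Set[String])

  /** Returns (publicType, internalType) pairs where a public type directly refers to an internal one. */
  def violations(types: Seq[TypeDecl]): Seq[(String, String)] = {
    val internal = types.collect { case TypeDecl(n, Internal, _) => n }.toSet
    for {
      t   <- types if t.visibility == Public
      ref <- t.refs.toSeq.sorted if internal(ref)
    } yield (t.name, ref)
  }
}
```

A generator could fail the build when `violations` is non-empty; the same idea extends to API endpoints by treating each endpoint's request and response types as public references.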
√ Build of 2867 completed at https://jenkins.mesosphere.com/service/jenkins/job/public-marathon-phabricator-pipeline/435/