Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VAULT-28313] enos: decode scenarios iteratively #147

Merged
merged 1 commit into from
Jun 24, 2024
Merged

Conversation

ryancragun
Copy link
Collaborator

@ryancragun ryancragun commented Jun 20, 2024

Significantly improve the speeed and reliability of list and validate in large repositories. While the prior implementation did list and validate in parallel, it also retained whole sets in memory even if they were not being used. Holding onto those references was catastrophic on machines with scarce resources. We replace the old implementation with a scenario decoder that's an iterator which allows the caller to handle the scenario if necessary or allow it to be garbage collected when we're through with it.

This improves our memory usage with list and validate significantly. When testing validate against the Vault replication scenario with our prior implementation, our memory usage grew linerally until reaching ~11GB. Now our memory stays flat in the 300-400mb range.

Overall it improves listing wall clock speed by 10% and our validate speed by 15% when validating a subset of scenarios. On my machine you couldn't successfully validate all of them with the prior implementation and now it works just fine.

There's still room for improvement on listing speed. If we cared less about format padding we could change the UI to be incremental for the basic UI and then render the scenarios on screen while they are being decoded. I chose not to do that as it's outside the scope of what we're trying to resolve.

There's also significant memory usage when creating a sample frame and validating and/or making an observation it. There are likely subsequent improvements we could make there but those don't appear to be a blocker to fixing our validator.

Backwards incompatible changes:

  • We now return an error when attempting to list scenarios in a directory without any defined scenario blocks.
  • We now sort our matrix block by variant and elements and the ordering can be slightly different than before. I doubt anybody would notice but it's possible.
  • We don't guarantee perfectly sorted output when listing as it is done in parallel and we aren't caching and sorting.

Changes:

  • Create a scenario decoder that implements an iterator pattern
  • Utilize the iterator pattern in various endpoints and tests
  • Fix a race in the warnings as errors check acceptance test
  • Bump version
  • Pin to the latest actions to resolve deprecation warnings

Checklist

  • The commit message includes an explanation of the changes
  • Manual validation of the changes have been performed (if possible)
  • New or modified code has requisite test coverage (if possible)
  • I have performed a self-review of the changes
  • I have made necessary changes and/or pull requests for documentation
  • I have written useful comments in the code

@ryancragun ryancragun added the changelog/feat New feature or enhancement. Will be included in "New Features" category in release notes. label Jun 20, 2024
@ryancragun ryancragun requested a review from a team as a code owner June 20, 2024 21:15
@ryancragun ryancragun force-pushed the vault-28313 branch 3 times, most recently from a0d7988 to 4c58961 Compare June 20, 2024 21:35
@@ -112,12 +112,12 @@ func TestAcc_Cmd_Scenario_Sample_Observe(t *testing.T) {
Scenario: &pb.Ref_Scenario{
Id: &pb.Scenario_ID{
Name: "smoke",
Uid: "21610357b0083126896a0072a429e677fd6381fa98e94b15ac2101f697100b42",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is due to the fact that we now sort our matrix when we decode and filter it and before we did not do that.

@@ -61,7 +61,6 @@ func runScenarioListCmd(cmd *cobra.Command, args []string) error {
switch val := msg.GetResponse().(type) {
case *pb.EnosServiceListScenariosResponse_Decode:
res.Decode = val.Decode
res.Diagnostics = val.Decode.GetDiagnostics()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This caused double diagnostics which we don't want

require.Len(t, expected.Modules, len(fp.Modules))
require.Len(t, expected.ScenarioBlocks, len(fp.ScenarioBlocks))
require.Len(t, expected.Providers, len(fp.Providers))
require.Len(t, fp.Modules, len(expected.Modules))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were in the wrong order which I discovered while testing my in-flight changes, so I fixed them here.

@@ -271,115 +387,3 @@ func testMostlyEqualStepVar(t *testing.T, expected cty.Value, got cty.Value) {
require.EqualValues(t, eAttr.Name, aAttr.Name)
}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this up in the file

@@ -16,7 +16,8 @@ import (

type matrixDecoder struct{}

type DecodedMatrices struct {
// MatrixBlock represent a full "matrix" block at various stages.
type MatrixBlock struct {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed DecodeMatrices to MatrixBlock and DecodedScenarioBlock'(s) to ScenarioBlock because they're shorter, more consistent, and clearer.

}

return d, nil
}

// DecodedScenarioBlock is a decoded scenario block.
type DecodedScenarioBlock struct {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This became ScenarioBlock and the implementation changed.

Significantly improve the speeed and reliability of `list` and `validate`
in large repositories. While the prior implementation did list and
validate in parallel, it also retained whole sets in memory even if they
were not being used. Holding onto those references was catastrophic on
machines with scarce resources. We replace the old implementation with a
scenario decoder that's an iterator which allows the caller to handle the
scenario if necessary or allow it to be garbage collected when we're
through with it.

This improves our memory usage with `list` and `validate` significantly.
When testing `validate` against the Vault `replication` scenario with our
prior implementation, our memory usage grew linerally until reaching
~11GB. Now our memory stays flat in the 300-400mb range.

Overall it improves listing wall clock speed by 10% and our validate
speed by 15% when validating a subset of scenarios. On my machine you
couldn't successfully validate all of them with the prior implementation
and now it works just fine.

There's still room for improvement on listing speed. If we cared less
about format padding we could change the UI to be incremental for the
basic UI and then render the scenarios on screen while they are being
decoded. I chose not to do that as it's outside the scope of what we're
trying to resolve.

There's also significant memory usage when creating a sample frame and
validating and/or making an observation it. There are likely subsequent
improvements we could make there but those don't appear to be a blocker
to fixing our validator.

Backwards incompatible changes:
  * We now return an error when attempting to list scenarios in a
    directory without any defined scenario blocks.
  * We now sort our matrix block by variant and elements and the ordering
    can be slightly different than before. I doubt anybody would notice
    but it's possible.
  * We don't guarantee perfectly sorted output when listing as it is done
    in parallel and we aren't caching and sorting.

Changes:
  * Create a scenario decoder that implements an iterator pattern
  * Utilize the iterator pattern in various endpoints and tests
  * Fix a race in the warnings as errors check acceptance test
  * Bump version
  * Pin to the latest actions to resolve deprecation warnings

Signed-off-by: Ryan Cragun <[email protected]>
@ryancragun ryancragun merged commit a4eff3d into main Jun 24, 2024
5 checks passed
@ryancragun ryancragun deleted the vault-28313 branch June 24, 2024 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/feat New feature or enhancement. Will be included in "New Features" category in release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant