[VAULT-28313] enos: decode scenarios iteratively #147

ryancragun · 2024-06-20T21:15:01Z

Significantly improve the speed and reliability of `list` and `validate` in large repositories. While the prior implementation did list and validate in parallel, it also retained whole sets in memory even if they were not being used. Holding onto those references was catastrophic on machines with scarce resources. We replace the old implementation with a scenario decoder that is an iterator: the caller can handle each scenario if necessary, or let it be garbage collected once we're through with it.

This improves our memory usage with `list` and `validate` significantly. When testing `validate` against the Vault `replication` scenario with our prior implementation, our memory usage grew linearly until reaching ~11GB. Now our memory stays flat in the 300-400MB range.

Overall it improves listing wall clock speed by 10% and our validate speed by 15% when validating a subset of scenarios. On my machine you couldn't successfully validate all of them with the prior implementation and now it works just fine.

There's still room for improvement on listing speed. If we cared less about format padding we could change the UI to be incremental for the basic UI and then render the scenarios on screen while they are being decoded. I chose not to do that as it's outside the scope of what we're trying to resolve.

There's also significant memory usage when creating a sample frame and validating and/or making an observation of it. There are likely subsequent improvements we could make there, but those don't appear to be a blocker to fixing our validator.

Backwards incompatible changes:

* We now return an error when attempting to list scenarios in a directory without any defined scenario blocks.
* We now sort our matrix block by variant and elements and the ordering can be slightly different than before. I doubt anybody would notice but it's possible.
* We don't guarantee perfectly sorted output when listing as it is done in parallel and we aren't caching and sorting.
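As a rough sketch of what sorting a matrix by variant and elements can look like (this is not the code from this PR; `Element`, `Vector`, `SortMatrix`, and `Render` are hypothetical names): each vector's elements are sorted by key and value, then the vectors themselves are sorted. Two matrices with the same contents then end up in the same order no matter how they were written.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Element is a hypothetical variant key/value pair, e.g. backend:raft.
type Element struct{ Key, Val string }

// Vector is a hypothetical single combination of matrix elements.
type Vector []Element

// String renders a vector as space-separated key:value pairs.
func (v Vector) String() string {
	parts := make([]string, len(v))
	for i, e := range v {
		parts[i] = e.Key + ":" + e.Val
	}
	return strings.Join(parts, " ")
}

// SortMatrix sorts the elements inside every vector by key, then value,
// and then sorts the vectors by their rendered form. It sorts in place.
func SortMatrix(m []Vector) {
	for _, v := range m {
		sort.Slice(v, func(i, j int) bool {
			if v[i].Key != v[j].Key {
				return v[i].Key < v[j].Key
			}
			return v[i].Val < v[j].Val
		})
	}
	sort.Slice(m, func(i, j int) bool { return m[i].String() < m[j].String() })
}

// Render sorts the matrix and returns one vector per line.
func Render(m []Vector) string {
	SortMatrix(m)
	lines := make([]string, len(m))
	for i, v := range m {
		lines[i] = v.String()
	}
	return strings.Join(lines, "\n")
}

func main() {
	a := []Vector{
		{{"backend", "raft"}, {"arch", "arm64"}},
		{{"arch", "amd64"}, {"backend", "consul"}},
	}
	b := []Vector{
		{{"backend", "consul"}, {"arch", "amd64"}},
		{{"arch", "arm64"}, {"backend", "raft"}},
	}
	// Both inputs render identically once sorted.
	fmt.Println(Render(a) == Render(b))
	fmt.Println(Render(a))
}
```

Anything derived from the matrix ordering would differ from a value computed on the unsorted matrix, which is consistent with the ordering change noted in the list above.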

Changes:

* Create a scenario decoder that implements an iterator pattern
* Utilize the iterator pattern in various endpoints and tests
* Fix a race in the warnings as errors check acceptance test
* Bump version
* Pin to the latest actions to resolve deprecation warnings

Checklist

The commit message includes an explanation of the changes
Manual validation of the changes has been performed (if possible)
New or modified code has requisite test coverage (if possible)
I have performed a self-review of the changes
I have made necessary changes and/or pull requests for documentation
I have written useful comments in the code

ryancragun · 2024-06-20T21:41:48Z

acceptance/scenario_sample_observe_test.go

@@ -112,12 +112,12 @@ func TestAcc_Cmd_Scenario_Sample_Observe(t *testing.T) {
 							Scenario: &pb.Ref_Scenario{
 								Id: &pb.Scenario_ID{
 									Name:   "smoke",
-									Uid:    "21610357b0083126896a0072a429e677fd6381fa98e94b15ac2101f697100b42",


The UID changes because we now sort our matrix when we decode and filter it; previously we did not.

ryancragun · 2024-06-20T21:43:16Z

internal/command/enos/cmd/scenario_list.go

@@ -61,7 +61,6 @@ func runScenarioListCmd(cmd *cobra.Command, args []string) error {
 		switch val := msg.GetResponse().(type) {
 		case *pb.EnosServiceListScenariosResponse_Decode:
 			res.Decode = val.Decode
-			res.Diagnostics = val.Decode.GetDiagnostics()


This caused the diagnostics to be reported twice, which we don't want.

ryancragun · 2024-06-20T21:44:32Z

internal/flightplan/decoder_test.go

-	require.Len(t, expected.Modules, len(fp.Modules))
-	require.Len(t, expected.ScenarioBlocks, len(fp.ScenarioBlocks))
-	require.Len(t, expected.Providers, len(fp.Providers))
+	require.Len(t, fp.Modules, len(expected.Modules))


The arguments to `require.Len` were in the wrong order (it takes the object first, then the expected length). I discovered this while testing my in-flight changes, so I fixed them here.

ryancragun · 2024-06-20T21:44:43Z

internal/flightplan/decoder_test.go

@@ -271,115 +387,3 @@ func testMostlyEqualStepVar(t *testing.T, expected cty.Value, got cty.Value) {
 		require.EqualValues(t, eAttr.Name, aAttr.Name)
 	}
 }
-


I moved this up in the file

ryancragun · 2024-06-20T21:45:47Z

internal/flightplan/matrix_decoder.go

@@ -16,7 +16,8 @@ import (

 type matrixDecoder struct{}

-type DecodedMatrices struct {
+// MatrixBlock represent a full "matrix" block at various stages.
+type MatrixBlock struct {


I renamed DecodedMatrices to MatrixBlock and DecodedScenarioBlock to ScenarioBlock because the new names are shorter, more consistent, and clearer.

ryancragun · 2024-06-20T21:54:21Z

internal/flightplan/scenario_decoder.go

 	}

 	return d, nil
 }

-// DecodedScenarioBlock is a decoded scenario block.
-type DecodedScenarioBlock struct {


This became ScenarioBlock and the implementation changed.

ryancragun added the changelog/feat New feature or enhancement. Will be included in "New Features" category in release notes. label Jun 20, 2024

ryancragun requested a review from a team as a code owner June 20, 2024 21:15

ryancragun force-pushed the vault-28313 branch 3 times, most recently from a0d7988 to 4c58961 Compare June 20, 2024 21:35

ryancragun force-pushed the vault-28313 branch from 4c58961 to 5b8ea24 Compare June 20, 2024 22:16

ryancragun merged commit a4eff3d into main Jun 24, 2024
5 checks passed

ryancragun deleted the vault-28313 branch June 24, 2024 15:36

ryancragun mentioned this pull request Jun 24, 2024

VAULT-28313: Consider available memory when validating samples #149

Merged

6 tasks
