pageserver: fix image layer skip while maintaining the one visit per layer property #9025

Draft
wants to merge 4 commits into
base: main
from


VladLazar
Contributor

@VladLazar VladLazar commented Sep 17, 2024

Problem

Different keyspaces may require different floor LSNs in vectored
delta layer visits. This patch adds support for such cases.

Summary of Changes

  • Rework layer visit collection. Each layer type now has its own
    visit type that is aware of that layer type's requirements. For
    delta layers, we track the floor LSN of each keyspace and merge
    keyspaces only when their floor LSNs match. Other layer types have
    no such requirement, so they merge everything.
  • Thread the new visit types into the get_values_reconstruct_data
    calls. For delta layers, the code was adapted such that it may
    merge reads across keyspaces with different LSN floor requirements.
  • Tweak the fringe update code in get_vectored_reconstruct_data_timeline
    to handle different cont LSNs for different keyspaces. In practice,
    we will only update the fringe from one keyspace, since this "keyspace
    split" only happens when an image layer overlaps a delta layer (and
    image layers always complete all their keys).
  • Update tests to use the new interfaces.
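
The per-layer-type visit idea from the first bullet can be illustrated with a minimal sketch. Everything here is hypothetical: `Key` and `Lsn` are plain integers, and `DeltaLayerVisit`, `ImageLayerVisit`, and `merge_ranges` are illustrative names, not the types introduced by this patch. The sketch only shows the merging rule: delta visits keep key ranges apart unless they share a floor LSN, while other visits merge everything.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

// Hypothetical stand-ins; the real pageserver key and LSN types are richer.
type Key = u64;
type Lsn = u64;

/// Sketch of a delta layer visit: key ranges are grouped by the floor LSN
/// they require and are merged only within the same floor.
#[derive(Debug, Default)]
pub struct DeltaLayerVisit {
    by_floor: BTreeMap<Lsn, Vec<Range<Key>>>,
}

impl DeltaLayerVisit {
    /// Record that `keys` must be read from this layer down to `lsn_floor`.
    pub fn add(&mut self, keys: Range<Key>, lsn_floor: Lsn) {
        let ranges = self.by_floor.entry(lsn_floor).or_default();
        ranges.push(keys);
        *ranges = merge_ranges(std::mem::take(ranges));
    }

    /// One `(floor LSN, merged key ranges)` pair per distinct floor,
    /// ordered by floor LSN.
    pub fn keyspaces(&self) -> Vec<(Lsn, Vec<Range<Key>>)> {
        self.by_floor
            .iter()
            .map(|(floor, ranges)| (*floor, ranges.clone()))
            .collect()
    }
}

/// Sketch of a visit for a layer type with no floor requirement:
/// all requested key ranges are merged together.
#[derive(Debug, Default)]
pub struct ImageLayerVisit {
    ranges: Vec<Range<Key>>,
}

impl ImageLayerVisit {
    pub fn add(&mut self, keys: Range<Key>) {
        self.ranges.push(keys);
        self.ranges = merge_ranges(std::mem::take(&mut self.ranges));
    }

    pub fn ranges(&self) -> &[Range<Key>] {
        &self.ranges
    }
}

/// Sort ranges by start and coalesce any that overlap or touch.
fn merge_ranges(mut ranges: Vec<Range<Key>>) -> Vec<Range<Key>> {
    ranges.sort_by_key(|r| r.start);
    let mut merged: Vec<Range<Key>> = Vec::new();
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            if r.start <= last.end {
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}
```

In this sketch, adding `0..10` and `10..20` with the same floor yields a single range `0..20`, while a range added with a different floor stays in its own group even though the layer is visited only once.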

Fixes #9012

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@VladLazar VladLazar changed the title from "Vlad/fix vec read img layer skip take 2" to "pageserver: fix image layer skip while maintaining the one visit per layer property" Sep 17, 2024

github-actions bot commented Sep 17, 2024

4968 tests run: 4800 passed, 4 failed, 164 skipped (full report)


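Failures on Postgres 17

  • test_replica_query_race[release-pg17]
  • test_hot_standby_feedback[debug-pg17]
  • test_neon_cli_basics[release-pg17]

Failures on Postgres 14

  • test_scrubber_physical_gc_timeline_deletion[release-pg14]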
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_scrubber_physical_gc_timeline_deletion[release-pg14] or test_replica_query_race[release-pg17] or test_hot_standby_feedback[debug-pg17] or test_neon_cli_basics[release-pg17]"
Flaky tests (12) on Postgres 17, Postgres 16, and Postgres 14

Test coverage report is not available

The comment gets automatically updated with the latest test results
84d1af7 at 2024-09-19T10:45:16.312Z :recycle:

@VladLazar VladLazar force-pushed the vlad/fix-vec-read-img-layer-skip-take-2 branch from d6e102d to f6780a5 Compare September 18, 2024 12:35
`ValuesReconstructState::consume_done_keys` may only be called once
after a layer visit. The code in the previous commit called it once per
keyspace, so when a keyspace was split by floor LSN, some keys were
not marked done.
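
To illustrate why a once-per-visit call matters, here is a minimal sketch under a loud assumption: that consuming the done keys drains them, so a second call after the same visit returns nothing. `ReconstructState`, `mark_done`, and `remove_done` are hypothetical names for illustration and do not reflect the real `ValuesReconstructState` API.

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for the pageserver key type.
type Key = u64;

/// Sketch of reconstruct state that tracks keys completed during a layer visit.
#[derive(Debug, Default)]
pub struct ReconstructState {
    done: BTreeSet<Key>,
}

impl ReconstructState {
    pub fn mark_done(&mut self, key: Key) {
        self.done.insert(key);
    }

    /// ASSUMPTION: consuming drains the set, so any further call after the
    /// same layer visit returns an empty set.
    pub fn consume_done_keys(&mut self) -> BTreeSet<Key> {
        std::mem::take(&mut self.done)
    }
}

/// Consume the done keys exactly once, then apply the result to every
/// keyspace that took part in the visit.
pub fn remove_done(state: &mut ReconstructState, keyspaces: &mut [BTreeSet<Key>]) {
    let done = state.consume_done_keys();
    for keyspace in keyspaces.iter_mut() {
        keyspace.retain(|key| !done.contains(key));
    }
}
```

Under this assumption, calling `consume_done_keys` inside the per-keyspace loop would hand all done keys to the first keyspace and leave later keyspaces with keys that still look pending.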
@VladLazar VladLazar force-pushed the vlad/fix-vec-read-img-layer-skip-take-2 branch from f6780a5 to 84d1af7 Compare September 19, 2024 09:44
VladLazar added a commit that referenced this pull request Sep 19, 2024
## Problem

Different keyspaces may require different floor LSNs in vectored
delta layer visits. This patch adds support for such cases.

## Summary of changes

Different keyspaces wishing to read the same layer might
require different stop LSNs (or LSN floors). The start LSN
of the read (or the LSN ceil) will always be the same.

With this observation, we fix skipping of image layers by
indexing the fringe by layer id plus LSN floor.

This is very simple, but means that we can visit delta layers twice
in certain cases. Still, I think it's very unlikely for any extra
merging to have taken place in this case, so perhaps it makes sense to go
with the simpler patch.
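
A minimal sketch of the indexing idea, with hypothetical names throughout (`Fringe`, `update`, and `visit_count` are illustrative, and layer ids are plain strings here): keying planned visits by `(layer id, LSN floor)` means a layer requested with two different floors ends up with two entries, which is the double visit mentioned above.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

// Hypothetical stand-ins for the pageserver types.
type Key = u64;
type Lsn = u64;
type LayerId = String;

/// Sketch of a fringe whose planned visits are indexed by layer id plus LSN floor.
#[derive(Debug, Default)]
pub struct Fringe {
    planned: BTreeMap<(LayerId, Lsn), Vec<Range<Key>>>,
}

impl Fringe {
    /// Plan a read of `keys` from `layer` down to `lsn_floor`.
    pub fn update(&mut self, layer: &str, lsn_floor: Lsn, keys: Range<Key>) {
        self.planned
            .entry((layer.to_string(), lsn_floor))
            .or_default()
            .push(keys);
    }

    /// Number of separate visits planned for `layer` (one per distinct floor).
    pub fn visit_count(&self, layer: &str) -> usize {
        self.planned
            .keys()
            .filter(|(id, _)| id.as_str() == layer)
            .count()
    }
}
```

Requests that share both the layer and the floor collapse into one entry, so the extra visit only appears when the floors actually differ.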

Fixes #9012
Alternative to #9025
Development

Successfully merging this pull request may close these issues.

Vectored get image layer skipping
1 participant