storcon: update compute hook state on detach #9045

Open · wants to merge 4 commits into main
61 changes: 61 additions & 0 deletions storage_controller/src/compute_hook.rs
@@ -71,6 +71,37 @@ impl ComputeHookTenant {
}
}

fn is_sharded(&self) -> bool {
matches!(self, ComputeHookTenant::Sharded(_))
}

/// Clear compute hook state for the specified shard.
/// Only valid for [`ComputeHookTenant::Sharded`] instances.
fn remove_shard(&mut self, tenant_shard_id: TenantShardId, stripe_size: ShardStripeSize) {
match self {
ComputeHookTenant::Sharded(sharded) => {
if sharded.stripe_size != stripe_size
|| sharded.shard_count != tenant_shard_id.shard_count
{
tracing::warn!("Shard split detected while handling detach")
Member: So I guess in this case we never complete...? Though, I'm quite unsure what completion means here, referring to the previous thread.

Contributor Author: Yes. In this case we do not update the in-memory compute hook state because we have raced with a shard split. Shard splits reset the compute hook state, but with the new shard count.
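For illustration, a minimal sketch of the race described above, using simplified stand-in types rather than the storcon ones:

```rust
// Sketch only (simplified types, not the storcon implementation): a shard
// split rebuilds the sharded state under the new shard count, so a detach
// that still carries the old count/stripe size is recognisably stale and
// must not touch the rebuilt state.
#[derive(Clone, Copy, PartialEq)]
struct ShardCount(u8);
#[derive(Clone, Copy, PartialEq)]
struct StripeSize(u32);

struct ShardedState {
    shard_count: ShardCount,
    stripe_size: StripeSize,
    shards: Vec<(u8, u64)>, // (shard number, node id)
}

impl ShardedState {
    /// What a split conceptually does to the hook state: start over with the
    /// post-split geometry; per-shard entries repopulate as shards attach.
    fn reset_for_split(&mut self, count: ShardCount, stripe: StripeSize) {
        self.shard_count = count;
        self.stripe_size = stripe;
        self.shards.clear();
    }

    /// The guard from `remove_shard` above: any mismatch means the detach's
    /// view of the tenant predates a split.
    fn detach_is_stale(&self, count: ShardCount, stripe: StripeSize) -> bool {
        self.shard_count != count || self.stripe_size != stripe
    }
}
```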

}

let shard_idx = sharded.shards.iter().position(|(shard_number, _node_id)| {
*shard_number == tenant_shard_id.shard_number
});

if let Some(shard_idx) = shard_idx {
sharded.shards.remove(shard_idx);
Member: Should this also invalidate the last sent notification?

Contributor Author: It's an open question. This PR is intended to fix a specific case (a sketch of the resulting stale payload follows this thread):

  1. Tenant is detached manually by setting its scheduling policy
  2. Tenant is re-attached manually by setting its scheduling policy
  3. A stale compute notification is sent for the tenant because we did not clear the compute hook state in (1)
  4. Compute gets stuck because it can't connect to the specified pageservers

The question of how to handle operator-driven detaches needs some design work:

  • Should we notify the control plane about detaches?
  • How do we ensure that a stale notification is never sent out? Some cross-reconcile coordination would be required.
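For concreteness, a hedged sketch of what the stale notification in step (3) would carry; the field and type names are illustrative assumptions, not the real request type, which this diff does not show:

```rust
// Illustrative only. After the detach in step (1), the retained hook state
// still maps each shard to its old pageserver, so a notification emitted at
// step (3) points the compute at node ids that no longer serve the tenant,
// producing the hang in step (4).
struct NotifyShard {
    shard_number: u8,
    node_id: u64, // stale: this pageserver detached the shard in step (1)
}

struct NotifyRequest {
    tenant_id: String,
    stripe_size: Option<u32>,
    shards: Vec<NotifyShard>,
}
```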

} else {
tracing::warn!("Shard not found while handling detach")
Member: Unsure how actionable this is, but let's include the shard number:

Suggested change
- tracing::warn!("Shard not found while handling detach")
+ tracing::warn!(
+     shard_number=%tenant_shard_id.shard_number,
+     "Shard not found while handling detach"
+ );

Contributor Author: It's already included as the shard id. I can add it, but it feels a bit redundant.

Member: As in, included in some span? If so, this sounds like an opportunity to assert that the span has shard_number.

Contributor Author: The span doesn't have the shard number, but the shard number can be inferred from the shard_id, which is present in the span. I can assert that it's present if you like.
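One lighter alternative, grounded in this diff's own instrumentation (see the `#[tracing::instrument]` attribute on `handle_detach` below), would be to fold the suggested field into the span itself so every event in the function carries it. A sketch; the import path is an assumption:

```rust
use pageserver_api::shard::TenantShardId; // assumed import path

// Sketch: record shard_number as an explicit span field next to shard_id,
// so no per-callsite fields are needed on individual warnings.
#[tracing::instrument(skip_all, fields(
    tenant_id = %tenant_shard_id.tenant_id,
    shard_id = %tenant_shard_id.shard_slug(),
    shard_number = %tenant_shard_id.shard_number,
))]
fn handle_detach_sketch(tenant_shard_id: TenantShardId) {
    tracing::warn!("Shard not found while handling detach");
}
```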

}
}
ComputeHookTenant::Unsharded(_) => {
unreachable!("Detach of unsharded tenants is handled externally");
}
}
}

/// Set one shard's location. If stripe size or shard count have changed, Self is reset
/// and drops existing content.
fn update(
@@ -614,6 +645,36 @@ impl ComputeHook {
self.notify_execute(maybe_send_result, tenant_shard_id, cancel)
.await
}

/// Reflect a detach for a particular shard in the compute hook state.
///
/// The goal is to avoid sending compute notifications with stale information (i.e.
/// including detached pageservers).
#[tracing::instrument(skip_all, fields(tenant_id=%tenant_shard_id.tenant_id, shard_id=%tenant_shard_id.shard_slug()))]
pub(super) fn handle_detach(
&self,
tenant_shard_id: TenantShardId,
stripe_size: ShardStripeSize,
) {
use std::collections::hash_map::Entry;

let mut state_locked = self.state.lock().unwrap();
match state_locked.entry(tenant_shard_id.tenant_id) {
Entry::Vacant(_) => {
tracing::warn!("Compute hook tenant not found for detach");
Member: Is this warning actionable in some way, or just unexpected?

Contributor Author: I thought it wouldn't happen under normal operating conditions, but the tests think otherwise. Looking into it.

Contributor Author: Okay, fixed it. To answer your original question: it's unexpected.
}
Entry::Occupied(mut e) => {
let sharded = e.get().is_sharded();
if !sharded {
e.remove();
} else {
e.get_mut().remove_shard(tenant_shard_id, stripe_size);
}

tracing::debug!("Compute hook handled shard detach");
}
}
}
}

#[cfg(test)]
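The PR's test module is collapsed above. As a self-contained illustration of the two `handle_detach` paths, here is a mimic with simplified types (not the storcon ones): an unsharded tenant's entry is dropped wholesale, while a sharded tenant only loses the detached shard.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

enum HookTenant {
    Unsharded(u64),          // node id
    Sharded(Vec<(u8, u64)>), // (shard number, node id)
}

fn handle_detach(state: &mut HashMap<u32, HookTenant>, tenant: u32, shard: u8) {
    match state.entry(tenant) {
        Entry::Vacant(_) => eprintln!("warn: compute hook tenant not found for detach"),
        Entry::Occupied(mut e) => {
            let is_sharded = matches!(e.get(), HookTenant::Sharded(_));
            if !is_sharded {
                // Unsharded: the whole tenant entry is the shard's state.
                e.remove();
            } else if let HookTenant::Sharded(shards) = e.get_mut() {
                // Sharded: drop only the detached shard's entry.
                shards.retain(|(n, _)| *n != shard);
            }
        }
    }
}

fn main() {
    let mut state = HashMap::new();
    state.insert(1, HookTenant::Unsharded(7));
    state.insert(2, HookTenant::Sharded(vec![(0, 7), (1, 8)]));

    handle_detach(&mut state, 1, 0);
    assert!(!state.contains_key(&1)); // entry removed wholesale

    handle_detach(&mut state, 2, 1);
    match &state[&2] {
        HookTenant::Sharded(shards) => {
            assert_eq!(shards.len(), 1);
            assert_eq!(shards[0], (0, 7)); // only the detached shard is gone
        }
        _ => unreachable!(),
    }
}
```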
10 changes: 10 additions & 0 deletions storage_controller/src/reconciler.rs
@@ -820,6 +820,16 @@ impl Reconciler {
self.location_config(&node, conf, None, false).await?;
}

// The condition below identifies a detach. We must have no attached intent and
// must have been attached to something previously. Pass this information to
// the [`ComputeHook`] such that it can update its tenant-wide state.
if self.intent.attached.is_none() && !self.detach.is_empty() {
// TODO: Consider notifying the control plane about detaches. This would avoid situations
// where the compute tries to start up with a stale set of pageservers.
self.compute_hook
.handle_detach(self.tenant_shard_id, self.shard.stripe_size);
}

failpoint_support::sleep_millis_async!("sleep-on-reconcile-epilogue");

Ok(())