Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create table from source: source limits are not applied #29556

Open
Tracked by #28430
nrainer-materialize opened this issue Sep 16, 2024 · 1 comment
Open
Tracked by #28430

create table from source: source limits are not applied #29556

nrainer-materialize opened this issue Sep 16, 2024 · 1 comment
Assignees
Labels
C-bug Category: something is broken

Comments

@nrainer-materialize
Copy link
Contributor

What version of Materialize are you using?

e8e6737

What is the issue?

Given

$ postgres-execute connection=mz_system
ALTER SYSTEM SET max_sources = 2
ALTER SYSTEM SET max_tables = 2

> CREATE SOURCE mz_source FROM POSTGRES CONNECTION pg (PUBLICATION 'mz_source');
> CREATE SOURCE mz_source2 FROM POSTGRES CONNECTION pg (PUBLICATION 'mz_source');

the statement

> CREATE SOURCE mz_source3 FROM POSTGRES CONNECTION pg (PUBLICATION 'mz_source');

should fail with "creating source would violate max_sources limit (desired: 3, limit: 2, current: 0)" but does not.

@rjobanp
Copy link
Contributor

rjobanp commented Sep 17, 2024

@nrainer-materialize I don't know that this is an issue, or at least it's working as desired based on the current implementation:

The limit on sources actually appears to be a limit on the number of 'persist-shards' in-use by a source:

let current_sources: usize = self
.catalog()
.user_sources()
.filter_map(|source| source.source())
.map(|source| source.user_controllable_persist_shard_count())
.sum::<i64>()
.try_into()
.expect("non-negative sum of sources");
self.validate_resource_limit(
current_sources,
new_sources,
SystemVars::max_sources,
"source",
MAX_SOURCES.name(),
)?;

The limitation uses this method to determine how many 'persist shards' are actually needed by a given source:

/// The expensive resource that each source consumes is persist shards. To
/// prevent abuse, we want to prevent users from creating sources that use an
/// unbounded number of persist shards. But we also don't want to count
/// persist shards that are mandated by teh system (e.g., the progress
/// shard) so that future versions of Materialize can introduce additional
/// per-source shards (e.g., a per-source status shard) without impacting
/// the limit calculation.
pub fn user_controllable_persist_shard_count(&self) -> i64 {
match &self.data_source {
DataSourceDesc::Ingestion { ingestion_desc, .. } => {
match &ingestion_desc.desc.connection {
// These multi-output sources do not use their primary
// source's data shard, so we don't include it in accounting
// for users.
GenericSourceConnection::Postgres(_) | GenericSourceConnection::MySql(_) => 0,
GenericSourceConnection::LoadGenerator(lg) => {
// TODO: make this a method on the load generator.
if lg.load_generator.views().is_empty() {
// Load generator writes directly to its persist shard
1
} else {
// Load generator has 1 persist shard per output,
// which will be accounted for by `SourceExport`.
0
}
}
GenericSourceConnection::Kafka(_) => 1,
}
}
// DataSourceDesc::IngestionExport represents a subsource, which
// use a data shard.
DataSourceDesc::IngestionExport { .. } => 1,
DataSourceDesc::Webhook { .. } => 1,
// Introspection and progress subsources are not under the user's control, so shouldn't
// count toward their quota.
DataSourceDesc::Introspection(_) | DataSourceDesc::Progress => 0,
}
}
}

Given that the CREATE SOURCE mz_source FROM POSTGRES CONNECTION pg (PUBLICATION 'mz_source'); statement does not create any subsources, there are no data shards being created for the statement, such that it doesn't affect the overall limit.

If we want to make this limit sources based on the number of source-statements, then we would have to change this logic, but since I did not implement or design the original behavior I am not sure that is something we want. @morsapaes do you have an opinion on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: something is broken
Projects
None yet
Development

No branches or pull requests

2 participants