Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: pybind11's initialization of internals needs to be made more thread safe in the presence of free threading #5316

Open
3 tasks done
EthanSteinberg opened this issue Aug 18, 2024 · 2 comments
Assignees
Labels

Comments

@EthanSteinberg
Copy link
Collaborator

EthanSteinberg commented Aug 18, 2024

Required prerequisites

What version (or hash if on master) of pybind11 are you using?

a1d0091

Problem description

PYBIND11_NOINLINE internals &get_internals() {
is not thread safe under free-threading.

internals is a global singleton struct shared across all pybind11 modules. get_internals() either retrieves that current global singleton or creates a new one.

The problem is double initialization, where two copies of internals would be created by multiple modules being imported/initalized in different threads.

if (object internals_obj = get_internals_obj_from_state_dict(state_dict)) {
internals_pp = get_internals_pp_from_capsule(internals_obj);
}
if (internals_pp && *internals_pp) {
// We loaded the internals through `state_dict`, which means that our `error_already_set`
are the key problematic lines where we retrieve the current global, and initialize it if necessary.

The problem is that if two threads hit those lines at the same time, both threads could see that internals is null and both threads could initialize internals, which would lead to two copies.

Currently this is protected via the GIL (https://github.com/pybind/pybind11/blob/master/include/pybind11/detail/internals.h#L505), but that is problematic with the new python free threading setup.

In the short term this is still OK because the GIL is re-enabled during module import, but that is not a good long term strategy.

Reproducible example code

No response

Is this a regression? Put the last known working version here if it is.

Not a regression

@EthanSteinberg
Copy link
Collaborator Author

PyDict_SetDefaultRef appears to be the key for making this actually safe. It enables a conditional set on a dictionary.

@rwgk
Copy link
Collaborator

rwgk commented Aug 20, 2024

Extra background (to the best of my knowledge, please correct if necessary):

I believe it would be good to work this into the PR description, and to explain the situations in which get_internals() might be called without being protected either by a mutex or the module import protections.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants