Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve UX for mocking dataclasses that are problematic to instantiate #124176

Open
ncoghlan opened this issue Sep 17, 2024 · 2 comments
Open

Improve UX for mocking dataclasses that are problematic to instantiate #124176

ncoghlan opened this issue Sep 17, 2024 · 2 comments
Assignees
Labels
topic-dataclasses type-feature A feature request or enhancement

Comments

@ncoghlan
Copy link
Contributor

ncoghlan commented Sep 17, 2024

Feature or enhancement

Proposal:

Passing instance=True to create_autospec misses fields without default values, even when the given spec is a dataclass object:

>>> from dataclasses import dataclass
>>> @dataclass
... class Example:
...     a: int
...     b: int = 0
...     def sum_fields(self):
...         return self.a + self.b
...
>>> from unittest.mock import create_autospec
>>> mock_instance = create_autospec(Example(1))
>>> mock_instance_from_class = create_autospec(Example, instance=True)
>>> set(dir(mock_instance)) - set(dir(mock_instance_from_class))
{'a'}

This is despite dataclass definitions making their field information readily available for introspection on the class object (without requiring instantiation):

>>> from dataclasses import fields
>>> [(field.name, field.type) for field in fields(Example)]
[('a', <class 'int'>), ('b', <class 'int'>)]

A similar problem occurs if the dataclass uses __post_init__ to set attributes that are not otherwise set:

>>> from dataclasses import dataclass, field
>>> @dataclass
... class PostInitExample:
...     a: int = field(init=False)
...     b: int = field(init=False)
...     def __post_init__(self):
...         self.a = self.b = 1
...
>>> mock_post_init_class = create_autospec(Example)
>>> mock_post_init_class = create_autospec(PostInitExample)
>>> mock_post_init_instance = create_autospec(PostInitExample())
>>> set(dir(mock_post_init_instance)) - set(dir(mock_post_init_class()))
{'b', 'a'}

While for most dataclasses it is straightforward to instantiate a specific instance and derive the mock autospec from that, this may not be desirable (or even feasible) if the dataclass requires references to real external resources to create a real instance.

There are various potential workarounds available for this functional gap, but they're all relatively clumsy, and come with their own problems (like not being able to use spec_set=True if the missing fields are added manually, or not handling defined methods properly if setting an explicit list of fields instead of using autospec, or still not adding the fields only defined in __post_init__ if instantiating a class mock).

By contrast, if create_autospec were to be made explicitly aware of data classes, it could do a pass over dataclasses.fields(spec) and use the type information to fill in any missing fields that don't have class level default values set.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

Previously filed here, but closed on the basis of create_autospec covering the use case: #80761

This is only true if the dataclass can be readily instantiated, hence this feature request.

There is also some previous discussion (and assorted attempted workarounds with various flaws) on this Stack Overflow question: https://stackoverflow.com/questions/51640505/how-to-use-spec-when-mocking-data-classes-in-python

@ncoghlan ncoghlan added the type-feature A feature request or enhancement label Sep 17, 2024
@sobolevn
Copy link
Member

Great idea! @ncoghlan do you want to work on this? Or can I take this over? :)

@ncoghlan
Copy link
Contributor Author

I already worked around it for my use case (leaving spec_set as False and adding in the missing attributes manually), so please feel free to take it on!

I think the basic cases where the field types are just a simple type will be straightforward (use create_autospec(..., instance=True) recursively), and the some_type|None/Optional[type] case can ignore the None branch and do the same. ClassVar and InitVar will already be filtered out by the fields(...) call.

Checking the result of dir(str|int) it might be OK to simply not do anything special for non-trival unions and other more complex type definitions like generic containers. The mock may end up with some additional methods it wouldn't otherwise have, but that's probably an acceptable limitation. For example:

>>> dir(str|int)
['__args__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__or__', '__parameters__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

>>> set(dir(tuple[str, int, str])) - set(dir(tuple))
{'__deepcopy__', '__unpacked__', '__typing_unpacked_tuple_args__', '__parameters__', '__copy__', '__args__', '__mro_entries__', '__origin__'}

@sobolevn sobolevn self-assigned this Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-dataclasses type-feature A feature request or enhancement
Projects
Status: Todo
Development

No branches or pull requests

3 participants