Different results between standard library json and orjson #225

Open
unratito opened this issue May 29, 2024 · 3 comments
Labels
bug Something isn't working

unratito commented May 29, 2024

  • mashumaro version: 3.13
  • Python version: 3.12.3
  • Operating System: Windows 11

Description

When using mixins to serialize data classes to JSON, standard library json and orjson give different results.

What I Did

This code uses standard library json:

from dataclasses import dataclass
from mashumaro.mixins.json import DataClassJSONMixin
import json

@dataclass
class A(DataClassJSONMixin):
    x: int

@dataclass
class B(A):
    y: str

@dataclass
class W(DataClassJSONMixin):
    inner: A

b = B(5, 'hi')
w = W(b)

print(json.dumps(w.to_dict()))
print(w.to_json())

And it prints these results:

{"inner": {"x": 5, "y": "hi"}}
{"inner": {"x": 5, "y": "hi"}}

This equivalent code uses orjson instead:

from dataclasses import dataclass
from mashumaro.mixins.orjson import DataClassORJSONMixin
import orjson

@dataclass
class A(DataClassORJSONMixin):
    x: int

@dataclass
class B(A):
    y: str

@dataclass
class W(DataClassORJSONMixin):
    inner: A

b = B(5, 'hi')
w = W(b)

print(orjson.dumps(w.to_dict()))
print(w.to_json())

And it prints these other results:

b'{"inner":{"x":5,"y":"hi"}}'
{"inner":{"x":5}}

Fatal1ty (Owner) commented

I over-optimized the serialization of dataclasses with orjson to the point that it caused unpleasant consequences I overlooked 😅. In short, when we build the serialization code for W, we also build the code that turns dataclass A into a dictionary of orjson-supported types, because A is the type annotated for inner. At runtime, when inner holds an instance of B, the method that gets called is still the one built for the parent class A, so the subclass field y is left out. I need to think more about what to do, since I don't yet see any simple solution other than removing this optimization, which I would rather not do.
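
The mechanism described here can be illustrated without mashumaro at all. The following is a minimal sketch, not mashumaro's actual generated code: `build_serializer` and `serialize_by_runtime_type` are hypothetical helpers that contrast a serializer fixed to the annotated type with one that inspects the runtime type of the value.

```python
from dataclasses import dataclass, fields


@dataclass
class A:
    x: int


@dataclass
class B(A):
    y: str


def build_serializer(declared_type):
    """Hypothetical helper: build a serializer once, from the annotation only."""
    # The field list is frozen at build time, based on the declared type.
    names = [f.name for f in fields(declared_type)]

    def serialize(obj):
        return {name: getattr(obj, name) for name in names}

    return serialize


def serialize_by_runtime_type(obj):
    """Hypothetical helper: look up fields from the actual instance instead."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}


serialize_as_a = build_serializer(A)  # what an annotation `inner: A` would select
b = B(5, "hi")

static_result = serialize_as_a(b)              # loses the subclass field `y`
dynamic_result = serialize_by_runtime_type(b)  # keeps every field of B

print(static_result)   # {'x': 5}
print(dynamic_result)  # {'x': 5, 'y': 'hi'}
```

The trade-off is the one mentioned in the comment: the prebuilt serializer avoids per-call field lookups, while the runtime lookup is correct for subclasses but does more work on every call.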

Fatal1ty (Owner) commented Jun 27, 2024

Here is another example of this issue:

from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class A[T]:
    x: T

@dataclass
class B[T](A):
    y: int

@dataclass
class C(DataClassDictMixin):
    z: A[int]

print(C(B(1, 2)).to_dict())  # {'z': {'x': 1}}
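
For comparison, the standard library's `dataclasses.asdict` reads the fields of each instance it encounters rather than the annotation, so it keeps the subclass field. This is only a sketch: it uses plain dataclasses with `typing.Generic` in place of the Python 3.12 type parameter syntax so that it also runs on older versions, and it says nothing about how mashumaro should resolve the issue.

```python
from dataclasses import asdict, dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass
class A(Generic[T]):
    x: T


@dataclass
class B(A[T]):
    y: int


@dataclass
class C:
    # Annotated with the parent type, but holds a B instance at runtime.
    z: A[int]


result = asdict(C(B(1, 2)))
print(result)  # {'z': {'x': 1, 'y': 2}}
```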

Fatal1ty (Owner) commented

It looks like mashumaro is not the only library with this behavior 🤔
