Conversation

@whyvineet whyvineet commented Jan 25, 2026

Simplify single-field dataclass ordering comparisons
#144191

@whyvineet whyvineet requested a review from ericvsmith as a code owner January 25, 2026 15:33
bedevere-app bot commented Jan 25, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Member

@johnslavik johnslavik left a comment

Thanks! You can wait for the green light from Eric before going forward or add tests for this even now. I'd probably change test_1_field_compare. And of course a news entry, too.
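
As a rough sketch of the kind of check such a test could make: the class name `C` and the helper `check_single_field_ordering` below are hypothetical and are not the existing `test_1_field_compare`. The assertions only rely on documented `order=True` behavior, so they should hold both before and after this change.

```python
from dataclasses import dataclass


@dataclass(order=True)
class C:
    # Hypothetical single-field class, used only for illustration.
    x: int


def check_single_field_ordering():
    # Ordering should follow the ordering of the single field.
    assert C(1) < C(2)
    assert C(1) <= C(1)
    assert C(2) > C(1)
    assert C(2) >= C(2)
    assert not C(2) < C(1)
    # Ordering against an unrelated type is unsupported and raises TypeError.
    try:
        C(1) < 1
    except TypeError:
        return True
    return False
```

In the real test suite these would more likely be `assertLess`/`assertRaises` calls inside the existing test class.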

@whyvineet
Author

Hey @picnixz and @johnslavik, just a quick update: I ran a small local benchmark on a CPython 3.15 dev build to ensure this change doesn't cause any noticeable slowdown. In my tests, the match/case version was slightly faster in the single-field scenario and comparable in the multi-field case. Since this code executes only during class creation, I didn’t notice any adverse effects.

I’m happy to follow your advice on this. If you’d rather test it yourselves or revert to the if/else approach for clarity, just let me know, and I’ll update the PR accordingly.

@whyvineet
Author

Just for reference, here are the numbers I observed locally (CPython 3.15 dev build):

  • Single-field case: match/case ~8–9% faster than if len(flds) == 1
  • Multiple-field case: no meaningful difference (well under 1%)

These were consistent across repeated runs.
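
For anyone wanting to reproduce a percentage like the ones above, a minimal sketch follows. The helpers `best_time` and `percent_faster` are hypothetical names and not part of the benchmark script; taking the minimum over `timeit.repeat` runs is one common way to reduce noise, not necessarily how the numbers above were obtained.

```python
import timeit


def best_time(func, number=100_000, repeat=5):
    # Run `func` `number` times per repeat and keep the fastest repeat;
    # the minimum is usually the least noisy figure for micro-benchmarks.
    return min(timeit.repeat(func, number=number, repeat=repeat))


def percent_faster(candidate, baseline):
    # Positive result: `candidate` took less time than `baseline`.
    return (baseline - candidate) / baseline * 100.0
```

With two callables to compare, `percent_faster(best_time(new_impl), best_time(old_impl))` gives the relative difference in percent, where `new_impl` and `old_impl` stand in for the two implementations.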

@picnixz
Member

picnixz commented Jan 29, 2026

Please share the benchmarking script and the way you ran it. Class creation matters when considering import time as well.

@whyvineet
Author

The script was run with the freshly built interpreter. Each implementation (match/case vs. if/else) was run back-to-back for a large number of iterations to reduce noise.

Script
import timeit
import sys
from dataclasses import dataclass

class MockField:
    def __init__(self, name):
        self.name = name

def _tuple_str(obj_name, flds):
    if not flds:
        return '()'
    return f'({",".join(f"{obj_name}.{f.name}" for f in flds)},)'

def match_impl(flds):
    match flds:
        case [single_fld]:
            self_expr = f"self.{single_fld.name}"
            other_expr = f"other.{single_fld.name}"
        case _:
            self_expr = _tuple_str("self", flds)
            other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def ifelse_impl(flds):
    if len(flds) == 1:
        self_expr = f"self.{flds[0].name}"
        other_expr = f"other.{flds[0].name}"
    else:
        self_expr = _tuple_str("self", flds)
        other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def run_benchmark():
    single_field = [MockField("value")]
    multi_field = [MockField("x"), MockField("y"), MockField("z")]
    iterations = 1_000_000

    # Just to make sure that I am using the local build
    print("Python:", sys.version)
    print("Executable:", sys.executable)
    print()

    print("Single-field case")
    print("match/case:", timeit.timeit(lambda: match_impl(single_field), number=iterations))
    print("if/else:", timeit.timeit(lambda: ifelse_impl(single_field), number=iterations))

    print("Multi-field case")
    print("match/case:", timeit.timeit(lambda: match_impl(multi_field), number=iterations))
    print("if/else:", timeit.timeit(lambda: ifelse_impl(multi_field), number=iterations))

    print("Real dataclass comparison (sanity check)")
    @dataclass(order=True)
    class A:
        x: int

    @dataclass(order=True)
    class B:
        x: int
        y: int
        z: int

    a1, a2 = A(1), A(2)
    b1, b2 = B(1, 2, 3), B(2, 3, 4)

    print("single-field:", timeit.timeit(lambda: a1 < a2, number=iterations))
    print("multi-field: ", timeit.timeit(lambda: b1 < b2, number=iterations))

if __name__ == "__main__":
    run_benchmark()
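
The script above times the string-building helper and the runtime comparison. Since the concern raised was class creation (and therefore import time), a complementary sketch could time the creation of `order=True` dataclasses directly. The helper names `create_ordered_class` and `time_class_creation` are assumptions for illustration; `make_dataclass` is used only so that a fresh class is built on every call.

```python
import timeit
from dataclasses import make_dataclass


def create_ordered_class(n_fields):
    # Build a fresh order=True dataclass on each call so that generating
    # __lt__, __le__, __gt__ and __ge__ is part of what gets timed.
    fields = [(f"f{i}", int) for i in range(n_fields)]
    return make_dataclass("Tmp", fields, order=True)


def time_class_creation(n_fields, number=200, repeat=5):
    # Seconds per class creation, using the best of several repeats.
    best = min(timeit.repeat(lambda: create_ordered_class(n_fields),
                             number=number, repeat=repeat))
    return best / number


if __name__ == "__main__":
    print("1 field :", time_class_creation(1))
    print("3 fields:", time_class_creation(3))
```

Running this once on a build with the change and once on a build without it would give a per-class creation cost to compare; the iteration counts here are arbitrary starting points.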

3 participants