feat: Optimize hash util for MapArray #20179

Open
jonathanc-n wants to merge 2 commits into apache:main from jonathanc-n:optimize-hash

@jonathanc-n
Contributor

Which issue does this PR close?

Rationale for this change

Reduce the amount of irrelevant data hashed for MapArray, so that only the values that are actually needed get hashed.
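To make the rationale concrete, here is a minimal sketch of the general idea, not the code in this PR. It assumes a simplified, hypothetical stand-in for a map-like array (plain `offsets` and `entries` slices instead of Arrow types) and a made-up row-combining rule. When the array is a slice of a larger one, only the window of entries between the first and last offset is hashed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a single value with the standard library's default hasher.
fn hash_one(v: i64) -> u64 {
    let mut h = DefaultHasher::new();
    v.hash(&mut h);
    h.finish()
}

/// Hypothetical stand-in for a map-like array: row `i` owns
/// `entries[offsets[i]..offsets[i + 1]]`.
/// Returns one hash per row plus the number of entries that were hashed.
fn hash_rows(offsets: &[usize], entries: &[i64]) -> (Vec<u64>, usize) {
    if offsets.len() < 2 {
        return (Vec::new(), 0);
    }
    let first = offsets[0];
    let last = offsets[offsets.len() - 1];

    // Hash only the entries the offsets actually reference,
    // not the whole child buffer.
    let entry_hashes: Vec<u64> = entries[first..last].iter().map(|v| hash_one(*v)).collect();

    // Combine entry hashes per row (illustrative combining rule only).
    let row_hashes = offsets
        .windows(2)
        .map(|w| {
            entry_hashes[w[0] - first..w[1] - first]
                .iter()
                .fold(0u64, |acc, h| acc.wrapping_mul(31).wrapping_add(*h))
        })
        .collect();

    (row_hashes, entry_hashes.len())
}
```

Calling `hash_rows(&[2, 3, 5], &entries)` on a six-entry buffer hashes three entries rather than six, and yields the same row hashes as the corresponding rows of the unsliced array.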

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the common Related to common crate label Feb 6, 2026
@jonathanc-n
Contributor Author

@Jefffrey I'm not sure whether to also do this optimization for UnionArray.

I was thinking of doing a take() operation on the children arrays using the offsets, but I don't know if it's worth the allocation overhead. What exact strategy were you thinking of for UnionArray?
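For illustration, the take()-style idea could look roughly like the sketch below. It assumes a simplified, hypothetical dense-union layout (plain vectors instead of Arrow arrays), where row `i` holds `children[type_ids[i]][offsets[i]]`. It is not the Arrow `take` kernel or the DataFusion hashing code. The `taken` vector is the extra allocation in question.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a single value with the standard library's default hasher.
fn hash_one(v: i64) -> u64 {
    let mut h = DefaultHasher::new();
    v.hash(&mut h);
    h.finish()
}

/// Hypothetical dense-union layout: row `i` holds
/// `children[type_ids[i]][offsets[i]]`.
fn hash_union_rows(type_ids: &[usize], offsets: &[usize], children: &[Vec<i64>]) -> Vec<u64> {
    let mut out = vec![0u64; type_ids.len()];
    for (child_idx, child) in children.iter().enumerate() {
        // Rows that select this child.
        let rows: Vec<usize> = (0..type_ids.len())
            .filter(|&r| type_ids[r] == child_idx)
            .collect();

        // "take": gather only the child values those rows reference.
        // This intermediate buffer is the allocation overhead.
        let taken: Vec<i64> = rows.iter().map(|&r| child[offsets[r]]).collect();

        // Hash the gathered values and scatter the results back to their rows.
        for (&r, v) in rows.iter().zip(&taken) {
            out[r] = hash_one(*v);
        }
    }
    out
}
```

In this sketch, child values that no row references are never hashed. The cost is one gathered buffer per child, which is the trade-off raised above.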

Member

@xudong963 xudong963 left a comment

Do we have a micro benchmark for the PR?

Or do you need me to enable the general benchmark in datafusion?

Labels

common Related to common crate

Development

Successfully merging this pull request may close these issues.

Optimize hash util map/union functions to hash only needed values

2 participants