Add Google Highway SIMD acceleration for ImageBufAlgo operations #4986
+1,388
−1,869
perf(IBA): Add Google Highway SIMD fast paths for core operations
Implement SIMD acceleration using Google Highway for ImageBufAlgo add, sub,
mul, pow, and resample operations. Provides 2-8x performance improvement for
contiguous pixel layouts with portable vectorization across x86 and ARM.
New file imagebufalgo_hwy_pvt.h provides reusable SIMD infrastructure with
automatic type promotion/demotion (uint8/uint16/int16/uint32/half/float/double),
generic operation kernels, and smart fallback to scalar code for strided layouts.
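The promotion/demotion and strided-fallback behavior described above can be sketched in plain C++. This is only an illustration under stated assumptions, not code from the PR: `promote_u8`, `demote_u8`, and `add_row_u8` are hypothetical names, the normalization to [0, 1] is an assumed convention, and the tight loop in the contiguous branch stands in for a Highway kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: promote a uint8 sample to a normalized float.
inline float promote_u8(std::uint8_t v)
{
    return static_cast<float>(v) * (1.0f / 255.0f);
}

// Hypothetical helper: demote a float back to uint8 with clamping and
// round-to-nearest.
inline std::uint8_t demote_u8(float f)
{
    const float clamped = std::min(1.0f, std::max(0.0f, f));
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
}

// Add n samples of a and b into r. `stride` is the distance, in elements,
// between consecutive samples in all three buffers.
// Returns true when the contiguous path was taken.
bool add_row_u8(const std::uint8_t* a, const std::uint8_t* b,
                std::uint8_t* r, std::size_t n, std::size_t stride)
{
    assert(stride >= 1);
    if (stride == 1) {
        // Contiguous layout: a tight loop over adjacent samples. In a SIMD
        // build, this is the branch a vector kernel would replace.
        for (std::size_t i = 0; i < n; ++i)
            r[i] = demote_u8(promote_u8(a[i]) + promote_u8(b[i]));
        return true;
    }
    // Strided layout: scalar fallback, one sample at a time.
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t k = i * stride;
        r[k] = demote_u8(promote_u8(a[k]) + promote_u8(b[k]));
    }
    return false;
}
```

Both branches compute the same result; the only design point shown is that the layout check happens once per row, so the fast path carries no per-sample branching.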
Operations size their vectors through Highway's ScalableTag rather than a
hard-coded width, use FMA instructions where applicable, and handle partial
vectors correctly. The code follows OIIO style, with modern C++ casts and
comprehensive documentation.
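The "full vectors plus a partial tail" structure mentioned above can be sketched without Highway. This is a scalar stand-in under stated assumptions, not the PR's kernel: `kLanes` is a hypothetical fixed lane count (in Highway the count would come from `Lanes(d)` for the chosen tag), `mad_row` is a hypothetical name, and `std::fma` plays the role of a vector multiply-add such as Highway's `MulAdd`.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical lane count standing in for the target's vector width.
constexpr std::size_t kLanes = 8;

// Compute r[i] = a[i] * b[i] + c[i] for n floats.
// Returns the number of full kLanes-wide chunks that were processed.
std::size_t mad_row(const float* a, const float* b, const float* c,
                    float* r, std::size_t n)
{
    std::size_t i = 0;
    std::size_t full_chunks = 0;

    // Main loop: whole chunks of kLanes elements. Each inner iteration
    // corresponds to one lane of a single vector multiply-add.
    for (; i + kLanes <= n; i += kLanes, ++full_chunks) {
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            r[i + lane] = std::fma(a[i + lane], b[i + lane], c[i + lane]);
    }

    // Tail: the remaining n % kLanes elements, which a SIMD kernel would
    // cover with a masked or partial load/store instead of reading past
    // the end of the row.
    for (; i < n; ++i)
        r[i] = std::fma(a[i], b[i], c[i]);

    return full_chunks;
}
```

For a row of 19 floats with `kLanes` equal to 8, the main loop covers the first 16 elements in two chunks and the tail loop handles the last three.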
Requires: Google Highway library (dual-licensed Apache 2.0 / BSD 3-Clause;
dynamic dispatch links the compiled hwy library, so it is not purely header-only)
Modified: imagebufalgo_{addsub,muldiv,pixelmath,xform}.cpp + new imagebufalgo_hwy_pvt.h
Checklist:
behavior.
testsuite.
PR, by pushing the changes to my fork and seeing that the automated CI
passed there. (Exceptions: If most tests pass and you can't figure out why
the remaining ones fail, it's ok to submit the PR and ask for help. Or if
any failures seem entirely unrelated to your change; sometimes things break
on the GitHub runners.)
fixed any problems reported by the clang-format CI test.
corresponding Python bindings. If altering ImageBufAlgo functions, I also
exposed the new functionality as oiiotool options.