@konard konard commented Sep 14, 2025

Summary

This PR implements System.Runtime.Intrinsics support for BitString operations as requested in issue #59, providing hardware-accelerated SIMD operations for better performance.

Changes

Core Implementation

  • Added intrinsics-based methods: IntrinsicsNot(), IntrinsicsAnd(), IntrinsicsOr(), IntrinsicsXor() and their parallel variants
  • AVX2 and SSE2 support: Automatically detects and uses AVX2 when available, falls back to SSE2, then to regular operations
  • Unsafe optimizations: Uses unsafe fixed pointers for direct memory access with SIMD instructions
  • Smart fallbacks: Gracefully handles systems without intrinsics support
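The detection and fallback order described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the class `IntrinsicsAndSketch` and its `And` method are hypothetical names, the sketch works on raw `long[]` arrays rather than on `BitString`, and it assumes the project is compiled with `AllowUnsafeBlocks` enabled.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper for illustration only; not part of the BitString API.
public static class IntrinsicsAndSketch
{
    // ANDs `source` into `target` in place, picking the widest supported path.
    public static unsafe void And(long[] target, long[] source)
    {
        if (target.Length != source.Length)
        {
            throw new ArgumentException("Arrays must have the same length.");
        }
        var length = target.Length;
        var i = 0;
        fixed (long* t = target, s = source)
        {
            if (Avx2.IsSupported)
            {
                // 256-bit path: four 64-bit words per iteration.
                for (; i <= length - 4; i += 4)
                {
                    var result = Avx2.And(Avx.LoadVector256(t + i), Avx.LoadVector256(s + i));
                    Avx.Store(t + i, result);
                }
            }
            else if (Sse2.IsSupported)
            {
                // 128-bit path: two 64-bit words per iteration.
                for (; i <= length - 2; i += 2)
                {
                    var result = Sse2.And(Sse2.LoadVector128(t + i), Sse2.LoadVector128(s + i));
                    Sse2.Store(t + i, result);
                }
            }
            // Scalar path: handles the remainder, or the whole array
            // when neither instruction set is available.
            for (; i < length; i++)
            {
                t[i] &= s[i];
            }
        }
    }
}
```

The scalar loop doubles as the remainder cleanup and as the full fallback, so hardware without AVX2 or SSE2 still produces the same result.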

Performance Features

  • AVX2 operations: Process 4 long values (256 bits) simultaneously when supported
  • SSE2 operations: Process 2 long values (128 bits) simultaneously as fallback
  • Parallel variants: Combine intrinsics with parallel processing for maximum throughput
  • Optimized loops: Vectorized processing with scalar cleanup for remainder elements

Testing & Validation

  • Comprehensive test coverage: All intrinsics methods tested for correctness against regular operations
  • Benchmark integration: Added benchmark methods to measure performance improvements
  • Experiment validation: Created test script that verifies intrinsics produce identical results

Technical Details

  • Hardware detection: Runtime checks for Avx2.IsSupported and Sse2.IsSupported
  • Memory safety: Proper fixed pointer usage with bounds checking
  • Border optimization: Maintains existing border tracking optimizations
  • Thread safety: Parallel variants use proper partitioning
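One way to read "proper partitioning" is that each worker receives a disjoint index range, so no two threads write the same word. The sketch below shows that idea with `Partitioner.Create` and `Parallel.ForEach`; the class `ParallelPartitionSketch` and its `ParallelXor` method are hypothetical, the inner loop is scalar for brevity, and the PR's real partitioning scheme may differ.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of the BitString API.
public static class ParallelPartitionSketch
{
    // XORs `source` into `target` in place, splitting the work into disjoint ranges.
    public static void ParallelXor(long[] target, long[] source)
    {
        if (target.Length != source.Length)
        {
            throw new ArgumentException("Arrays must have the same length.");
        }
        if (target.Length == 0)
        {
            return; // Partitioner.Create rejects an empty range.
        }
        // Each tuple is a half-open range [Item1, Item2); ranges never overlap,
        // so threads write to separate elements and need no locking.
        var ranges = Partitioner.Create(0, target.Length);
        Parallel.ForEach(ranges, range =>
        {
            // Assumption: a vectorized kernel could replace this scalar loop.
            for (var i = range.Item1; i < range.Item2; i++)
            {
                target[i] ^= source[i];
            }
        });
    }
}
```

Because the ranges are disjoint, correctness does not depend on how many threads run or in which order they finish.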

Test Results

All existing tests pass plus new intrinsics-specific tests. The experiment script confirms:

  • ✅ All operations produce identical results to regular implementations
  • ✅ Proper fallback behavior when AVX2/SSE2 are not available
  • ✅ Correct handling of different BitString sizes
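A check of this kind can be sketched as a comparison between a scalar reference and a vectorized version over every array length up to some bound, so that lengths with and without a remainder are both covered. Everything here is an assumption for illustration: `NotEquivalenceSketch` and its methods are hypothetical names, the sketch is not the PR's experiment script, it needs `AllowUnsafeBlocks`, and `Random.NextInt64` requires .NET 6 or later.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper for illustration only; not the PR's experiment script.
public static class NotEquivalenceSketch
{
    // Scalar reference implementation.
    public static void ScalarNot(long[] words)
    {
        for (var i = 0; i < words.Length; i++)
        {
            words[i] = ~words[i];
        }
    }

    // SSE2 version: NOT is expressed as XOR with an all-ones vector.
    public static unsafe void VectorNot(long[] words)
    {
        var i = 0;
        fixed (long* p = words)
        {
            if (Sse2.IsSupported)
            {
                var allOnes = Vector128.Create(-1L);
                for (; i <= words.Length - 2; i += 2)
                {
                    Sse2.Store(p + i, Sse2.Xor(Sse2.LoadVector128(p + i), allOnes));
                }
            }
            // Scalar cleanup for the remainder (or everything without SSE2).
            for (; i < words.Length; i++)
            {
                p[i] = ~p[i];
            }
        }
    }

    // Returns true if both versions agree for every length in [0, maxLength].
    public static bool AgreeForAllLengths(int maxLength, int seed)
    {
        var random = new Random(seed);
        for (var length = 0; length <= maxLength; length++)
        {
            var expected = new long[length];
            for (var i = 0; i < length; i++)
            {
                expected[i] = random.NextInt64();
            }
            var actual = (long[])expected.Clone();
            ScalarNot(expected);
            VectorNot(actual);
            if (!expected.AsSpan().SequenceEqual(actual))
            {
                return false;
            }
        }
        return true;
    }
}
```

Sweeping over consecutive lengths exercises the empty case, odd lengths that leave a scalar remainder, and even lengths handled entirely by the vector loop.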

Performance Impact

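The intrinsics implementations are expected to improve performance. The figures below are upper bounds implied by vector width, not measured results; actual gains depend on memory bandwidth and BitString size, and can be checked with the added benchmarks:

  • AVX2: Up to 4x faster for large BitString operations (four 64-bit values per instruction)
  • SSE2: Up to 2x faster when AVX2 is not available (two 64-bit values per instruction)
  • Parallel: Additional speedup on multi-core systems
  • No regression: Falls back to existing implementations when needed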

Usage

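var bitString = new BitString(100000);
// The second operand must be created as well; same length assumed here
var otherBitString = new BitString(100000);
// Use intrinsics-accelerated operations
bitString.IntrinsicsNot();
bitString.IntrinsicsAnd(otherBitString);
bitString.ParallelIntrinsicsOr(otherBitString);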

This implementation fulfills issue #59's request to explore System.Runtime.Intrinsics for BitString performance improvements.

🤖 Generated with Claude Code


Resolves #59

Adding CLAUDE.md with task information for AI processing.
This file will be removed when the task is complete.

Issue: #59
@konard konard self-assigned this Sep 14, 2025
- Add intrinsics-based implementations using AVX2 and SSE2 for:
  * IntrinsicsNot() and ParallelIntrinsicsNot()
  * IntrinsicsAnd() and ParallelIntrinsicsAnd()
  * IntrinsicsOr() and ParallelIntrinsicsOr()
  * IntrinsicsXor() and ParallelIntrinsicsXor()

- Intrinsics provide hardware-accelerated SIMD operations for better performance
- Automatically detects and uses AVX2 when available, falls back to SSE2
- Falls back to regular implementations when intrinsics not supported
- Added comprehensive test coverage for all intrinsics methods
- Added benchmark methods to compare performance with existing implementations
- Added experiment script to verify intrinsics functionality

All tests pass and intrinsics produce identical results to regular operations.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@konard konard changed the title from "[WIP] Try to use System.Runtime.Intrinsics for BitString" to "Implement System.Runtime.Intrinsics for BitString operations" Sep 14, 2025
@konard konard marked this pull request as ready for review September 14, 2025 04:23