Co-authored-by: Cursor <cursoragent@cursor.com>
Pull request overview
This PR introduces a HIP VMem-based allocator path for Iris’ symmetric heap, adds an as_symmetric() workflow for importing external PyTorch tensors via DMA-BUF into controlled virtual address space, and adds an extensive test suite covering ROCr/HIP behaviors and VMem/DMA-BUF workflows.
Changes:
- Add `VMemAllocator` and extend `SymmetricHeap`/`Iris` to support `allocator_type` selection and external-tensor import (`as_symmetric`).
- Add HIP VMem primitives + external-memory lifetime APIs (e.g., `destroy_external_memory`) and DMA-BUF offset-handling tests.
- Add many unit tests for segmented export/import, cumulative `mem_set_access` workaround, peer exchange, and end-to-end imported-tensor usage; disable a flaky float32 matmul config.
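The offset handling mentioned in the changes can be illustrated with plain pointer arithmetic. This is only a sketch, assuming that a DMA-BUF export covers the whole base allocation and that the imported mapping must reproduce the tensor's byte offset inside it. `AddressRange`, `offset_in_allocation`, and `imported_pointer` are hypothetical names for illustration, not the PR's API, and the addresses are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressRange:
    """Hypothetical description of the base allocation backing a tensor."""
    base: int  # start address of the underlying allocation
    size: int  # size of the underlying allocation in bytes


def offset_in_allocation(tensor_ptr: int, rng: AddressRange) -> int:
    """Byte offset of the tensor's data pointer inside its base allocation."""
    offset = tensor_ptr - rng.base
    if not 0 <= offset < rng.size:
        raise ValueError("tensor pointer lies outside its base allocation")
    return offset


def imported_pointer(mapped_base: int, tensor_ptr: int, rng: AddressRange) -> int:
    """Address of the tensor's data once the whole base allocation is mapped at mapped_base."""
    return mapped_base + offset_in_allocation(tensor_ptr, rng)


# Illustrative addresses only: a tensor that starts 0x100 bytes into its allocation.
rng = AddressRange(base=0x7000_0000, size=4096)
new_ptr = imported_pointer(mapped_base=0x9000_0000, tensor_ptr=0x7000_0100, rng=rng)
```

If the offset were dropped, a sliced tensor would alias the start of the allocation instead of its own data, which is presumably what the offset-handling tests guard against.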
Reviewed changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 21 comments.
| File | Description |
|---|---|
| tests/unittests/test_vmem_segmented_export.py | New test for segmented DMA-BUF export/import across multiple VMem allocations. |
| tests/unittests/test_vmem_peer_dmabuf_exchange.py | New tests for peer DMA-BUF exchange (single-rank simulation + multi-rank). |
| tests/unittests/test_vmem_offset_check.py | New multi-rank test checking imported-tensor offset symmetry. |
| tests/unittests/test_vmem_multi_alloc_export.py | New test documenting failure/limitation when exporting multi-allocation VA ranges as one DMA-BUF. |
| tests/unittests/test_vmem_minimal_export_with_ondemand.py | New tests for “minimal export + on-demand mapping” allocator strategy. |
| tests/unittests/test_vmem_imported_tensor_rma.py | New end-to-end tests using imported tensors with Triton kernels and Iris load/RMA. |
| tests/unittests/test_vmem_cumulative_access.py | New tests validating ROCm cumulative mem_set_access workaround. |
| tests/unittests/test_vmem_allocator.py | New unit tests for creating/using VMem allocator and importing tensors. |
| tests/unittests/test_rocr_behaviors.py | New tests codifying ROCr/DMA-BUF behaviors Iris relies on (base-allocation export, cumulative access). |
| tests/unittests/test_pytorch_import_mechanism.py | New tests for offset-preserving DMA-BUF import mechanism used by as_symmetric(). |
| tests/unittests/test_pytorch_dmabuf_export.py | New tests for exporting PyTorch allocations to DMA-BUF (including slices/dtypes). |
| tests/unittests/test_hip_vmem_primitives.py | New tests for low-level HIP VMem reserve/map/set_access/unmap primitives. |
| tests/unittests/test_hip_apis.py | New tests for HIP API helpers like get_address_range() across dtypes/shapes. |
| tests/unittests/test_dmabuf_vmem_import.py | New tests for importing DMA-BUF into reserved VMem VA ranges and mixing with native allocations. |
| tests/unittests/test_dmabuf_controlled_va_import.py | New tests for controlled-VA DMA-BUF import including offset preservation and cleanup. |
| tests/unittests/test_dmabuf_apis.py | Update DMA-BUF import tests to handle external-memory lifetime (destroy_external_memory) and new return type. |
| tests/ops/test_matmul_all_gather.py | Disable float32 test case due to AMD Triton backend issue. |
| iris/tensor_utils.py | New helper for wrapping raw GPU pointers via __cuda_array_interface__. |
| iris/symmetric_heap.py | Add allocator selection + new segmented peer-refresh workflow + as_symmetric() plumbing. |
| iris/iris.py | Add allocator_type to public API + as_symmetric() method + best-effort destructor cleanup. |
| iris/hip.py | Extend HIP bindings: external-memory destroy API, get_address_range, and full HIP VMem API surface. |
| iris/allocators/vmem_allocator.py | New VMem allocator implementation using HIP VMem + DMA-BUF import of external tensors. |
| iris/allocators/torch_allocator.py | Minor updates; retains FD exchange code paths for torch-backed allocator. |
| iris/allocators/base.py | Simplify allocator base interface; move multi-rank coordination responsibility into SymmetricHeap. |
| `iris/allocators/__init__.py` | Export VMemAllocator. |
| `iris/__init__.py` | Re-export tensor utils and tidy imports/exports. |
| .gitignore | Ignore resources/log outputs. |
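
For readers unfamiliar with the `__cuda_array_interface__` protocol named in the `iris/tensor_utils.py` row, the following sketch shows the shape of the dictionary a raw-pointer wrapper would expose. It is an assumption about how such a helper could look, not the PR's implementation. `RawGpuBuffer` is a hypothetical name and the pointer value is a placeholder.

```python
class RawGpuBuffer:
    """Hypothetical wrapper exposing a raw device pointer via __cuda_array_interface__ (version 3)."""

    def __init__(self, ptr, shape, typestr="<f4", read_only=False):
        self.__cuda_array_interface__ = {
            "shape": tuple(shape),          # extent of each dimension, in elements
            "typestr": typestr,             # NumPy-style type string; "<f4" is little-endian float32
            "data": (int(ptr), read_only),  # (device pointer as int, read-only flag)
            "version": 3,
            "strides": None,                # None means C-contiguous layout
        }


# Placeholder pointer: the wrapper only describes memory, it does not own or validate it.
buf = RawGpuBuffer(ptr=0x7F00_0000_0000, shape=(4, 8))
iface = buf.__cuda_array_interface__
```

On a machine with a GPU, a consumer such as `torch.as_tensor(buf, device="cuda")` can build a tensor view from such an object. The wrapper does not keep the memory alive, so the caller has to manage the allocation's lifetime.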
Comments suppressed due to low confidence (1)
iris/iris.py:69
- The docstring says allocator_type default is 'torch', but the actual default argument is 'vmem'. Please align the documentation with the actual default (or change the default to match the documented behavior) to avoid surprising API changes for callers.
```python
from iris.symmetric_heap import SymmetricHeap
import numpy as np
import math
import torch
import logging

# Import logging functionality from the separate logging module
from .logging import logger

# Import tracing functionality
from .tracing import Tracing, TraceEvent, DeviceTracing  # noqa: F401 re-export for iris.TraceEvent


class Iris:
```
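
A docstring/default mismatch like the one flagged above can be caught mechanically. The sketch below is one possible check, not something the PR contains. `IrisSketch` is a hypothetical stand-in for the real class, its `heap_size` parameter and the `Default: "..."` docstring convention are assumptions, and only `allocator_type` with its `'vmem'` default comes from the review comment.

```python
import inspect
import re


class IrisSketch:
    """Hypothetical stand-in for iris.Iris; only the constructor signature matters here."""

    def __init__(self, heap_size=1 << 30, allocator_type="vmem"):
        """
        Args:
            heap_size: Size of the symmetric heap in bytes (assumed parameter).
            allocator_type: Allocator backend to use. Default: "vmem".
        """
        self.heap_size = heap_size
        self.allocator_type = allocator_type


def documented_default(func, param):
    """Return the quoted value after 'Default:' on the docstring line for `param`, or None."""
    doc = inspect.getdoc(func) or ""
    match = re.search(rf"{re.escape(param)}:.*?Default:\s*['\"](\w+)['\"]", doc)
    return match.group(1) if match else None


def actual_default(func, param):
    """Return the default value declared in the function signature."""
    return inspect.signature(func).parameters[param].default


doc_value = documented_default(IrisSketch.__init__, "allocator_type")
sig_value = actual_default(IrisSketch.__init__, "allocator_type")
```

A unit test asserting `doc_value == sig_value` would fail whenever the documented default drifts from the signature default.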