Add read-only mmap support (mmap_mode="r") for DictStore, TreeStore, and EmbedStore #585

Open
Karol-G wants to merge 4 commits into Blosc:main from Karol-G:feat/store_container_mmap_support

Karol-G commented Feb 19, 2026

Summary

This PR adds initial read-only memory-mapping support to store containers via mmap_mode="r".

Supported containers:

  • DictStore (.b2d, .b2z)
  • TreeStore (via DictStore inheritance)
  • EmbedStore (.b2e)

Changes

  • Added keyword-only mmap_mode to DictStore and EmbedStore.
  • Propagated mmap_mode through read/open paths, including zip-offset opens.
  • Enabled blosc2.open(..., mmap_mode="r") for store containers through special-store forwarding.
  • Updated reference docs for DictStore, TreeStore, and EmbedStore with mmap usage and constraints.
  • Added a release-notes entry for the feature.

Validation Rules

  • Only None or "r" are allowed.
  • mmap_mode="r" requires mode="r".
  • Invalid combinations raise ValueError.
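
The validation rules above can be sketched as a small standalone check. This is only an illustration under stated assumptions: `validate_mmap_mode` is a hypothetical helper name, and the error messages are invented here; the PR's actual implementation may be structured differently.

```python
def validate_mmap_mode(mode, mmap_mode=None):
    """Hypothetical helper mirroring the rules listed above (not the PR's code)."""
    # Rule 1: only None or "r" are accepted values for mmap_mode.
    if mmap_mode not in (None, "r"):
        raise ValueError(f"mmap_mode must be None or 'r', got {mmap_mode!r}")
    # Rule 2: memory mapping is read-only, so the store must be opened with mode="r".
    if mmap_mode == "r" and mode != "r":
        raise ValueError(f"mmap_mode='r' requires mode='r', got mode={mode!r}")
    return mmap_mode
```

With this sketch, `validate_mmap_mode("r", "r")` passes, while `validate_mmap_mode("a", "r")` and `validate_mmap_mode("r", "r+")` both raise ValueError.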

Tests

Added/updated tests for:

  • mmap-backed reads in DictStore / TreeStore / EmbedStore
  • blosc2.open(..., mmap_mode="r") on store containers
  • validation errors for unsupported modes and invalid mode combinations

Notes

  • Backward compatible when mmap_mode is not used.
  • Follow-up PRs can add r+ / c support and optional handle reuse/caching for repeated __getitem__ access.

Enables memory mapping for DictStore and EmbedStore containers to improve read access performance.

This enhancement allows opening store container files (b2z, b2d, b2e) in read-only mode using memory mapping, potentially reducing memory usage and improving read speeds. It introduces an optional `mmap_mode` parameter with "r" as the only supported value.

It also adds validation so that mmap_mode accepts only "r" or None, and "r" only when mode is "r".
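
For readers less familiar with memory mapping, the general idea behind a read-only mapped read can be sketched with Python's standard `mmap` module. This is not how the store containers are implemented in this PR; the helper names `read_slice_regular` and `read_slice_mmap` and the temporary file are assumptions made purely for illustration.

```python
import mmap
import os
import tempfile


def read_slice_regular(path, offset, size):
    # Conventional path: seek, then copy bytes from the file into a new buffer.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def read_slice_mmap(path, offset, size):
    # Mapped path: map the whole file read-only and slice the mapping;
    # the OS pages data in on demand.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + size]


# Write 1 MiB of random bytes to a temporary file (illustrative data only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1 << 20))
    path = tmp.name

try:
    a = read_slice_regular(path, 4096, 1024)
    b = read_slice_mmap(path, 4096, 1024)
    same = a == b  # both paths return identical bytes
finally:
    os.remove(path)
```

Both functions return the same bytes; any performance difference depends on the workload, the file system, and the page cache state.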
Update the DictStore, TreeStore, and EmbedStore docs, note the current limits (only "r", requires mode="r"), and add a release-notes entry.
Accept formatter-only tuple-yield rewrite in DictStore.items(); no functional change.

@lshaw8317
Collaborator

Looks mostly good to me. The only extra thing I would ask for is a benchmark showing how much mmap improves read times, but only if you have time. Thanks for your contribution!

@Karol-G
Author

Karol-G commented Feb 19, 2026

I will check whether I can find time for it tomorrow.

Introduces a benchmark script to compare read performance between regular and memory-mapped read paths for different store containers (EmbedStore, DictStore, TreeStore).

This allows for evaluating the impact of mmap on read throughput and latency under various scenarios, including warm and cold cache conditions.

The benchmark supports different data layouts (embedded, external, mixed) and generates detailed metrics such as open time, read time, throughput, and speedup ratios.
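
The metrics named above (read time, throughput, speedup ratio) could be derived from raw timings roughly as follows. This is a sketch, not the logic of bench/mmap_store_read.py: `summarize` is a hypothetical helper, and the timings passed to it are made-up illustrative numbers, not results from this PR.

```python
from statistics import median


def summarize(regular_s, mmap_s, nbytes):
    """Hypothetical helper: median read time, throughput, and speedup for two read paths."""
    reg = median(regular_s)  # median read time without mmap, in seconds
    mm = median(mmap_s)      # median read time with mmap, in seconds
    return {
        "regular_read_s": reg,
        "mmap_read_s": mm,
        "regular_MBps": nbytes / reg / 1e6,  # throughput in MB/s
        "mmap_MBps": nbytes / mm / 1e6,
        "speedup": reg / mm,                 # > 1 means the mmap path was faster
    }


# Illustrative timings only (seconds per run) for reading 100 MB.
stats = summarize([0.40, 0.42, 0.41], [0.20, 0.21, 0.19], nbytes=100_000_000)
```

Using the median rather than the mean keeps a single slow run (for example, one affected by background I/O) from dominating the reported speedup.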

@Karol-G
Author

Karol-G commented Feb 20, 2026

I added a dedicated benchmark for mmap read mode and ran it across EmbedStore, DictStore, and TreeStore for all supported storage/layout combinations.

Commands used:

python bench/mmap_store_read.py --scenario warm_full_scan warm_random_slices

sudo "$(python3 -c 'import sys; print(sys.executable)')" \
  bench/mmap_store_read.py \
  --scenario cold_full_scan_drop_caches cold_random_slices_drop_caches \
  --runs 5

Summary of results:

  • mmap_mode="r" consistently improves read performance for embedded payloads.
    • Warm runs: large gains, typically about 2x and up to 3-4x (especially for full scans).
    • Cold runs: gains remain strong, typically 1.7-1.9x for random slices and 2.5-3.4x for full scans.
  • For mixed layouts, improvements are moderate but consistent (roughly 1.1-1.3x).
  • For external layouts, improvements are small (roughly 1.05-1.2x), and one case is near-neutral or a slight regression (TreeStore + b2d + external + cold full scan).
  • Overall conclusion: mmap read mode provides clear and robust wins for embedded/container-local read paths, with smaller gains for external-node-heavy workloads.

I’m attaching full warm/cold benchmark outputs in text for reproducibility and detailed review.

Results:

cold_bench_results.txt

warm_bench_results.txt

@FrancescAlted
Member

Pretty cool speedups. Are you using an NFS filesystem for that? What are the specs of your box(es)?

@Karol-G
Author

Karol-G commented Feb 20, 2026

These results are currently from my workstation only. Our local NFS-backed cluster is down at the moment; I’ll run the same benchmark suite there as soon as it’s back online.

The speedups observed here are encouraging, but this is still a small-scale benchmark, so real-world gains may be smaller depending on workload and environment. I’ll share a more complete update after broader testing.
