What happened?
Hi there 👋
I was chatting with @jsignell today about chunk sizes, and we came across potentially inconsistent chunking behavior between xr.open_zarr and xr.open_dataset(..., engine='zarr'), which I had assumed would behave identically.
# xr.__version__ : '2025.9.1'
# zarr.__version__ : '3.1.3'
import xarray as xr
ds = xr.tutorial.open_dataset('air_temperature', chunks={})
ds_rechunked = ds.chunk({'time': 100, 'lat': 25, 'lon': 53})
ds_rechunked.to_zarr('air_temperature.zarr', consolidated=False, zarr_format=3)
ds1 = xr.open_zarr('air_temperature.zarr', consolidated=False, chunks="auto")
ds1.chunks
# Frozen({'time': (100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
#                  100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 20), 'lat': (25,), 'lon': (53,)})
ds2 = xr.open_dataset('air_temperature.zarr', consolidated=False, chunks="auto")
ds2.chunks
# Frozen({'time': (2920,), 'lat': (25,), 'lon': (53,)})
from xarray.testing import assert_chunks_equal
assert_chunks_equal(ds1, ds2)
# AssertionError:
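To make the difference concrete: the two results agree on lat and lon and differ only along time, where xr.open_zarr reports 29 chunks of 100 plus a final chunk of 20 (matching the time chunk size of 100 written to the store), while xr.open_dataset reports a single chunk of 2920. Below is a minimal plain-Python sketch of that arithmetic. The helper name `chunk_layout` is hypothetical and is not part of xarray, zarr, or dask; it only illustrates how a dimension length and a chunk size map to the tuples shown above.

```python
from typing import Tuple


def chunk_layout(dim_length: int, chunk_size: int) -> Tuple[int, ...]:
    """Split a dimension into equal chunks, with any remainder in a final chunk.

    Hypothetical helper for illustration only; not an xarray or dask API.
    """
    if dim_length < 0 or chunk_size <= 0:
        raise ValueError("dim_length must be >= 0 and chunk_size must be > 0")
    full_chunks, remainder = divmod(dim_length, chunk_size)
    layout = (chunk_size,) * full_chunks
    if remainder:
        layout += (remainder,)
    return layout


# Length of the time dimension, taken from the ds2.chunks output above.
TIME_LENGTH = 2920

# Layout along time as reported by xr.open_zarr (store written with time chunks of 100).
time_like_open_zarr = chunk_layout(TIME_LENGTH, 100)

# Layout along time as reported by xr.open_dataset (one chunk spanning the whole dimension).
time_like_open_dataset = chunk_layout(TIME_LENGTH, TIME_LENGTH)

# Same total length, different partitioning, hence the failed chunk comparison.
layouts_match = time_like_open_zarr == time_like_open_dataset
```

Both layouts cover the same 2920 time steps, so the data are identical; only the partitioning along time differs.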
What did you expect to happen?
I expected xr.open_zarr(...) and xr.open_dataset(..., engine='zarr') to produce the same chunking.