AsyncGroup.create_array() got an unexpected keyword argument 'compression' in netcdf3.py module #534

Open
wrongkindofdoctor opened this issue Jan 13, 2025 · 3 comments · May be fixed by #535

@wrongkindofdoctor

I'm trying to create fsspec reference files for netCDF 3 datasets on a local filesystem, but I get an error while the JSON reference file is being created:

AsyncGroup.create_array() got an unexpected keyword argument 'compression'

The error comes from the translate method of NetCDF3ToZarr in the netCDF3.py module, in the following block:

    arr = z.create_dataset(
        name=dim,
        shape=shape,
        dtype=var.data.dtype,
        fill_value=fill,
        chunks=shape,
        compression=None,
    )

I'm using Kerchunk v0.2.7.

compression is not in the create_dataset parameter list in the zarr documentation, but compressor is, so changing the argument name in this and the other create_dataset calls should presumably solve the issue.
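As a rough illustration of the rename suggested above, here is a minimal sketch in plain Python. `rename_keyword` is a hypothetical helper, not part of kerchunk or zarr, and the keyword values are placeholders. It only shows the renaming; whether the rename alone is enough on Zarr 3 is not confirmed here.

```python
def rename_keyword(kwargs, old="compression", new="compressor"):
    """Return a copy of kwargs with `old` renamed to `new` (hypothetical helper)."""
    fixed = dict(kwargs)
    if old in fixed:
        value = fixed.pop(old)
        # Keep an explicit value already passed under the new name.
        fixed.setdefault(new, value)
    return fixed


# Placeholder keyword arguments shaped like the failing call above.
call_kwargs = dict(
    name="time",
    shape=(10,),
    dtype="f8",
    fill_value=None,
    chunks=(10,),
    compression=None,
)
fixed_kwargs = rename_keyword(call_kwargs)
```

The original dictionary is left untouched, so the helper can be applied at each call site without side effects.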

Steps to reproduce:

import glob
import json
import os
from tempfile import TemporaryDirectory

import fsspec
import kerchunk.netCDF3

def write_fsspec(fs_read, input_file, output_dir):
    with fs_read.open(input_file) as infile:
        print(f"Running kerchunk generation for {input_file}...")
        chunks = kerchunk.netCDF3.NetCDF3ToZarr(infile, inline_threshold=300)
        file_name = os.path.basename(input_file)
        file_name = file_name.replace('.nc', '.json')
        out_file_name = output_dir + '/' + file_name
        with open(out_file_name, "wb") as f:
            f.write(json.dumps(chunks.translate()).encode())  # call to netCDF3.py originates here
        print(f"Finished writing {out_file_name}")
        return out_file_name

# config is not shown in this report; it must provide 'input_dir' and 'output_dir'
dir_path = config['input_dir']
dir_path += '**/*.nc'
file_paths = glob.glob(dir_path, recursive=True)
fs_read = fsspec.filesystem('local')
temp_dir = TemporaryDirectory(prefix=config['output_dir'])
output_files = [write_fsspec(fs_read, f, temp_dir.name) for f in file_paths]

@martindurant
Member

What version of zarr do you have? You may need to downgrade to <3.0. We know that some work is needed in this repo to make it compatible with the newly released major version of zarr.
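A quick way to check the installed version is sketched below, using only the standard library. `major_version` and `needs_downgrade` are hypothetical helper names, and the threshold of 3 is taken from the "<3.0" suggestion above.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '3.0.0'."""
    return int(version_string.split(".")[0])


def needs_downgrade(version_string, first_unsupported_major=3):
    """True if the major version is at or above the first unsupported one."""
    return major_version(version_string) >= first_unsupported_major


try:
    installed = version("zarr")  # version of the installed zarr distribution
except PackageNotFoundError:
    installed = None  # zarr is not installed in this environment

if installed is not None and needs_downgrade(installed):
    print(f"zarr {installed} found; consider installing zarr<3.0")
```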

@wrongkindofdoctor
Author

@martindurant Zarr is v3.0.0. I will try downgrading and report back.

@wrongkindofdoctor
Author

wrongkindofdoctor commented Jan 13, 2025

@martindurant Downgrading to Zarr v2.18.4 fixed the issue, though I am now encountering other errors. I'm not sure whether these are fsspec or file metadata problems (the dataset does not adhere to CMIP conventions), but I'll open a different issue if it turns out to be something else in fsspec.
