The Dataset.rename_dims creates a new dimension without coordinates #6867

Closed
ZhaJiMan opened this issue Aug 2, 2022 · 2 comments

ZhaJiMan commented Aug 2, 2022

What is your issue?

The documentation says Dataset.rename_dims returns a new object with renamed dimensions only. In my work I intended to rename the longitude and latitude dimensions: ds = ds.rename_dims(longitude='lon', latitude='lat'). But it turned out that a new dimension lon was created with values from 0 to its length minus 1. Is this the expected behaviour of rename_dims, or did I misuse this method?

A simple case below:

import numpy as np
import xarray as xr

da = xr.DataArray([1, 2, 3], coords=[('space', list('abc'))])
ds = xr.Dataset({'x': da})

The repr of ds in the REPL is:

<xarray.Dataset>
Dimensions:  (space: 3)
Coordinates:
  * space    (space) <U1 'a' 'b' 'c'
Data variables:
    x        (space) int32 1 2 3

But after ds = ds.rename_dims(space='label'), it becomes:

<xarray.Dataset>
Dimensions:  (label: 3)
Coordinates:
    space    (label) <U1 'a' 'b' 'c'
Dimensions without coordinates: label
Data variables:
    x        (label) int32 1 2 3

space became a non-dimension coordinate (it lost its * marker), and label was created as a new dimension without coordinates.

Environment:
numpy : 1.22.3
pandas : 1.4.2
xarray : 2022.3.0

ZhaJiMan added the "needs triage" label (issue that has not been reviewed by an xarray team member) on Aug 2, 2022
Member

benbovy commented Aug 2, 2022

Hi @ZhaJiMan, this topic has been brought up several times recently (see, e.g., #4825, #6607, #6704).

I can't tell much about your latitude / longitude case without a more detailed example, but looking at your simple case the result is the one I would expect, i.e., rename_dims renames only the dimension (not the coordinate).
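For comparison, here is a minimal sketch of the difference between rename_dims and rename, reusing the toy dataset from the report. The variable names dims_only and both are only illustrative. It assumes that the goal in the latitude / longitude case is to rename the coordinate together with its dimension, which the original report does not state explicitly.

```python
import xarray as xr

# Same toy dataset as in the report above.
da = xr.DataArray([1, 2, 3], coords=[("space", list("abc"))])
ds = xr.Dataset({"x": da})

# rename_dims touches only the dimension name; the coordinate variable
# keeps its old name "space" and now lies along the "label" dimension.
dims_only = ds.rename_dims(space="label")
print(dims_only["space"].dims)       # dimension of the "space" coordinate
print("label" in dims_only.coords)   # is there a coordinate called "label"?

# Reading a dimension that has no coordinate yields default positions
# 0..n-1, generated on access rather than stored in the dataset.
print(dims_only["label"].values)

# rename changes the dimension and the coordinate of that name together.
both = ds.rename(space="label")
print(list(both.coords), dict(both.sizes))
```

If renaming both is indeed what is wanted, ds.rename(longitude='lon', latitude='lat') would be the analogous call for the original case. The 0 to length-minus-1 values seen after rename_dims are the default positions of a dimension without a coordinate, not a copy of the original labels.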

Note that with version 2022.6.0, the space coordinate keeps its index so that you can still use it with .sel():

renamed = ds.rename_dims(space='label')

renamed
# <xarray.Dataset>
# Dimensions:  (label: 3)
# Coordinates:
#   * space    (label) <U1 'a' 'b' 'c'
# Dimensions without coordinates: label
# Data variables:
#     x        (label) int64 1 2 3

renamed.sel(space="a")
# <xarray.Dataset>
# Dimensions:  ()
# Coordinates:
#     space    <U1 'a'
# Data variables:
#     x        int64 1

Given that we can now use non-dimension coordinates for data selection, we should probably stop showing "Dimensions without coordinates: label" as is in the repr in such cases, since it becomes rather confusing. We could change that line to something like "Dimensions without index" or "Dimensions without indexed coordinates", or address it in the indexes repr section (#6795).

Author

ZhaJiMan commented Aug 2, 2022

Thanks for the detailed explanation and for pointing out the changes in recent versions. Although this behaviour still confuses me a lot, I'll take some time to read the issues you mentioned.

dcherian removed the "needs triage" label (issue that has not been reviewed by an xarray team member) on Sep 28, 2022