clean_names.sf() does not recognize SHAPE column as a sfc column when it contains multiple Geometry Types #578

Closed
ar-puuk opened this issue Sep 7, 2024 · 4 comments · Fixed by #579


ar-puuk commented Sep 7, 2024

Feature Requests/Bug Report

While geometry is the usual name of the sfc_GEOMETRY column in an sf object, sf data loaded from an ESRI geodatabase stores it in a column called SHAPE (or sometimes shape). When clean_names() is applied to such an object, it renames SHAPE to shape, which breaks the sf object and throws the following error when I try to view() it after cleaning the names. I assume that because the sfc column has been renamed, sf no longer recognizes it as the geometry column.

> object <- read_sf(
+    file.path("some/file/path.gdb"),
+    layer = "layer"
+    ) %>%
+    clean_names()

> view(object)
Error in st_geometry.sf(x) : 
  attr(obj, "sf_column") does not point to a geometry column.
Did you rename it, without setting st_geometry(obj) <- "newname"?

One workaround I have been using is to rename the sfc column to geometry from SHAPE before using clean_names().

> object <- read_sf(
+    file.path("some/file/path.gdb"),
+    layer = "layer"
+    ) %>%
+    rename(`geometry` = `SHAPE`) %>%
+    clean_names()

> view(object)

Would it be possible to internally identify SHAPE as the sfc_GEOMETRY column when geometry doesn't exist in the sf object, so that clean_names() doesn't rename it and break the object?
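The error above says that the "sf_column" attribute no longer names a geometry column. A minimal sketch of a check for that condition follows. The helper name geometry_column_ok() and the toy object are hypothetical, not part of sf or janitor, and the mismatch is simulated by editing the attribute directly rather than by calling clean_names().

```r
library(sf)

# Hypothetical helper: TRUE when the "sf_column" attribute names an
# existing column that really is an sfc (geometry) column.
geometry_column_ok <- function(x) {
  geom_col <- attr(x, "sf_column")
  !is.null(geom_col) &&
    geom_col %in% names(x) &&
    inherits(x[[geom_col]], "sfc")
}

# Toy sf object whose geometry column is called SHAPE, as in a GDB layer.
toy <- st_sf(
  id = 1:2,
  SHAPE = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)))
)

# Simulate the mismatch (assumption: this mimics the broken state by
# pointing the attribute at a name that is not a column).
broken <- toy
attr(broken, "sf_column") <- "shape"

geometry_column_ok(toy)
geometry_column_ok(broken)
```

Running a check like this on an object right after clean_names() would show whether the attribute and the column names have drifted apart.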

@billdenney
Collaborator

I think that is readily possible. Can you please provide a small file as a reproducible example?

@ar-puuk
Author

ar-puuk commented Sep 7, 2024

I was apparently not entirely correct. As far as I understand so far, the problem is not about the name of the sfc_GEOMETRY column (SHAPE or geometry) or the source format (SHP or GDB). I tried other GDB layers that have a SHAPE column and didn't get any issues.

But from what I just found out, it might be because the sf object I am working with contains multiple geometry types. However, it is still strange that the same error doesn't show up when the sfc column is named geometry rather than SHAPE.

This is the code:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(readr)
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(janitor)
#> 
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#> 
#>     chisq.test, fisher.test

projects <- readr::read_rds("D:/Projects.rds")

st_geometry_type(projects)
#>  [1] MULTILINESTRING MULTILINESTRING MULTIPOINT      MULTIPOINT     
#>  [5] MULTILINESTRING MULTILINESTRING MULTIPOINT      MULTIPOINT     
#>  [9] MULTILINESTRING MULTILINESTRING
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE

projects_shape <- projects %>% 
  janitor::clean_names()

View(projects_shape) # Does not work

project_shape_transformed <- projects_shape %>% 
  sf::st_transform("EPSG:3857") # Does not work
#> Error in st_geometry.sf(x): attr(obj, "sf_column") does not point to a geometry column.
#> Did you rename it, without setting st_geometry(obj) <- "newname"?

projects_geometry <- projects %>% 
  dplyr::rename(`geometry` = `SHAPE`) %>% 
  janitor::clean_names()

View(projects_geometry) # works

project_geometry_transformed <- projects_geometry %>% 
  sf::st_transform("EPSG:3857") # Works

Created on 2024-09-07 with reprex v2.1.1

Since the problem seems to originate from having multiple geometry types rather than from the GIS file format, I am attaching the sf object as a zipped RDS file (RDS files cannot be uploaded directly).
Projects.zip
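For readers without the attachment, here is a minimal sketch of how one might test whether a geometry column mixes several geometry types, as the st_geometry_type() output in the reprex suggests. The helper has_mixed_geometry_types() and the toy geometries are hypothetical. The check only describes the data and does not by itself establish the cause of the error.

```r
library(sf)

# Hypothetical helper: TRUE when more than one geometry type is present.
has_mixed_geometry_types <- function(x) {
  length(unique(as.character(sf::st_geometry_type(x)))) > 1
}

# Toy geometry column mixing MULTIPOINT and MULTILINESTRING, like the
# Projects object in the reprex.
mixed <- st_sfc(
  st_multipoint(rbind(c(0, 0), c(1, 1))),
  st_multilinestring(list(rbind(c(0, 0), c(1, 1))))
)

# Toy geometry column with a single type, for comparison.
uniform <- st_sfc(st_point(c(0, 0)), st_point(c(1, 1)))

has_mixed_geometry_types(mixed)
has_mixed_geometry_types(uniform)
```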

@ar-puuk ar-puuk changed the title from "clean_names.sf() does not recognize SHAPE or shape as a geometry field." to "clean_names.sf() does not recognize SHAPE column as a sfc column when it contains multiple Geometry Types" on Sep 7, 2024
@billdenney
Collaborator

billdenney commented Sep 12, 2024

The underlying issue was that we renamed the column that attr(obj, "sf_column") points to because it wasn't the last column. So, I generalized the code to look at that attribute. Please let me know if that fixes the issue for you, and if so, we can merge it (assuming that the currently-running tests complete without issue).
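The idea of looking at the attribute instead of the column position can be sketched roughly as follows. This is not the code from the linked pull request: clean_names_keep_geometry() is a hypothetical helper, the toy data are made up, and the sketch ignores the edge case where a cleaned name collides with the geometry column's name.

```r
library(sf)
library(janitor)

# Hypothetical helper: clean every column name except the one that the
# "sf_column" attribute identifies as the active geometry column.
clean_names_keep_geometry <- function(x, ...) {
  stopifnot(inherits(x, "sf"))
  geom_col <- attr(x, "sf_column")
  is_geom <- names(x) == geom_col
  new_names <- names(x)
  new_names[!is_geom] <- janitor::make_clean_names(new_names[!is_geom], ...)
  names(x) <- new_names
  x
}

# Toy data with the geometry column (SHAPE) placed before another column.
df <- data.frame("Project ID" = 1:2, check.names = FALSE)
df$SHAPE <- st_sfc(
  st_multipoint(rbind(c(0, 0), c(1, 1))),
  st_multilinestring(list(rbind(c(0, 0), c(1, 1))))
)
df[["Project Name"]] <- c("Bridge", "Trail")

# sfc_last = FALSE asks st_sf() to leave the column order unmodified.
toy <- st_sf(df, sfc_last = FALSE)

cleaned <- clean_names_keep_geometry(toy)
names(cleaned)
attr(cleaned, "sf_column")
```

Because the geometry column is found by name through the attribute, the helper does not depend on where that column sits in the object.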

@ar-puuk
Author

ar-puuk commented Sep 12, 2024

@billdenney That seems to have solved the issue. Thank you so much.

@ar-puuk ar-puuk closed this as completed Sep 12, 2024