Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An Enclave inside an Enclave won't make it through conversion #35

Open
eseglem opened this issue Oct 1, 2022 · 10 comments
Open

An Enclave inside an Enclave won't make it through conversion #35

eseglem opened this issue Oct 1, 2022 · 10 comments

Comments

@eseglem
Copy link

eseglem commented Oct 1, 2022

Its very much an edge case, but if you have a Multi Polygon which had a part inside a hole of itself that part does not end up in the output geojson. It happens at least one place in the world: Baarle-Nassau which shows up all the way up through the Netherlands country level.

Pull the XML from Overpass with rel(id:2718379);out geom; and the output geojson will not contain the 7 internal enclaves. Nederlandse enclave N1-7 (relations 13193251 - 13193257).

Took me quite a while to track through and understand exactly what was happening. Unfortunately, since all the outers and inner's are grouped together, when _convert_lines_to_multipolygon is called on outer the unary_union eats all of them. Since they are inside the outermost boundary. So its technically doing what its supposed to be doing.

Looking through the XML, they seem to be in order. 16 outer, 25 inner, 8 more outer. But I have no idea if that is guaranteed. Not the cleanest code, but something like this appears to work better this specific polygon. But only works in general if the order is guaranteed, and probably needs some more safety checks as noted.

def multipolygon_relation_to_shape(
        rel, refs_index,
        area_keys: Optional[dict] = None, polygon_features: Optional[list] = None
):
    # List of Tuple (role, multipolygon)
    shapes = []

    if 'members' in rel:
        members = rel['members']
    else:
        found_ref = get_ref(rel, refs_index)
        if not found_ref:
            error('Ref for multipolygon relation not found in index', pformat(rel))
            return None
        members = found_ref['members']

    for member in members:
        if member['type'] != 'way':
            warning('Multipolygon member not handled', pformat(member))
            continue

        member['used'] = rel['id']

        way_shape = way_to_shape(member, refs_index, area_keys, polygon_features)
        if way_shape is None:
            # throw exception
            warning('Failed to make way', pformat(member), 'in relation', pformat(rel))
            continue

        if isinstance(way_shape['shape'], Polygon):
            way_shape['shape'] = LineString(way_shape['shape'].exterior.coords)

        shapes.append((member['role'],way_shape['shape']))  

    # Intermediate groups
    groups = []
    # New group each time it switches role
    for role, group in itertools.groupby(shapes, lambda s: s[0]):
        groups.append((role, _convert_lines_to_multipolygon([_[1] for _ in group])))

    # Grab the first one. Should be "outer", may need safety here?
    multipolygon = groups[0][1]
    # Itterate over the rest if there are any
    for role, mp in groups[1:]:
        if role == "inner":
            multipolygon = multipolygon.difference(mp)
        else:  # I think it should only be "outer", but more safety?
            multipolygon = multipolygon.union(mp)

    if multipolygon is None:
        warning('Relation not converted to feature', pformat(rel))
        return None

    multipolygon = fix_invalid_polygon(multipolygon)
    multipolygon = to_multipolygon(multipolygon)
    multipolygon = orient_multipolygon(multipolygon)  # do we need this?

    return {
        'shape': multipolygon,
        'properties': get_element_props(rel)
    }
@eseglem eseglem changed the title Unary Union causes Enclave inside Enclave to be missing An Enclave inside an Enclave won't make it through conversion Oct 1, 2022
@rapkin rapkin closed this as completed in afa7ae4 Oct 2, 2022
@rapkin
Copy link
Collaborator

rapkin commented Oct 2, 2022

@eseglem Thank you for your work (and code example provided)! I made a few changes in your code and added test for this bug.

@eseglem
Copy link
Author

eseglem commented Oct 3, 2022

@rapkin Would you mind creating a new release with the fix?

Although I just noticed the test may have failed on github actions. Let me know if I can help with anything.

@rapkin
Copy link
Collaborator

rapkin commented Oct 3, 2022

@eseglem Here release with that change https://pypi.org/project/osm2geojson/0.2.1/. Thank you for noticing that problem with github actions. I think it's because of different shapely version (I have older on my computer). Geometry produced with both versions looks the same (but order of polygons in multipolygon is different)

@eseglem
Copy link
Author

eseglem commented Oct 3, 2022

Thanks for being so responsive.

Unfortunately, what I was worried about seems to be a problem. Order does not appear to be guaranteed. I just pulled one where the first member was inner. So it returns empty features. Worked out some potentially better logic. Running through some more comprehensive testing now.

@rapkin
Copy link
Collaborator

rapkin commented Oct 3, 2022

Ok, I'll take a look on that tomorrow. But I need this 'empty' example (you can send me whole XML/JSON content from Overpass)

@rapkin rapkin reopened this Oct 3, 2022
@eseglem
Copy link
Author

eseglem commented Oct 3, 2022

By empty I mean it ends up {"type: "FeatureCollection", "features": []} because a None gets returned somewhere in there.

Serbia: rel(id:1741311);out geom; is the first one I ran into with inner first. I added logic that worked for that one, but sadly the logic fails for Kyrgyzstan rel(id:178009);out geom; The first outer group is self intersecting on its own. I will keep digging but running out of good ideas for how to handle the weird cases.

@rapkin
Copy link
Collaborator

rapkin commented Oct 3, 2022

That's what I need, thank you!

@eseglem
Copy link
Author

eseglem commented Oct 4, 2022

I found this: https://github.com/tyrasd/osmtogeojson/blob/gh-pages/index.js#L746 It powers overpass-turbo, and it appears to work for all of the weird ones. But didn't get to work through the code and see how it works.

@rapkin
Copy link
Collaborator

rapkin commented Nov 6, 2022

@eseglem I hope this fix 024c06b should help with your problem (it still may not work with other examples, becuase this solution doesn't look reliable). I read that code in osmtogeojson package and made some progress on that in branch work-on-enclave-problem, but my result was not successful (so I made a pause with that).

PS. Sorry for very late response, it was hard month in Ukraine.

@pijll
Copy link

pijll commented May 25, 2023

I still have the same problem. I found this when converting the same OSM objects as in the original report, Baarle-Nassau (Netherlands).

I've been able to boil the problem down to a simple example. Given an OSM-json file with nested enclaves:

{"elements": [
  {"type":  "relation", "id":  1, "members":  [
    {"type": "way", "ref": 11, "role": "outer", "_comment": "boundary (outermost)"},
    {"type": "way", "ref": 12, "role": "outer", "_comment": "enclave within enclave"},
    {"type": "way", "ref": 13, "role": "inner", "_comment": "enclave"}
  ],"tags": {"boundary": "administrative"}},
  {"type": "node", "id":  111, "lat": 0, "lon": 0},
  {"type": "node", "id":  112, "lat": 5, "lon": 0},
  {"type": "node", "id":  113, "lat": 5, "lon": 5},
  {"type": "node", "id":  114, "lat": 0, "lon": 5},
  {"type": "node", "id":  121, "lat": 2, "lon": 3},
  {"type": "node", "id":  122, "lat": 3, "lon": 3},
  {"type": "node", "id":  123, "lat": 3, "lon": 2},
  {"type": "node", "id":  124, "lat": 2, "lon": 2},
  {"type": "node", "id":  131, "lat": 1, "lon": 4},
  {"type": "node", "id":  132, "lat": 4, "lon": 4},
  {"type": "node", "id":  133, "lat": 4, "lon": 1},
  {"type": "node", "id":  134, "lat": 1, "lon": 1},
  {"type": "way", "id": 11, "nodes": [111, 112, 113, 114, 111]},
  {"type": "way", "id": 12, "nodes": [121, 122, 123, 124, 121]},
  {"type": "way", "id": 13, "nodes": [131, 132, 133, 134, 131]}
]}

Note that the last two members of the relation are in the wrong order, logically.

Converting this with osm2geojson.json2shapes results in just a square with a single enclave. When the second and third members of the relation are swapped, the result is correct (square, containing an enclave, which contains a counter-enclave). The result should of course be independent of the order of the members.

Tests:

    # This test fails
    def test_nested_polygons_wrong_order(self):
        json_file = pathlib.Path('nested_polygons.json')
        json_object = json.loads(json_file.read_text())

        shapes = osm2geojson.json2shapes(json_object)
        multipoly = shapes[0]['shape']

        self.assertEqual(len(multipoly.geoms), 2)
        self.assertTrue(multipoly.contains(shapely.Point(2.5, 2.5)))

    # This test succeeds
    def test_nested_polygons_correct_order(self):
        json_file = pathlib.Path('nested_polygons.json')
        json_object = json.loads(json_file.read_text())
        members = json_object['elements'][0]['members']
        members[1], members[2] = members[2], members[1]

        shapes = osm2geojson.json2shapes(json_object)
        multipoly = shapes[0]['shape']

        self.assertEqual(len(multipoly.geoms), 2)
        self.assertTrue(multipoly.contains(shapely.Point(2.5, 2.5)))

A more complicated test would be the town of Baarle-Nassau: when converting OSM relation 47798, the little enclave given by OSM way 35145945 is not included in the result; when converting OSM relation 2718379, that same enclave is included. The two relations cover mostly the same area, but the order of the ways differs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants