
Strip backslash from OSM names (and fix provisioning and change analysis parameters) #947

Merged: 7 commits into develop, May 15, 2023

Conversation

@KlaasH (Contributor) commented May 12, 2023

Overview

The most important part of this PR is the smallest—a pair of lines in import_osm.sh that clears any backslash characters from the converted.osm file before we feed it to osm2pgrouting to import into the database. It turns out osm2pgrouting really doesn't like backslash characters. The error looks like this:
[screenshot: osm2pgrouting error output]

That certainly looks like an error, but I had assumed it would only affect the actual segments with bad names. It turns out, though, that the "Vertices inserted: 0 Split ways inserted 0" line after the errors indicates it lost a lot more than the affected segments. Apparently osm2pgrouting processes the input in chunks, and whatever other segments happen to be in the same chunk as the bad segments just get dropped on the floor.

I don't believe there's any functional purpose for which we would want backslashes in our OSM input. They're not used for encoding (the file is in UTF-8), and we don't actually use segment names for anything in the analysis anyway. So I just added a simple sed 's/\\/backslash/' to make them go away.
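The cleanup step can be sketched as a shell snippet like the following. The file name comes from the PR; writing through a temp file (rather than GNU sed's non-portable -i flag) and the 'g' global flag are my assumptions, not necessarily what the script does:

```shell
# Stand-in for the real converted.osm produced by the import pipeline.
printf 'name="Foo\\Bar"\n' > converted.osm

# Replace every backslash with the word "backslash" before the file is
# fed to osm2pgrouting. Without the 'g' flag, sed would only replace the
# first backslash on each line.
sed 's/\\/backslash/g' converted.osm > converted.osm.tmp
mv converted.osm.tmp converted.osm

cat converted.osm   # -> name="FoobackslashBar"
```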

Other items:

  • The next most important thing this does is increase the memory available to the analysis task from 32GB to 61GB. It only matters for Los Angeles, I think, but it does seem to matter for Los Angeles. And we're using i3.2xlarge instances anyway, so we might as well allocate all the memory.
  • I also adjusted some of the parameters used to set PostgreSQL memory and disk usage limits. We have lots of memory and lots of disk, so it makes sense to be generous, to get all the speed we can and avoid any failures in the largest jobs.
  • Speaking of those parameters, I made the staging ones match production. They had been set lower, presumably to make repeated testing cheaper, but we only very occasionally test on staging these days, so it makes sense to make it as similar to production as possible.
  • There were a few errors in the build when I tried to spin my development instance back up. I tried to find the path of least resistance for getting around them, which meant upgrading the base image in the case of the tilegarden container but just adjusting the Debian package list to be able to keep using the old image for the angularjs container.
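For context, the PostgreSQL limits mentioned above are applied with ALTER SYSTEM statements inside the analysis container (the setting names appear in this PR's diff; the literal values below are illustrative assumptions, since the script actually fills them in from PFB_* environment variables):

```sql
-- Illustrative values only; the real script substitutes PFB_* variables.
ALTER SYSTEM SET maintenance_work_mem TO '2GB';
ALTER SYSTEM SET max_wal_size TO '8GB';
ALTER SYSTEM SET shared_buffers TO '8GB';
ALTER SYSTEM SET temp_file_limit TO '100GB';
```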

Demo

I found the location of one of the segments named "\" and made a tiny boundary centered on it. Here are the before-and-after shots showing the effect of losing chunks from the OSM import:
Before:
[screenshot: results before the fix]

After:
[screenshot: results after the fix]

Notes

The GitHub Actions deployment to staging isn't working (see #937), but I pushed this branch to staging manually.

Testing Instructions

I don't think it's necessary to spin up a Batch worker on staging and do more runs there, but it probably makes sense to test this locally to the point of running a small analysis.

  • Spin up a dev instance per the README instructions
  • Make a neighborhood for this boundary (via the "Neighborhoods" management page): backslash_location.zip (use FIPS code 3651000)
  • Run an analysis for it (click "Run Analysis" on the Jobs page then copy-paste the command that gets printed into the Docker log into a shell within the VM).
  • Confirm that it runs successfully and you can see the results by clicking on it in the All places list once it has finished.

Checklist

  • Add entry to CHANGELOG.md

Resolves #946

KlaasH added 7 commits May 5, 2023 15:56
Fixes a few errors that have shown up in the container provisioning due to
everything being pretty old. Specifically:
- Debian 'stretch' is out of support.
  - I changed the tilegarden container to use buster, because it's already on a
    version of Node for which there's a 'buster' base image
  - The AngularJS container wouldn't build with the newer Node versions I tried,
    so I left it on 'stretch' and updated the apt sources list to "archive", where
    the packages live now.
- The Bower registry is using an SSL cert based on a root certificate that 'stretch'
  doesn't have. I briefly tried updating the certs, but settled on just setting Bower
  to "strict-ssl: false"
This is something that we started doing on more recent projects and it's
nice to have, since uploading the modified tfvars file is something we need to
do frequently and composing the command by hand is kind of annoying.
We've been running this job with 8 vCPUs but only 31 GiB of RAM, which doesn't
make a ton of sense because that's all the CPU but only half the RAM of an
i3.2xlarge instance. To be fair, we didn't think we would actually get close to
filling up even the 31 GiB, but apparently we are on the largest jobs, so it's
time to allocate the rest.

I also:
- Increased the parameters we apply to the PostgreSQL instance within the
  analysis container, to make more use of the copious resources we're providing
  to these tasks. Some of the values are probably overkill, but there's no reason
  not to use the resources, and some could make the jobs faster or more robust.
- Matched all the parameters in the staging job definition to the ones in
  production. We seldom run staging jobs, so making them smaller doesn't appreciably
  help with keeping costs down, and it seems valuable to make the test environment
  as similar to production as we practically can.
Sets the size of /dev/shm (a shared-memory RAM drive built into Linux) to 4GB
via the Batch task definition. Apparently Postgres uses it for some tasks (like
VACUUM ANALYZE) and has been running into space constraints (that's what the
"ERROR: could not resize shared memory segment" errors are about).
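In a Batch job definition this is expressed through the container's linuxParameters. A minimal sketch of the relevant fragment (field names per the AWS Batch API, with sharedMemorySize given in MiB; the surrounding structure is abbreviated and the rest of the job definition is omitted):

```json
{
  "containerProperties": {
    "linuxParameters": {
      "sharedMemorySize": 4096
    }
  }
}
```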
It turns out osm2pgrouting *really* doesn't like it if you have a backslash in
a segment name.  It produces an error that knocks out not just the segment in
question, but everything that the tool would have otherwise loaded in the chunk
that it's processing.
Here's a Stack Exchange answer about it: https://gis.stackexchange.com/a/12873,
but the recommendation there is to modify the osm2pgrouting code. Instead, I just
added a command to the import_osm.sh script that uses 'sed' to strip out any
backslash characters.
@@ -39,7 +39,7 @@ ALTER SYSTEM SET maintenance_work_mem TO '${PFB_MAINTENANCE_WORK_MEM}';
ALTER SYSTEM SET max_wal_size TO '${PFB_MAX_WAL_SIZE}';
ALTER SYSTEM SET shared_buffers TO '${PFB_SHARED_BUFFERS}';
ALTER SYSTEM SET synchronous_commit TO ${PFB_SYNCHRONOUS_COMMIT};
-ALTER SYSTEM SET temp_file_limit TO ${PFB_TEMP_FILE_LIMIT};
+ALTER SYSTEM SET temp_file_limit TO '${PFB_TEMP_FILE_LIMIT}';
@KlaasH (Contributor, Author):

This had been a (large) numerical value, but changing it to "100GB" made it start breaking without quotes.
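The distinction is that PostgreSQL accepts a bare integer (interpreted in the setting's base unit, kilobytes for temp_file_limit) without quotes, but a value with a unit suffix is a string and must be quoted:

```sql
-- A bare number of kilobytes needs no quotes (104857600 kB = 100 GB):
ALTER SYSTEM SET temp_file_limit TO 104857600;
-- ...but a unit-suffixed value must be a quoted string:
ALTER SYSTEM SET temp_file_limit TO '100GB';
```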

@@ -1 +1 @@
-boto3==1.4.4
+boto3==1.26.132
@KlaasH (Contributor, Author):

This is the requirements file for the little script that updates the batch task definition. Long-term we want to get rid of it and handle the task definition in Terraform, but for now we're still using it. This upgrade was necessary because the very-old version of boto3 didn't know about the linuxParameters option.

@KlaasH requested a review from @ddohler, May 12, 2023 20:54
@ddohler (Contributor) left a comment:

Everything seems to be working great! I wasn't able to actually view the details for the job due to an apparent CORS error, but the job did run successfully (once I got my VM resuscitated).

}, {
"name": "PFB_WORK_MEM",
-  "value": "3072MB"
+  "value": "3GB"
@ddohler (Contributor):

This seems like it's the same as it was, just in GB rather than MB -- was that intentional?

@KlaasH (Contributor, Author):

Yeah, I was going to raise that one, too, but it's the one that gets allocated for every thread in parallel operations, so it can actually balloon pretty fast. So I left the value alone but still changed the format to match the others.

@KlaasH merged commit 54b27a9 into develop, May 15, 2023
@KlaasH deleted the feature/kjh/strip-backslash-from-osm-names branch, May 15, 2023 20:29
Development

Successfully merging this pull request may close these issues.

Brooklyn analysis results are missing some road segments
2 participants