Caching Feature #126

Open
mattmeye opened this issue Sep 29, 2018 · 5 comments
@mattmeye

Hello, I've been thinking about using AWS Lambda for tile generation for a few days now. In my case I'd prefer to cache the generated tiles with CloudFront and store the generated tiles in an S3 bucket. With AWS Lambda@Edge I would generate the missing tiles (or start another process to do that). Eventually I also have to think about map and tile updates. Do you plan a feature like this in the future, or do you know of an existing project?
Kind regards, matt

@mattdelsordo
Contributor

I'm not sure if this is a feature that's being planned, but you can definitely use Tilegarden to do this. It would look something like:

  1. Deploy a Tilegarden instance.
  2. Write a second lambda function that gets triggered on an S3 event (I'm not sure about the specifics here, but I'm pretty sure AWS has a system in place for this — S3 event notifications can invoke Lambda functions).
  3. With the second function: if the tile is missing at the desired spot in the bucket (or if it's out of date), fetch it from your Tilegarden instance and save it to the bucket.

I've seen some articles online that discuss Lambda interaction with S3 more in-depth, but to my knowledge there isn't a current project that handles this. I hope that helps!
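A minimal sketch of step 3, with the S3 client and tile fetcher injected as plain functions so the control flow stays visible and testable. The helper names, key layout, and URL shape here are my assumptions for illustration, not Tilegarden's actual API:

```javascript
// Hypothetical backfill logic for the second Lambda: given tile
// coordinates, make sure the tile exists in the S3 cache bucket.
// `s3` and `fetchTile` are injected stand-ins for an AWS SDK client
// and an HTTP client pointed at the Tilegarden instance.
async function ensureTile({ s3, fetchTile, bucket, tilegardenUrl }, z, x, y) {
    const key = `${z}/${x}/${y}.png`;
    if (await s3.exists(bucket, key)) {
        return { key, cached: true };   // already in the bucket, nothing to do
    }
    const tile = await fetchTile(`${tilegardenUrl}/tile/${key}`);
    await s3.put(bucket, key, tile);    // save it for next time
    return { key, cached: false };
}
```

Injecting the clients also makes the logic easy to exercise locally with stubs before wiring it to real AWS resources.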

@KlaasH
Collaborator

KlaasH commented Oct 1, 2018

Yeah, we are planning to add this feature, though I'm not sure what the timeline will be, exactly.

The example I know of is this: WikiWatershed/model-my-watershed#1215, which stores tiles in S3 and serves them with a CloudFront distribution that redirects to the actual tile server when a tile is missing. In that case the tile server is a Windshaft instance, which handles the redirected request and also writes the tile into the S3 bucket for next time.

Cache invalidation is tricky, and depends heavily on aspects of the data that these components won't necessarily know about. I think the above example deals with data that's updated infrequently and only by maintainers, so the cache invalidation strategy is "manually clear the bucket when necessary." The next-simplest approach would probably be adding a TTL parameter to the S3 bucket, though it's easy to imagine situations where that would be too aggressive and also ones where it would not be aggressive enough.
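For the TTL approach, the check itself is trivial; the hard part, as noted above, is choosing a TTL that matches how often the underlying data actually changes. A sketch (the function name is mine, not from any branch):

```javascript
// Treat a cached tile as expired once it is older than the TTL.
// lastModifiedMs would come from the S3 object's LastModified metadata.
function isStale(lastModifiedMs, nowMs, ttlSeconds) {
    return nowMs - lastModifiedMs > ttlSeconds * 1000;
}
```

In practice you would more likely configure an S3 lifecycle expiration rule on the bucket and let AWS delete old tiles, rather than checking object age on every request.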

@mattmeye
Author

mattmeye commented Oct 2, 2018

Thank you very much for your feedback and the link to that example. Currently I think it is possible to check invalidation with some logic in a Lambda@Edge function. I'm not an expert in the OSM data yet, but I think I'll be able to reach (or make reachable) the invalidation status in the Postgres database from inside a Lambda@Edge function too. Otherwise I will try to extend the OSM update process to store this information, or have that process delete the tile. I will work out some solutions and report back. I also noticed that Lambda@Edge functions are a lot cheaper than normal Lambda, so I will start by exploring that route.

@KlaasH
Collaborator

KlaasH commented Nov 26, 2018

The feature/kjh/s3-tile-cache branch makes some changes to api.js and adds additional Terraform config to get this mostly working.

The basic structure is:

  • Creates an S3 bucket to hold cached tiles
  • Configures an S3 Website to serve tiles from that bucket
  • Adds an origin to the CloudFront distribution to point to that S3 website
  • Configures a fallback behavior on the S3 website such that, when a tile is not found, the S3 website redirects back to the CloudFront distribution, adding a latest/ prefix to the request path
  • Adds a cache rule for the CloudFront distribution so that the API Gateway origin, which was formerly the only origin, now only handles requests that start with latest/
  • Adds code to api.js so that, if there is a CACHE_BUCKET configured in the environment, it writes all tiles to that bucket, using the request path as the key

So the effect is that the CloudFront distribution now serves tiles from S3 when they exist, with a seamless fallback to API Gateway/Lambda when they don't, and the Lambda function adds a tile to the cache the first time it's generated.
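The redirect-on-miss piece (the fourth bullet) can be expressed as an S3 website routing rule. A hedged sketch of what that might look like in Terraform, using the current AWS provider syntax; the resource names and domain are placeholders, not the actual branch's config:

```hcl
# Placeholder sketch: when the tile bucket's website endpoint returns a
# 404, redirect back to the CloudFront distribution with a latest/
# prefix so the request is routed to the API Gateway origin.
resource "aws_s3_bucket_website_configuration" "tile_cache" {
  bucket = aws_s3_bucket.tile_cache.id

  index_document {
    suffix = "index.html"
  }

  routing_rule {
    condition {
      http_error_code_returned_equals = "404"
    }
    redirect {
      host_name               = "tiles.example.com" # the CloudFront domain
      replace_key_prefix_with = "latest/"
    }
  }
}
```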

This is all great, but there's a fatal flaw: the S3 website uses the request path as the key but ignores the querystring. This means that all the configuration parameters we put in the querystring (layers/layers, filter/filters, utfFields, config, s3bucket) need to be converted to path parameters.

I will make issues for making that transition. In the meantime, S3 caching works to the extent that either 1) all the defaults are acceptable, so you can get the tiles you want with no querystring, or 2) you're comfortable fudging it because you're confident the parameters given in the querystring won't change, so the fact that they're ignored for caching purposes won't produce cached tiles that don't match the provided parameters. In other words, it doesn't completely fail, but it is broken.
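One hypothetical shape for that transition: fold the known querystring parameters into the path in a fixed order, so that requests with different parameters map to different S3 keys. Nothing here reflects an actual scheme from the branch; it is just one way the cache key could be made parameter-aware:

```javascript
// Hypothetical: encode known query parameters as name/value path
// segments in a fixed order, so the S3 key captures them.
const PATH_PARAMS = ['layers', 'filters', 'utfFields', 'config', 's3bucket'];

function toPathKey(tilePath, query) {
    const segments = PATH_PARAMS
        .filter((name) => query[name] !== undefined)
        .map((name) => `${name}/${encodeURIComponent(query[name])}`);
    // Tiles requested with no parameters keep their plain z/x/y key.
    return [...segments, tilePath].join('/');
}
```

The fixed parameter order matters: it ensures that two requests for the same tile with the same parameters always produce the same key, regardless of querystring ordering.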

Note also: it works to write tiles to S3 in local development, but not to read them from there. Or at least, I doubt it's possible to get an S3 website redirecting to localhost, and I didn't try.

@mattmeye
Author

mattmeye commented Nov 26, 2018 via email
