-
-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Option to export and save images and attachments #27
Comments
The JSON for a PDF attachment looks like this: [
{
"id": "attHd9sanAvr8PwGk",
"url": "https://v5.airtableusercontent.com/v1/20/20/1694980800000/eql_yHjtzp6IHoic_KROvw/FjJhJto96zC-1yxspRtNNa_WjWcU8_L2xJDUJd65Hj530F56TDTkxrSUjaUfLfSibHmxEwOkf3rcsoANvGaO0pl2YVUb00YFOlSLhEq_5gA/zsiKz8BHmXq87lkH3o4j2AgV0NTuUsZIcyT5JduLMo0",
"filename": "ardata.pdf",
"size": 506898,
"type": "application/pdf",
"thumbnails": {
"small": {
"url": "https://v5.airtableusercontent.com/v1/20/20/1694980800000/BScJEWFsxF5MOcZJ_i8RsA/tPGX0GFgMelnlR88d41c6DJDl9pa9E3M5y3123HrDPrj_A4Kocl2StxwBLsb3dFSZrwIjcw5csCI0DQZ5d_5vA/zI1iLniIAYc-oZhHonurjZ7h2pu5vxlzbR1NE2DMTx0",
"width": 28,
"height": 36
},
"large": {
"url": "https://v5.airtableusercontent.com/v1/20/20/1694980800000/QMvnNnH-ftXXOeBRYJk4fQ/io3cJpspKnpka1q-fAhsEiJCwPmReOow7NeGPl96ogH5HFXtqxQvge85hcz3EIzPJ7rr0-uzdAUoDWfPi4a0mw/mpJk4qmSys6QE8IqSp4whgrc-52UiOR1DD9X7rIiTh8",
"width": 396,
"height": 512
}
}
}
] For an image, this: [
{
"id": "attwPzAABrX7mUpdl",
"width": 4032,
"height": 3024,
"url": "https://v5.airtableusercontent.com/v1/20/20/1694937600000/-Ex6TxOvcFa-QRIDtsSZwg/kZ2Id34JwYwhYwqWUbhlqMLntfqEbe8J95Gnb1eXswOMlCh8lvxIujNMOojCSCJ8l5ZMO1UKsPPNZSLusJHcja8tkPPBnrBtvEpfof_ti6c/2UH1gYdWPHDAKGY1qe97mUSxskh2Q2uPKh_79MRdXmI",
"filename": "2023-05-24 18.54.28.jpg",
"size": 5237935,
"type": "image/jpeg",
"thumbnails": {
"small": {
"url": "https://v5.airtableusercontent.com/v1/20/20/1694937600000/4vruo5I_9EIlxlN7KPLozg/TFof3GLJo3MMxVLeEYqYAL7-g8RHHpD0mn3KzJB-ffPvKV9KXg8wVk_2onpLARuugA87bCEwLdhqMthHFoFr4g/6dCCAksO2neaFKqtEu5Tws0-CRuXrm8OP21XlHWYTFM",
"width": 48,
"height": 36
},
"large": {
"url": "https://v5.airtableusercontent.com/v1/20/20/1694937600000/Sjt3pZJeeovYOiVtWEBuMQ/VJx0IHFStQ1CEvOKsaoeMixItvcw3gVQ-MxYDt60q98S9iAOiSxFmw47tUdd-ozsnPJ-bjOAJ8NxOb2eLcLdyQ/EKsuXWiXUzmiFedTTUmnx46s9JFnZXkq2UXcENSPyT8",
"width": 683,
"height": 512
},
"full": {
"url": "https://v5.airtableusercontent.com/v1/20/20/1694937600000/uQVk3ZjVPk42MXgkdZ3Cjw/vFhGv1qIgRKSfC8Dah5TRgUqNSFxGW0UUc82KhUFjIt62kyT8Qr173dK91Y7R_keQEohFm25jgdWODR4LvVKMg/4yrXgtkH8nwhC4EsQ0oaQxrseOIQowDsRtOt0HbIRFA",
"width": 3000,
"height": 3000
}
}
}
] |
What filename should I use? Two options: there's the Maybe the answer is to store them like this:
Or like this:
I'm leaning towards the latter - the airtable ID, then a hyphen, then the filename. |
Are attachments unique to their tables or bases? Probably, but I can't be 100% sure. |
Maybe the CLI grows a Or it could be a flag, so you just pass I think I like the latter option better. I'll build that first. |
I'm going to download the attachments as a second step, with a progress bar based on the total count of bytes to fetch. Should I fetch the thumbnails too? I think so - even though that will slightly throw off the progress bar since I won't have the file size of those (it's not in the JSON). I'll store those as I'll use |
Huh, no at least one of the PDF thumbnails turned out to be a PNG. I guess I should set the file extension based on the detected content type. |
One slightly tricky thing is that the YAML/JSON flat files may end up with URLs to attachments which don't directly correspond to the files on disk. Not sure the best way to fix that. The stored YAML looks like this: - Attachments:
- filename: 2022-07-26 11.13.58.jpg
height: 3024
id: atttqayAwklSIehzC
size: 5471198
thumbnails:
full:
height: 3000
url: https://v5.airtableusercontent.com/v1/20/20/1694930400000/LhLoyhSixZRtUMOIW4rDmQ/diboUo14Z7KGQWOvyTPvG11qQ4FtV-xtDLDMXdETthmjqdM1Aysp-Gwa1f9lvR8ojVSURy1pw1VaL9SEDLQWxQ/_30sml0SbUmdNgkW9moBIQX9xSBVGMG6iLbCvSwIhgE
width: 3000
large:
height: 512
url: https://v5.airtableusercontent.com/v1/20/20/1694930400000/_afO19TInFwIhO4-QVV-FQ/S-JgJs8MHCKUbV8scsZE4tZKwTrCl1yxq5KCFe0O97yDnSO-4aKqPKvDrv2NX0JRjeD4ecwhgnJJepTyF3BHQg/_G1mqoQr68W8-3QYo-9f2X2eiDzcDBumSDV_7DlerEM
width: 683
small:
height: 36
url: https://v5.airtableusercontent.com/v1/20/20/1694930400000/17No0cBf3trPUv_1wywegw/360f3uc0uxEj5BCu0pqb4l7RrR6LgYy4RDLImLyhCxgbFQ6g2Ztss-yU1ouxU5BqTpqFP5tI0OWM6PhtnZdlFw/XUp16QgEVA0RvjU94Lm3HZYre_l0a83WxzPAmead57o
width: 48
type: image/jpeg
url: https://v5.airtableusercontent.com/v1/20/20/1694930400000/uZehbrwpqTtVT3f01w7cyQ/z6MGnx2f-cmr8Mrkwnj9m74ImZUwBUDOkYLPf9aAWhAhpEa6JLh7ydwJU_aDvU7B0nyi0_kfP1NERTq0IQz3eOYgrjCJh-z593y8WAzLawo/WuDrxvXaLSmGtf44yeduCy3pbsuwc-YTsvKGEsXpaaE
width: 4032 I could rewrite this to insert |
So I think the full process is:
|
I don't particularly want to have to round-trip the YAML/JSON formats, so I think I'll keep all that data in memory, snapshot it to disk, then do the attachment downloads and snapshot each file to disk again with the extra fields after the attachments have been saved. |
Better option: do this on a per-table basis - so it loads a table into memory, snaps it to disk, saves any attachments, updates the snapshot on disk and then moves on to the next table. I still like the early snapshot because I like that the user won't lose data if they cancel the operation during a large attachment download. |
👋 Hi @simonw - thank you for this great repo. I was wondering if you'd be open to a PR for downloading attachments. I have a pretty simple MVP in this commit that works by downloading files after downloading all records. WDYT? It's slightly hacky in that it does not rely on |
https://support.airtable.com/docs/airtable-attachment-url-behavior
This tool should grow a feature to download and save attachments, since the URLs are no longer a useful long-term reference.
How long until they expire? The docs say:
The text was updated successfully, but these errors were encountered: