feat: generate initial collection-related STAC metadata TDE-1300 (#1124)
#### Motivation

Set up the metadata required for the standardising workflow:

- `linz-slug`: required for Basemaps URL creation; will also be used for generating the ODR path.
- `collection-id`: required for creating STAC metadata.

#### Modification

Add a new command, `stac-setup`, to create these parameter files for Argo
Workflows to use.
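
As an illustration of how a later step might consume these files, here is a minimal sketch. It assumes the default output location `/tmp/stac-setup/` as a local path and the file names `collection-id` and `linz-slug` listed in the README. `readStacSetupOutputs`, `StacSetupOutputs` and `ReadText` are hypothetical names, not part of argo-tasks.

```typescript
import { readFileSync } from 'node:fs';
import { join } from 'node:path';

/** Values written by `stac-setup`, one per file (hypothetical shape). */
interface StacSetupOutputs {
  collectionId: string;
  linzSlug: string;
}

/** Reads a UTF-8 text file; injectable so the helper can run without touching disk. */
type ReadText = (path: string) => string;

/**
 * Read the two parameter files from the output directory.
 * Assumption: the directory is a local path, not a remote `fsa` location.
 */
function readStacSetupOutputs(
  dir = '/tmp/stac-setup/',
  readText: ReadText = (path) => readFileSync(path, 'utf8'),
): StacSetupOutputs {
  return {
    // Trim in case a consumer or editor appended a trailing newline.
    collectionId: readText(join(dir, 'collection-id')).trim(),
    linzSlug: readText(join(dir, 'linz-slug')).trim(),
  };
}
```

In an Argo Workflows template the same files would more likely be exposed as output parameters read from those paths; the helper above only shows the file layout.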

The `stac-setup` command is based on [the generate-path
command](https://github.com/linz/argo-tasks/tree/master/src/commands/generate-path#generate-path).
Once the `linz:slug` field has been populated throughout existing
collections, the `generate-path` command will be updated to use the slug
for ODR path generation.

#### Checklist

- [x] Tests updated
- [x] Docs updated
- [x] Issue linked in Title

---------

Co-authored-by: paulfouquet <[email protected]>
amfage and paulfouquet authored Nov 6, 2024
1 parent 68940d5 commit e30dff9
Showing 7 changed files with 504 additions and 15 deletions.
31 changes: 16 additions & 15 deletions COMMANDS.md
@@ -1,15 +1,16 @@
| Command | Description |
| ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
| [copy](./src/commands/copy/README.md) | Copy a manifest of files |
| [create-manifest](./src/commands/create-manifest/README.md) | Create a list of files to copy and pass as a manifest |
| [group](./src/commands/group/README.md) | group a array of inputs into a set |
| [lds-fetch-layer](./src/commands/lds-fetch-layer/README.md) | Download a LDS layer from the LDS Cache |
| [list](./src/commands/list/README.md) | List and group files into collections of tasks |
| [mapsheet-coverage](./src/commands/mapsheet-coverage/README.md) | Create a list of mapsheets needing to be created from a basemaps configuration |
| [stac-catalog](./src/commands/stac-catalog/README.md) | Construct STAC catalog |
| [stac-github-import](./src/commands/stac-github-import/README.md) | Format and push a STAC collection.json file and Argo Workflows parameters file to a GitHub repository |
| [stac-sync](./src/commands/stac-sync/README.md) | Sync STAC files |
| [stac-validate](./src/commands/stac-validate/README.md) | Validate STAC files |
| [tileindex-validate](./src/commands/tileindex-validate/README.md) | List input files and validate there are no duplicates. |
| [pretty-print](./src/commands/pretty-print/README.md) | Pretty print JSON files |
| [generate-path](./src/commands/generate-path/README.md) | Generate target path from collection metadata |
| Command | Description |
| ----------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
| [copy](./src/commands/copy/README.md) | Copy a manifest of files |
| [create-manifest](./src/commands/create-manifest/README.md) | Create a list of files to copy and pass as a manifest |
| [group](./src/commands/group/README.md) | group a array of inputs into a set |
| [lds-fetch-layer](./src/commands/lds-fetch-layer/README.md) | Download a LDS layer from the LDS Cache |
| [list](./src/commands/list/README.md) | List and group files into collections of tasks |
| [mapsheet-coverage](./src/commands/mapsheet-coverage/README.md) | Create a list of mapsheets needing to be created from a basemaps configuration |
| [stac-catalog](./src/commands/stac-catalog/README.md) | Construct STAC catalog |
| [stac-github-import](./src/commands/stac-github-import/README.md) | Format and push a STAC collection.json file and Argo Workflows parameters file to a GitHub repository |
| [stac-setup](./src/commands/stac-setup/README.md) | Collection-related STAC metadata setup. Outputs collection-id and linz-slug files within /tmp/stac-setup/ |
| [stac-sync](./src/commands/stac-sync/README.md) | Sync STAC files |
| [stac-validate](./src/commands/stac-validate/README.md) | Validate STAC files |
| [tileindex-validate](./src/commands/tileindex-validate/README.md) | List input files and validate there are no duplicates. |
| [pretty-print](./src/commands/pretty-print/README.md) | Pretty print JSON files |
| [generate-path](./src/commands/generate-path/README.md) | Generate target path from collection metadata |
3 changes: 3 additions & 0 deletions src/commands/index.ts
@@ -13,6 +13,7 @@ import { commandMapSheetCoverage } from './mapsheet-coverage/mapsheet.coverage.j
import { commandPrettyPrint } from './pretty-print/pretty.print.js';
import { commandStacCatalog } from './stac-catalog/stac.catalog.js';
import { commandStacGithubImport } from './stac-github-import/stac.github.import.js';
import { commandStacSetup } from './stac-setup/stac.setup.js';
import { commandStacSync } from './stac-sync/stac.sync.js';
import { commandStacValidate } from './stac-validate/stac.validate.js';
import { commandTileIndexValidate } from './tileindex-validate/tileindex.validate.js';
@@ -28,6 +29,7 @@ export const AllCommands = {
'mapsheet-coverage': commandMapSheetCoverage,
'stac-catalog': commandStacCatalog,
'stac-github-import': commandStacGithubImport,
'stac-setup': commandStacSetup,
'stac-sync': commandStacSync,
'stac-validate': commandStacValidate,
'tileindex-validate': commandTileIndexValidate,
@@ -36,6 +38,7 @@ export const AllCommands = {
cmds: {
catalog: commandStacCatalog,
'github-import': commandStacGithubImport,
setup: commandStacSetup,
sync: commandStacSync,
validate: commandStacValidate,
},
29 changes: 29 additions & 0 deletions src/commands/stac-setup/README.md
@@ -0,0 +1,29 @@
# stac-setup

Collection-related STAC metadata setup. Outputs collection-id and linz-slug files within /tmp/stac-setup/

## Usage

stac-setup <options>

### Options

| Usage | Description | Options |
| ------------------------------ | ------------------------------------------ | -------------------------------- |
| --config <str> | Location of role configuration file | optional |
| --start-year <str> | Start year of survey capture | optional |
| --end-year <str> | End year of survey capture | optional |
| --gsd <str> | GSD of dataset | |
| --region <str> | Region of dataset | |
| --geographic-description <str> | Geographic description of dataset | |
| --geospatial-category <str> | Geospatial category of dataset | |
| --odr-url <str> | Open Data Registry URL of existing dataset | optional |
| --output <value> | Where to store output files | default: file:///tmp/stac-setup/ |

### Flags

| Usage | Description | Options |
| --------- | --------------- | ------- |
| --verbose | Verbose logging | |

<!-- This file has been autogenerated by src/readme/readme.generate.ts -->
49 changes: 49 additions & 0 deletions src/commands/stac-setup/__test__/sample.ts
@@ -0,0 +1,49 @@
import { StacCollection } from 'stac-ts';

import { StacCollectionLinz } from '../stac.setup.js';

export const SampleCollection: StacCollection & StacCollectionLinz = {
type: 'Collection',
stac_version: '1.0.0',
id: '01HGF4RAQSM53Z26Y7C27T1GMB',
title: 'Palmerston North 0.3m Storm Satellite Imagery (2024) - Preview',
description:
'Satellite imagery within the Manawatū-Whanganui region captured in 2024, published as a record of the Storm event.',
license: 'CC-BY-4.0',
links: [
{ rel: 'self', href: './collection.json', type: 'application/json' },
{
rel: 'item',
href: './BA34_1000_3040.json',
type: 'application/json',
},
{
rel: 'item',
href: './BA34_1000_3041.json',
type: 'application/json',
},
],
providers: [
{ name: 'Aerial Surveys', roles: ['producer'] },
{ name: 'Aerial Surveys', roles: ['licensor'] },
{
name: 'Toitū Te Whenua Land Information New Zealand',
roles: ['host', 'processor'],
},
],
'linz:lifecycle': 'preview',
'linz:geospatial_category': 'urban-aerial-photos',
'linz:region': 'manawatu-whanganui',
'linz:security_classification': 'unclassified',
'linz:event_name': 'Storm',
'linz:geographic_description': 'Palmerston North',
'linz:slug': 'palmerston-north_2024_0.3m',
extent: {
spatial: {
bbox: [[175.4961876, -36.8000575, 175.5071491, -36.7933469]],
},
temporal: {
interval: [['2022-12-31T11:00:00Z', '2022-12-31T11:00:00Z']],
},
},
};
220 changes: 220 additions & 0 deletions src/commands/stac-setup/__test__/stac.setup.test.ts
@@ -0,0 +1,220 @@
import assert from 'node:assert';
import { afterEach, before, describe, it } from 'node:test';

import { fsa } from '@chunkd/fs';
import { FsMemory } from '@chunkd/source-memory';

import { commandStacSetup } from '../stac.setup.js';
import { formatDate, slugFromMetadata, SlugMetadata } from '../stac.setup.js';
import { SampleCollection } from './sample.js';

describe('stac-setup', () => {
const mem = new FsMemory();

before(() => {
fsa.register('memory://', mem);
});

afterEach(() => {
mem.files.clear();
});

it('should retrieve setup from collection', async () => {
const baseArgs = {
addDateInSlug: true,
odrUrl: 'memory://collection.json',
output: new URL('memory://tmp/stac-setup/'),
verbose: false,
startYear: '2013',
endYear: '2014',
gsd: '1',
region: 'gisborne',
geographicDescription: 'Wairoa',
geospatialCategory: 'dem',
config: undefined,
} as const;
await fsa.write('memory://collection.json', JSON.stringify(structuredClone(SampleCollection)));
await commandStacSetup.handler(baseArgs);

const files = await fsa.toArray(fsa.list('memory://tmp/stac-setup/'));
files.sort();
assert.deepStrictEqual(files, ['memory://tmp/stac-setup/collection-id', 'memory://tmp/stac-setup/linz-slug']);
const slug = await fsa.read('memory://tmp/stac-setup/linz-slug');
assert.strictEqual(slug.toString(), 'palmerston-north_2024_0.3m');
const collectionId = await fsa.read('memory://tmp/stac-setup/collection-id');
assert.strictEqual(collectionId.toString(), '01HGF4RAQSM53Z26Y7C27T1GMB');
});

it('should retrieve setup from args', async () => {
const baseArgs = {
addDateInSlug: true,
odrUrl: '',
output: new URL('memory://tmp/stac-setup/'),
verbose: false,
startYear: '2013',
endYear: '2014',
gsd: '1',
region: 'gisborne',
geographicDescription: 'Wairoa',
geospatialCategory: 'dem',
config: undefined,
} as const;
await commandStacSetup.handler(baseArgs);

const files = await fsa.toArray(fsa.list('memory://tmp/stac-setup/'));
files.sort();
assert.deepStrictEqual(files, ['memory://tmp/stac-setup/collection-id', 'memory://tmp/stac-setup/linz-slug']);
const slug = await fsa.read('memory://tmp/stac-setup/linz-slug');
assert.strictEqual(slug.toString(), 'wairoa_2013-2014');
const collectionId = await fsa.read('memory://tmp/stac-setup/collection-id');
assert.notStrictEqual(collectionId.toString(), '01HGF4RAQSM53Z26Y7C27T1GMB');
});

it('should not include the date in the slug', async () => {
const baseArgs = {
odrUrl: '',
output: new URL('memory://tmp/stac-setup/'),
verbose: false,
startYear: '',
endYear: '',
gsd: '10',
region: 'new-zealand',
geographicDescription: '',
geospatialCategory: 'dem',
config: undefined,
} as const;
await commandStacSetup.handler(baseArgs);

const files = await fsa.toArray(fsa.list('memory://tmp/stac-setup/'));
files.sort();
assert.deepStrictEqual(files, ['memory://tmp/stac-setup/collection-id', 'memory://tmp/stac-setup/linz-slug']);
const slug = await fsa.read('memory://tmp/stac-setup/linz-slug');
assert.strictEqual(slug.toString(), 'new-zealand');
});
});

describe('GenerateSlugImagery', () => {
it('Should match - urban with geographic description', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'urban-aerial-photos',
geographicDescription: 'Napier',
region: 'hawkes-bay',
date: '2017-2018',
gsd: '0.05',
};
assert.equal(slugFromMetadata(metadata), 'napier_2017-2018_0.05m');
});
it('Should match - rural with geographic description', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'rural-aerial-photos',
geographicDescription: 'North Island Weather Event',
region: 'hawkes-bay',
date: '2023',
gsd: '0.25',
};
assert.equal(slugFromMetadata(metadata), 'north-island-weather-event_2023_0.25m');
});
it('Should match - region as no optional metadata', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'urban-aerial-photos',
geographicDescription: undefined,
region: 'auckland',
date: '2023',
gsd: '0.3',
};
assert.equal(slugFromMetadata(metadata), 'auckland_2023_0.3m');
});
});

describe('GenerateSlugElevation', () => {
it('Should match - dem (no optional metadata)', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'dem',
geographicDescription: undefined,
region: 'auckland',
date: '2023',
gsd: '10',
};
assert.equal(slugFromMetadata(metadata), 'auckland_2023');
});
it('Should match - dsm (no optional metadata)', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'dsm',
geographicDescription: undefined,
region: 'auckland',
date: '2023',
gsd: '10',
};
assert.equal(slugFromMetadata(metadata), 'auckland_2023');
});
});

describe('GenerateSlugSatelliteImagery', () => {
it('Should match - geographic description & event', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'satellite-imagery',
geographicDescription: 'North Island Cyclone Gabrielle',
region: 'new-zealand',
date: '2023',
gsd: '0.5',
};
assert.equal(slugFromMetadata(metadata), 'north-island-cyclone-gabrielle_2023_0.5m');
});
});

describe('GenerateSlugHistoricImagery', () => {
it('Should error as historic imagery geospatial category is not supported', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'scanned-aerial-photos',
geographicDescription: undefined,
region: 'wellington',
date: '1963',
gsd: '1',
};
assert.throws(() => {
slugFromMetadata(metadata);
}, Error('Historic Imagery scanned-aerial-photos is out of scope for automated slug generation.'));
});
});

describe('GenerateSlugUnknownGeospatialCategory', () => {
it('Should error as is not a matching geospatial category.', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'scanned-aerial-imagery',
geographicDescription: undefined,
region: 'wellington',
date: '1963',
gsd: '1',
};
assert.throws(() => {
slugFromMetadata(metadata);
}, Error("Slug can't be generated from collection as no matching category: scanned-aerial-imagery."));
});
});

describe('GenerateSlugDemIgnoringDate', () => {
it('Should not include the date in the slug', () => {
const metadata: SlugMetadata = {
geospatialCategory: 'dem',
geographicDescription: 'new-zealand',
region: 'new-zealand',
date: '',
gsd: '10',
};
assert.equal(slugFromMetadata(metadata), 'new-zealand');
});
});

describe('formatDate', () => {
it('Should return date as single year', async () => {
const startYear = '2023';
const endYear = '2023';
assert.equal(formatDate(startYear, endYear), '2023');
});

it('Should return date as two years', async () => {
const startYear = '2023';
const endYear = '2024';
assert.equal(formatDate(startYear, endYear), '2023-2024');
});
});
9 changes: 9 additions & 0 deletions src/commands/stac-setup/category.constants.ts
@@ -0,0 +1,9 @@
export const dataCategories = {
AERIAL_PHOTOS: 'aerial-photos',
SCANNED_AERIAL_PHOTOS: 'scanned-aerial-photos',
RURAL_AERIAL_PHOTOS: 'rural-aerial-photos',
SATELLITE_IMAGERY: 'satellite-imagery',
URBAN_AERIAL_PHOTOS: 'urban-aerial-photos',
DEM: 'dem',
DSM: 'dsm',
};
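
The tests in `stac.setup.test.ts` pin down the slug format fairly tightly, so the rules can be sketched from them. The block below is a reconstruction consistent with those test cases, not the contents of `stac.setup.ts`, which this page does not show. `toSlug` is a hypothetical, simplified stand-in for whatever slugifier the real command uses (it only lowercases and replaces whitespace), and treating `aerial-photos` like the other imagery categories is an assumption.

```typescript
// Categories copied from category.constants.ts in this commit.
const dataCategories = {
  AERIAL_PHOTOS: 'aerial-photos',
  SCANNED_AERIAL_PHOTOS: 'scanned-aerial-photos',
  RURAL_AERIAL_PHOTOS: 'rural-aerial-photos',
  SATELLITE_IMAGERY: 'satellite-imagery',
  URBAN_AERIAL_PHOTOS: 'urban-aerial-photos',
  DEM: 'dem',
  DSM: 'dsm',
} as const;

// Field names follow the SlugMetadata objects used in the tests.
interface SlugMetadata {
  geospatialCategory: string;
  geographicDescription?: string;
  region: string;
  date: string;
  gsd: string;
}

/** A single year when start and end match, otherwise `start-end`. */
function formatDate(startYear: string, endYear: string): string {
  return startYear === endYear ? startYear : `${startYear}-${endYear}`;
}

/** Hypothetical, simplified slugifier: lowercase and hyphenate whitespace. */
function toSlug(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, '-');
}

function slugFromMetadata(metadata: SlugMetadata): string {
  // Fall back to the region when no geographic description is given.
  const name = toSlug(metadata.geographicDescription || metadata.region);
  // An empty date is left out of the slug entirely.
  const nameAndDate = metadata.date ? `${name}_${metadata.date}` : name;

  switch (metadata.geospatialCategory) {
    case dataCategories.AERIAL_PHOTOS: // assumption: handled like other imagery
    case dataCategories.RURAL_AERIAL_PHOTOS:
    case dataCategories.SATELLITE_IMAGERY:
    case dataCategories.URBAN_AERIAL_PHOTOS:
      // Imagery slugs carry the GSD in metres.
      return `${nameAndDate}_${metadata.gsd}m`;
    case dataCategories.DEM:
    case dataCategories.DSM:
      // Elevation slugs omit the GSD.
      return nameAndDate;
    case dataCategories.SCANNED_AERIAL_PHOTOS:
      throw new Error(
        `Historic Imagery ${metadata.geospatialCategory} is out of scope for automated slug generation.`,
      );
    default:
      throw new Error(
        `Slug can't be generated from collection as no matching category: ${metadata.geospatialCategory}.`,
      );
  }
}
```

Under these rules the "retrieve setup from args" test case (`Wairoa`, `2013` to `2014`, `dem`) yields `wairoa_2013-2014`, matching the assertion in the test file.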