feat(website): Column remapping when submitting metadata files #3478

Merged 91 commits on Jan 22, 2025
Changes from 90 commits
62b3b1b
Added stubs etc
fhennig Dec 23, 2024
65ebb5c
Added ColumnMappingModal stub
fhennig Dec 23, 2024
63da2d2
preventDefault to prevent form submit
fhennig Dec 23, 2024
ce2bb90
add useEffect for loading
fhennig Dec 23, 2024
a19567d
Add input column loading
fhennig Dec 23, 2024
d741847
pull through metadata template fields
fhennig Dec 23, 2024
e40a110
Use table layout
fhennig Dec 23, 2024
0f21dfd
Add inputs
fhennig Dec 23, 2024
cdca763
Progress
fhennig Dec 23, 2024
2038e4c
Update button style
fhennig Dec 23, 2024
303e4ab
Add todo
fhennig Dec 23, 2024
4809038
fix stuff
fhennig Dec 24, 2024
946c9bf
fix check
fhennig Dec 24, 2024
ee765cc
Add TODO
fhennig Dec 24, 2024
73880b6
style
fhennig Dec 24, 2024
53ff781
pre-map columns if direct match
fhennig Dec 26, 2024
8704764
.
fhennig Dec 26, 2024
72ab607
Move ColumnMapping type
fhennig Dec 26, 2024
b6d0f9f
extract class
fhennig Dec 26, 2024
f2a03e4
Move the 'apply' function
fhennig Dec 26, 2024
4c8899a
Add docs
fhennig Dec 26, 2024
d68a629
Rename columns
fhennig Dec 31, 2024
9fa3954
Pull through display names
fhennig Dec 31, 2024
92d58d0
Added some tests for ColumnMapping
fhennig Jan 1, 2025
5ebf479
improve test
fhennig Jan 1, 2025
09450b1
test fix
fhennig Jan 7, 2025
a5410be
remove obsolete TODO
fhennig Jan 7, 2025
0ec7332
Add failing test
fhennig Jan 7, 2025
0c1a162
Fix test
fhennig Jan 7, 2025
55a4c93
Use string similarity
fhennig Jan 16, 2025
2cda025
Improve texts on buttons, add discard option
fhennig Jan 7, 2025
dd33395
Improve error handling
fhennig Jan 7, 2025
5ed0b13
Use ProcessedFile in form prop
fhennig Jan 8, 2025
d53a323
.
fhennig Jan 8, 2025
376585b
Add CompressedFile
fhennig Jan 8, 2025
3472f8d
column mapping should work on compressed files
fhennig Jan 8, 2025
0aa1a4b
Various fixes
fhennig Jan 8, 2025
b153f5b
Change direction in ColumnMapping
fhennig Jan 8, 2025
56bc110
Change direction
fhennig Jan 8, 2025
fa908c6
foo
fhennig Jan 8, 2025
ef8478c
Adjust threshold
fhennig Jan 8, 2025
40b1061
Fix Metadata upload file not being aligned with label
fhennig Jan 13, 2025
e44baa2
Add tooltip
fhennig Jan 13, 2025
aa40de4
UI fix
fhennig Jan 13, 2025
853744e
format
fhennig Jan 13, 2025
870a666
TODO
fhennig Jan 13, 2025
8599c82
Use Listbox to allow for more styling
fhennig Jan 14, 2025
2dbd4d5
Refine styling
fhennig Jan 14, 2025
b5a9915
Add sectioning into headers, tooltips
fhennig Jan 14, 2025
a5b4244
Some fixes
fhennig Jan 14, 2025
9f50f2a
refactoring
fhennig Jan 14, 2025
6eaebe5
Prevent mapping from being saved if a field is missing.
fhennig Jan 14, 2025
76a4a31
Prevent duplicate target columns in column mapping
fhennig Jan 14, 2025
147fbcb
Add test
fhennig Jan 14, 2025
6005adb
Update website/src/utils/groupFieldsByHeader.ts
fhennig Jan 15, 2025
378e699
.
fhennig Jan 15, 2025
47d900e
Refactor groupedInputFields
fhennig Jan 15, 2025
83de86e
Add some docs
fhennig Jan 15, 2025
c60f381
Add testing
fhennig Jan 15, 2025
dfbc50a
Fix warning
fhennig Jan 15, 2025
2aff3cf
trigger preview (empty commit)
fhennig Jan 16, 2025
983548b
Update website/src/components/Submission/DataUploadForm.tsx
fhennig Jan 16, 2025
85badfc
Inline string-similarity-js
fhennig Jan 16, 2025
03b0e5e
formatting
fhennig Jan 16, 2025
6a6443a
reset packages to base
fhennig Jan 16, 2025
6688cbd
layout improvements
fhennig Jan 16, 2025
a84dfe6
Delay showing tooltip to prevent flicker while scrolling
fhennig Jan 16, 2025
ccb76de
fixes
fhennig Jan 16, 2025
fb39204
Terminate last line with newline
fhennig Jan 16, 2025
f218190
Address some review
fhennig Jan 17, 2025
be7b42d
Simplify ColumnMapping constructor
fhennig Jan 17, 2025
d8d1616
Rename parameter
fhennig Jan 17, 2025
e8f4f31
Always show tooltip
fhennig Jan 17, 2025
d87c684
use csv library and include multi line cell in test
fhennig Jan 17, 2025
a899740
more fixes
fhennig Jan 17, 2025
46435ce
Mark more fields as desired
fhennig Jan 17, 2025
dbc7bcd
highlight non-exact matches
fhennig Jan 17, 2025
97e9ecb
use papaparse
fhennig Jan 17, 2025
a86dc8f
keep order intact
fhennig Jan 17, 2025
b7ba80b
format
fhennig Jan 17, 2025
3dc3557
Remove outdated todo
fhennig Jan 20, 2025
a4cefd8
Remove duplicates
fhennig Jan 21, 2025
567e846
Fix compression issues with tsv
fhennig Jan 21, 2025
25d7f07
merge main
fhennig Jan 21, 2025
170dc65
foo
fhennig Jan 21, 2025
fb8221d
install papaparse types
fhennig Jan 21, 2025
fb12ffb
Merge branch 'main' into column-remapping
fhennig Jan 21, 2025
f884e1e
Close dialog on error
fhennig Jan 21, 2025
39489ae
Prevent tooltip opening
fhennig Jan 21, 2025
c3bba64
format
fhennig Jan 21, 2025
49108a0
remove TODO
fhennig Jan 22, 2025
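
Several commits above ("pre-map columns if direct match", "Use string similarity", "Adjust threshold", "Prevent duplicate target columns in column mapping") suggest that input columns are pre-assigned to expected fields by name similarity. The sketch below shows one way that could work, assuming a bigram Sørensen-Dice score and a threshold of 0.8. The function names, the scoring details, the greedy assignment order, and the threshold value are all illustrative assumptions; none of them are taken from the PR's code.

```typescript
// Illustrative sketch only: names, scoring, and threshold are assumptions, not the PR's code.

/** Lowercase a column name and drop whitespace so "Host Name" and "hostname" compare equal. */
function normalize(name: string): string {
    return name.toLowerCase().replace(/\s+/g, '');
}

/** Count the two-character substrings (bigrams) of an already normalized string. */
function bigramCounts(text: string): Record<string, number> {
    const counts: Record<string, number> = {};
    for (let i = 0; i < text.length - 1; i++) {
        const pair = text.slice(i, i + 2);
        counts[pair] = (counts[pair] ?? 0) + 1;
    }
    return counts;
}

/** Sørensen-Dice coefficient over bigrams: 2 * shared / (bigrams in a + bigrams in b). */
function diceSimilarity(a: string, b: string): number {
    const left = normalize(a);
    const right = normalize(b);
    if (left.length < 2 || right.length < 2) {
        // Too short to form a bigram: fall back to exact comparison.
        return left === right ? 1 : 0;
    }
    const leftCounts = bigramCounts(left);
    const rightCounts = bigramCounts(right);
    let shared = 0;
    for (const pair of Object.keys(leftCounts)) {
        shared += Math.min(leftCounts[pair] ?? 0, rightCounts[pair] ?? 0);
    }
    return (2 * shared) / (left.length - 1 + (right.length - 1));
}

type ProposedMapping = { targetColumn: string; sourceColumn: string | null }[];

/**
 * For each expected column, greedily pick the best unused input column whose
 * score reaches the threshold. Each input column is assigned at most once.
 */
function proposeMapping(inputColumns: string[], targetColumns: string[], threshold = 0.8): ProposedMapping {
    const used: Record<string, boolean> = {};
    return targetColumns.map((targetColumn) => {
        let best: string | null = null;
        let bestScore = 0;
        for (const candidate of inputColumns) {
            if (used[candidate] === true) continue;
            const score = diceSimilarity(candidate, targetColumn);
            if (score >= threshold && score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        if (best !== null) used[best] = true;
        return { targetColumn, sourceColumn: best };
    });
}
```

Because the assignment is greedy in target order, an earlier target can claim a column that a later target matches more closely; a real implementation might resolve exact matches first.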
1 change: 1 addition & 0 deletions docs/src/content/docs/for-users/submit-sequences.md
@@ -10,6 +10,7 @@ Loculus expects:

- Sequence data in `fasta` format with a unique submissionID per sequence.
- Metadata in `tsv` format for each sequence. If you upload through the Website, you can also use Excel files (`xls` or `xlsx` format). If you need help formatting metadata, there is a metadata template for each organism on the submission page.
+ You can also map columns in your file to the expected upload column names by clicking the 'Add column mapping' button.

![Metadata template.](../../../assets/MetadataTemplate.png)

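
The docs line added above describes mapping columns in the uploaded file to the expected upload column names. Conceptually this amounts to renaming cells in the header row. The sketch below illustrates that on raw TSV text; `renameTsvHeader` and `HeaderRenames` are hypothetical names, and splitting the header on tabs is a simplification that does not handle quoted cells (this PR adds `papaparse` as a dependency, which would be the more robust route).

```typescript
// Illustrative sketch only: a hypothetical helper, not the PR's ColumnMapping code.

/** Maps a column name found in the uploaded file to the column name the upload expects. */
type HeaderRenames = Record<string, string>;

/** Rename header cells of TSV text; data rows and unmapped columns are left untouched. */
function renameTsvHeader(tsv: string, renames: HeaderRenames): string {
    const newlineIndex = tsv.indexOf('\n');
    let headerLine = newlineIndex === -1 ? tsv : tsv.slice(0, newlineIndex);
    const rest = newlineIndex === -1 ? '' : tsv.slice(newlineIndex);

    // Preserve a Windows-style carriage return at the end of the header line.
    let lineEnding = '';
    if (headerLine.charAt(headerLine.length - 1) === '\r') {
        lineEnding = '\r';
        headerLine = headerLine.slice(0, -1);
    }

    const renamed = headerLine
        .split('\t')
        .map((column) => (Object.prototype.hasOwnProperty.call(renames, column) ? renames[column] : column));
    return renamed.join('\t') + lineEnding + rest;
}

// Example with made-up input column names on the left-hand side.
const remapped = renameTsvHeader('id\tcollection date\tcountry\ns1\t2024-01-01\tCH\n', {
    'id': 'submissionId',
    'collection date': 'sampleCollectionDate',
});
console.log(remapped);
```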
16 changes: 16 additions & 0 deletions kubernetes/loculus/values.yaml
@@ -486,6 +486,7 @@ defaultOrganismConfig: &defaultOrganismConfig
- Zaire
- name: geoLocAdmin1
displayName: Collection subdivision level 1
+ desired: true
generateIndex: true
autocomplete: true
initiallyVisible: true
@@ -494,11 +495,13 @@ defaultOrganismConfig: &defaultOrganismConfig
ingest: division
- name: geoLocAdmin2
displayName: Collection subdivision level 2
+ desired: true
generateIndex: true
autocomplete: true
header: Sample details
- name: geoLocCity
displayName: Collection city
+ desired: true
generateIndex: true
autocomplete: true
header: Sample details
@@ -511,13 +514,15 @@ defaultOrganismConfig: &defaultOrganismConfig
header: Sample details
- name: specimenCollectorSampleId
displayName: Isolate name
+ desired: true
header: Sample details
ingest: ncbiIsolateName
enableSubstringSearch: true
- name: authors
displayName: Authors
type: authors
header: Authors
+ desired: true
enableSubstringSearch: true
order: 40
truncateColumnDisplayTo: 25
@@ -529,6 +534,7 @@ defaultOrganismConfig: &defaultOrganismConfig
columnWidth: 140
- name: authorAffiliations
displayName: Author affiliations
+ desired: true
enableSubstringSearch: true
truncateColumnDisplayTo: 15
header: Authors
@@ -587,13 +593,15 @@ defaultOrganismConfig: &defaultOrganismConfig
oneHeader: true
- name: cultureId
displayName: Culture ID
+ desired: true
header: Sample details
- name: sampleReceivedDate
ontology_id: GENEPIO:0001177
definition: The date on which the sample was received by the laboratory.
guidance: Alternative if "sampleCollectionDate" is not available. Record the date the sample was received by the laboratory. Required granularity includes year, month and day. Before sharing this data, ensure this date is not considered identifiable information. If this date is considered identifiable, it is acceptable to add "jitter" to the received date by adding or subtracting calendar days. Do not change the received date in your original records. Alternatively, collection_date may be used as a substitute in the data you share. The date should be provided in ISO 8601 standard format "YYYY-MM-DD".
example: '2020-03-20'
displayName: Sample received date
+ desired: true
type: date
preprocessing:
function: parse_and_assert_past_date
@@ -659,13 +667,15 @@ defaultOrganismConfig: &defaultOrganismConfig
example: Swab [GENEPIO:0100027]
displayName: Collection device
header: Sampling
+ desired: true
- name: collectionMethod
ontology_id: GENEPIO:0001241
definition: The process used to collect the sample e.g. phlebotomy, necropsy.
guidance: 'Provide a descriptor if a collection method was used for sampling. Use the pick list provided in the template. If a desired term is missing from the pick list, use this look-up service to identify a standardized term: https://www.ebi.ac.uk/ols/ontologies/obi. If not applicable, leave blank.'
example: Bronchoalveolar lavage (BAL) [GENEPIO:0100032]
displayName: Collection method
header: Sampling
+ desired: true
- name: foodProduct
ontology_id: GENEPIO:0100444
definition: A material consumed and digested for nutritional value or enjoyment.
@@ -833,6 +843,7 @@ defaultOrganismConfig: &defaultOrganismConfig
inputs:
date: sequencingDate
header: Sequencing
+ desired: true
- name: ampliconPcrPrimerScheme
ontology_id: GENEPIO:0001456
definition: The specifications of the primers (primer sequences, binding positions, fragment size generated etc) used to generate the amplicons to be sequenced.
@@ -854,13 +865,15 @@ defaultOrganismConfig: &defaultOrganismConfig
example: Oxford Nanopore MinION [GENEPIO:0100142]
displayName: Sequencing instrument
header: Sequencing
+ desired: true
- name: sequencingProtocol
ontology_id: GENEPIO:0001454
definition: The protocol used to generate the sequence.
guidance: 'Provide a free text description of the methods and materials used to generate the sequence. Suggested text, fill in information where indicated.: "Viral sequencing was performed following a tiling amplicon strategy using the <fill in> primer scheme. Sequencing was performed using a <fill in> sequencing instrument. Libraries were prepared using <fill in> library kit. "'
example: Genomes were generated through amplicon sequencing of 1200 bp amplicons with Freed schema primers. Libraries were created using Illumina DNA Prep kits, and sequence data was produced using Miseq Micro v2 (500 cycles) sequencing kits.
displayName: Sequencing protocol
header: Sequencing
+ desired: true
- name: sequencingAssayType
ontology_id: GENEPIO:0100997
definition: The overarching sequencing methodology that was used to determine the sequence of a biomaterial.
@@ -932,6 +945,7 @@ defaultOrganismConfig: &defaultOrganismConfig
displayName: Depth of coverage
type: int
header: Sequencing
+ desired: true
- name: breadthOfCoverage
ontology_id: GENEPIO:0001475
definition: The threshold used as a cut-off for the depth of coverage.
@@ -1003,6 +1017,7 @@ defaultOrganismConfig: &defaultOrganismConfig
autocomplete: true
header: "Host"
ingest: ncbiHostName
+ desired: true
- name: hostNameCommon
generateIndex: true
autocomplete: true
@@ -1016,6 +1031,7 @@ defaultOrganismConfig: &defaultOrganismConfig
url: "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=info&id=__value__"
header: "Host"
ingest: ncbiHostTaxId
+ desired: true
- name: isLabHost
type: boolean
autocomplete: true
18 changes: 18 additions & 0 deletions website/package-lock.json

2 changes: 2 additions & 0 deletions website/package.json
@@ -45,6 +45,7 @@
"luxon": "^3.5.0",
"neverthrow": "^8.1.1",
"openid-client": "^5.7.1",
+ "papaparse": "^5.5.1",
"react": "^18.3.1",
"react-chartjs-2": "^5.3.0",
"react-confirm-alert": "^3.0.6",
@@ -75,6 +76,7 @@
"@types/lodash": "^4.17.14",
"@types/luxon": "^3.4.2",
"@types/node": "^22.10.7",
+ "@types/papaparse": "^5.3.15",
"@types/react": "^18.3.12",
"@types/react-dom": "^18.3.1",
"@types/uuid": "^10.0.0",
4 changes: 2 additions & 2 deletions website/src/components/MetadataTable.astro
@@ -1,5 +1,5 @@
---
- import { getConfiguredOrganisms, getSchema } from '../config';
+ import { getConfiguredOrganisms, getGroupedInputFields, getSchema } from '../config';
import OrganismTableSelector from './OrganismMetadataTableSelector';
import type { OrganismMetadata } from './OrganismMetadataTableSelector';

@@ -9,7 +9,7 @@ const organisms: OrganismMetadata[] = configuredOrganisms.map((organism) => {
key: organism.key,
displayName: organism.displayName,
metadata: getSchema(organism.key).metadata,
- inputFields: getSchema(organism.key).inputFields,
+ groupedInputFields: getGroupedInputFields(organism.key, 'submit'),
};
});
---
7 changes: 3 additions & 4 deletions website/src/components/OrganismMetadataTableSelector.tsx
@@ -1,16 +1,15 @@
- import React, { useState, useEffect } from 'react';
+ import { useState, useEffect } from 'react';
import type { FC } from 'react';

import { routes } from '../routes/routes.ts';
import type { Metadata, InputField } from '../types/config.ts';
- import { groupFieldsByHeader } from '../utils/groupFieldsByHeader.ts';
import IwwaArrowDown from '~icons/iwwa/arrow-down';

export type OrganismMetadata = {
key: string;
displayName: string;
metadata: Metadata[];
- inputFields: InputField[];
+ groupedInputFields: Map<string, InputField[]>;
};

type Props = {
@@ -40,7 +39,7 @@ const OrganismMetadataTableSelector: FC<Props> = ({ organisms }) => {

useEffect(() => {
if (selectedOrganism) {
- setGroupedFields(groupFieldsByHeader(selectedOrganism.inputFields, selectedOrganism.metadata));
+ setGroupedFields(selectedOrganism.groupedInputFields);
}
}, [selectedOrganism]);

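
The two component diffs above replace a flat `inputFields` array with a precomputed `groupedInputFields: Map<string, InputField[]>`. For readers unfamiliar with that shape, the sketch below shows one way such a map could be built by grouping on each field's `header`. `InputFieldSketch`, `groupByHeader`, and the `'Uncategorized'` fallback are assumptions for illustration; the repository's own `groupFieldsByHeader` and `getGroupedInputFields` take different arguments and are not reproduced here.

```typescript
// Illustrative sketch only: a simplified stand-in for the repository's grouping helpers.

/** Reduced field shape; the real InputField type has more properties. */
type InputFieldSketch = { name: string; displayName?: string; header?: string };

/** Group fields by header, keeping headers and fields in first-seen order. */
function groupByHeader(
    fields: InputFieldSketch[],
    fallbackHeader = 'Uncategorized', // hypothetical default for fields without a header
): Map<string, InputFieldSketch[]> {
    const groups = new Map<string, InputFieldSketch[]>();
    for (const field of fields) {
        const header = field.header ?? fallbackHeader;
        const group = groups.get(header);
        if (group === undefined) {
            groups.set(header, [field]);
        } else {
            group.push(field);
        }
    }
    return groups;
}

const grouped = groupByHeader([
    { name: 'geoLocAdmin2', header: 'Sample details' },
    { name: 'authors', header: 'Authors' },
    { name: 'geoLocCity', header: 'Sample details' },
    { name: 'notes' },
]);
grouped.forEach((fields, header) => {
    console.log(header, fields.map((field) => field.name));
});
```

A `Map` preserves insertion order, so the section order shown to the user follows the order in which headers first appear in the field list.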
59 changes: 41 additions & 18 deletions website/src/components/Submission/DataUploadForm.tsx
@@ -5,6 +5,8 @@ import { type FormEvent, useState } from 'react';

import { dataUploadDocsUrl } from './dataUploadDocsUrl.ts';
import { getClientLogger } from '../../clientLogger.ts';
+ import type { ColumnMapping } from './FileUpload/ColumnMapping.ts';
+ import { ColumnMappingModal } from './FileUpload/ColumnMappingModal.tsx';
import { UploadComponent } from './FileUpload/UploadComponent.tsx';
import DataUseTermsSelector from '../../components/DataUseTerms/DataUseTermsSelector';
import useClientFlag from '../../hooks/isClient.ts';
@@ -23,7 +25,8 @@ import { dateTimeInMonths } from '../../utils/DateTimeInMonths.tsx';
import { createAuthorizationHeader } from '../../utils/createAuthorizationHeader.ts';
import { stringifyMaybeAxiosError } from '../../utils/stringifyMaybeAxiosError.ts';
import { withQueryProvider } from '../common/withQueryProvider.tsx';
- import { FASTA_FILE_KIND, METADATA_FILE_KIND } from './FileUpload/fileProcessing.ts';
+ import { FASTA_FILE_KIND, METADATA_FILE_KIND, type ProcessedFile, RawFile } from './FileUpload/fileProcessing.ts';
+ import type { InputField } from '../../types/config.ts';

export type UploadAction = 'submit' | 'revise';

@@ -34,6 +37,7 @@ type DataUploadFormProps = {
action: UploadAction;
group: Group;
referenceGenomeSequenceNames: ReferenceGenomesSequenceNames;
+ metadataTemplateFields: Map<string, InputField[]>;
onSuccess: () => void;
onError: (message: string) => void;
};
@@ -122,9 +126,12 @@ const InnerDataUploadForm = ({
onError,
group,
referenceGenomeSequenceNames,
+ metadataTemplateFields,
}: DataUploadFormProps) => {
- const [metadataFile, setMetadataFile] = useState<File | null>(null);
- const [sequenceFile, setSequenceFile] = useState<File | null>(null);
+ const [metadataFile, setMetadataFile] = useState<ProcessedFile | null>(null);
+ // The columnMapping can be null; if null -> don't apply mapping.
+ const [columnMapping, setColumnMapping] = useState<ColumnMapping | null>(null);
+ const [sequenceFile, setSequenceFile] = useState<ProcessedFile | null>(null);
const [exampleEntries, setExampleEntries] = useState<number | undefined>(10);

const { submit, revise, isLoading } = useSubmitFiles(accessToken, organism, clientConfig, onSuccess, onError);
@@ -145,11 +152,11 @@ const InnerDataUploadForm = ({
const metadataFile = createTempFile(exampleMetadataContent, 'text/tab-separated-values', 'metadata.tsv');
const sequenceFile = createTempFile(sequenceFileContent, 'application/octet-stream', 'sequences.fasta');

- setMetadataFile(metadataFile);
- setSequenceFile(sequenceFile);
+ setMetadataFile(new RawFile(metadataFile));
+ setSequenceFile(new RawFile(sequenceFile));
};

- const handleSubmit = (event: FormEvent) => {
+ const handleSubmit = async (event: FormEvent) => {
event.preventDefault();

if (!agreedToINSDCUploadTerms) {
@@ -173,12 +180,18 @@ const InnerDataUploadForm = ({
return;
}

+ let finalMetadataFile = metadataFile.inner();
+
+ if (columnMapping !== null) {
+ finalMetadataFile = await columnMapping.applyTo(metadataFile);
+ }
+
switch (action) {
case 'submit': {
const groupId = group.groupId;
submit({
- metadataFile,
- sequenceFile,
+ metadataFile: finalMetadataFile,
+ sequenceFile: sequenceFile.inner(),
groupId,
dataUseTermsType,
restrictedUntil:
@@ -189,7 +202,7 @@ const InnerDataUploadForm = ({
break;
}
case 'revise':
- revise({ metadataFile, sequenceFile });
+ revise({ metadataFile: finalMetadataFile, sequenceFile: sequenceFile.inner() });
break;
}
};
@@ -255,8 +268,8 @@ const InnerDataUploadForm = ({
<DevExampleData
setExampleEntries={setExampleEntries}
exampleEntries={exampleEntries}
- metadataFile={metadataFile}
- sequenceFile={sequenceFile}
+ metadataFile={metadataFile ? metadataFile.inner() : null}
+ sequenceFile={sequenceFile ? sequenceFile.inner() : null}
handleLoadExampleData={handleLoadExampleData}
/>
)}
@@ -274,12 +287,22 @@ const InnerDataUploadForm = ({
</div>
<div className='w-60 space-y-2'>
<label className='text-gray-900 font-medium text-sm block'>Metadata File</label>
- <UploadComponent
- setFile={setMetadataFile}
- name='metadata_file'
- ariaLabel='Metadata File'
- fileKind={METADATA_FILE_KIND}
- />
+ <div className='flex flex-col items-center w-full'>
+ <UploadComponent
+ setFile={setMetadataFile}
+ name='metadata_file'
+ ariaLabel='Metadata File'
+ fileKind={METADATA_FILE_KIND}
+ />
+ {metadataFile !== null && (
+ <ColumnMappingModal
+ inputFile={metadataFile}
+ columnMapping={columnMapping}
+ setColumnMapping={setColumnMapping}
+ groupedInputFields={metadataTemplateFields}
+ />
+ )}
+ </div>
</div>
</div>
</form>
@@ -368,7 +391,7 @@ const InnerDataUploadForm = ({
name='submit'
type='submit'
className='rounded-md py-2 text-sm font-semibold shadow-sm focus-visible:outline focus-visible:outline-2 focus-visible:outline-offset-2 bg-primary-600 text-white hover:bg-primary-500'
- onClick={handleSubmit}
+ onClick={(e) => void handleSubmit(e)}
disabled={isLoading || !isClient}
>
<div className={`absolute ml-1.5 inline-flex ${isLoading ? 'visible' : 'invisible'}`}>
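
The `handleSubmit` change in this file unwraps the stored `ProcessedFile` with `inner()` and applies the column mapping only when one is set. The sketch below isolates that decision so it can be read and tested outside React. `FileLike`, `ProcessedFileSketch`, `RawFileSketch`, `ColumnMappingSketch`, and `resolveMetadataFile` are hypothetical stand-ins; the real `ProcessedFile`, `RawFile`, and `ColumnMapping` live in `FileUpload/fileProcessing.ts` and `FileUpload/ColumnMapping.ts`, which are not shown in this part of the diff.

```typescript
// Illustrative sketch only: simplified stand-ins for ProcessedFile, RawFile, and ColumnMapping.

/** Stand-in for the browser File type so the sketch runs outside a browser. */
type FileLike = { name: string; content: string };

/** A file that may have been preprocessed; inner() yields what should be uploaded. */
interface ProcessedFileSketch {
    inner(): FileLike;
}

/** Wrapper for a file that needs no preprocessing. */
class RawFileSketch implements ProcessedFileSketch {
    private readonly file: FileLike;

    constructor(file: FileLike) {
        this.file = file;
    }

    inner(): FileLike {
        return this.file;
    }
}

/** Anything that can produce a remapped copy of a metadata file. */
interface ColumnMappingSketch {
    applyTo(file: ProcessedFileSketch): Promise<FileLike>;
}

/** Mirror of the submit-time decision: a null mapping means "upload the file as is". */
async function resolveMetadataFile(
    metadataFile: ProcessedFileSketch,
    columnMapping: ColumnMappingSketch | null,
): Promise<FileLike> {
    if (columnMapping === null) {
        return metadataFile.inner();
    }
    return columnMapping.applyTo(metadataFile);
}
```

Because applying the mapping is asynchronous, the handler becomes `async`, which is why the button's `onClick` in the last hunk wraps the call as `void handleSubmit(e)`.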