Merge branch 'main' into 3388-allow-organisms-without-sequences
fengelniederhammer committed Jan 22, 2025
2 parents 4531b93 + e535ff7 commit a6bdac8
Showing 35 changed files with 1,282 additions and 993 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -13,7 +13,7 @@ Loculus targets any group managing sequencing data. It can be used by small publ

## Current state and roadmap

-The Loculus software is already in a stable stage and used for production systems (see "Known instances" below) and you are welcome to explore this repository and try it out. However, please note that we are planning to revise the configuration files and the APIs before we release the official 1.0. Further, the documentation is so far quite sparse. We plan to release 1.0 with stable APIs and comprehensive documentation by the end of 2024.
+The Loculus software is already in a stable stage and used for production systems (see "Known instances" below) and you are welcome to explore this repository and try it out. However, please note that we are planning to revise the configuration files and the APIs before we release the official 1.0. Further, the documentation is so far quite sparse. We plan to release 1.0 with stable APIs and comprehensive documentation in the coming months.

If you are looking for a software to manage sequencing data and would like to know whether Loculus might be a suitable tool for you, please feel free to reach out. We would love to hear about your project and take your needs and requirements into consideration when we plan the further development.

14 changes: 7 additions & 7 deletions backend/build.gradle
@@ -35,16 +35,16 @@ dependencies {
implementation "org.jetbrains.kotlin:kotlin-reflect"
implementation "io.github.microutils:kotlin-logging-jvm:3.0.5"
implementation "org.postgresql:postgresql:42.7.5"
-implementation "org.apache.commons:commons-csv:1.12.0"
+implementation "org.apache.commons:commons-csv:1.13.0"
implementation "org.springdoc:springdoc-openapi-starter-webmvc-ui:2.8.3"
-implementation "org.flywaydb:flyway-database-postgresql:11.1.1"
-implementation "org.jetbrains.exposed:exposed-spring-boot-starter:0.57.0"
-implementation "org.jetbrains.exposed:exposed-jdbc:0.57.0"
-implementation "org.jetbrains.exposed:exposed-json:0.57.0"
-implementation "org.jetbrains.exposed:exposed-kotlin-datetime:0.57.0"
+implementation "org.flywaydb:flyway-database-postgresql:11.2.0"
+implementation "org.jetbrains.exposed:exposed-spring-boot-starter:0.58.0"
+implementation "org.jetbrains.exposed:exposed-jdbc:0.58.0"
+implementation "org.jetbrains.exposed:exposed-json:0.58.0"
+implementation "org.jetbrains.exposed:exposed-kotlin-datetime:0.58.0"
implementation "org.jetbrains.kotlinx:kotlinx-datetime:0.6.1"
implementation "org.hibernate.validator:hibernate-validator:8.0.2.Final"
-implementation "org.keycloak:keycloak-admin-client:23.0.7"
+implementation "org.keycloak:keycloak-admin-client:26.0.4"

implementation "org.springframework.boot:spring-boot-starter-oauth2-resource-server"
implementation "org.springframework.boot:spring-boot-starter-security"
13 changes: 7 additions & 6 deletions docs/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/src/content/docs/for-users/submit-sequences.md
@@ -10,6 +10,7 @@ Loculus expects:

- Sequence data in `fasta` format with a unique submissionID per sequence.
- Metadata in `tsv` format for each sequence. If you upload through the Website, you can also use Excel files (`xls` or `xlsx` format). If you need help formatting metadata, there is a metadata template for each organism on the submission page.
+You can also map columns in your file to the expected upload column names by clicking the 'Add column mapping' button.

![Metadata template.](../../../assets/MetadataTemplate.png)

2 changes: 1 addition & 1 deletion ena-submission/test/approved_ena_submission_list_test.json
@@ -57,7 +57,7 @@
"hostHealthState": "Hospital care required",
"ncbiReleaseDate": null,
"ncbiVirusTaxId": null,
-"sraRunAccession": null,
+"insdcRawReadsAccession": null,
"environmentalSite": null,
"signsAndSymptoms": null,
"anatomicalMaterial": null,
2 changes: 1 addition & 1 deletion ena-submission/test/external_metadata_test.ndjson
@@ -1 +1 @@
-{"accession": "LOC1", "version": 1, "externalMetadata": {"ncbiReleaseDate": null,"ncbiUpdateDate": null,"ncbiSubmitterCountry": null,"insdcAccessionBase": null,"insdcVersion": null,"insdcAccessionFull": null,"bioprojectAccession": null,"biosampleAccession": null,"ncbiSourceDb": null,"ncbiVirusName": null,"ncbiVirusTaxId": null,"sraRunAccession": null}}
+{"accession": "LOC1", "version": 1, "externalMetadata": {"ncbiReleaseDate": null,"ncbiUpdateDate": null,"ncbiSubmitterCountry": null,"insdcAccessionBase": null,"insdcVersion": null,"insdcAccessionFull": null,"bioprojectAccession": null,"biosampleAccession": null,"ncbiSourceDb": null,"ncbiVirusName": null,"ncbiVirusTaxId": null,"insdcRawReadsAccession": null}}
4 changes: 2 additions & 2 deletions ingest/tests/config_cchf/config.yaml
@@ -4,7 +4,7 @@ insdc_segment_specific_fields:
- insdcAccessionBase
- insdcVersion
- insdcAccessionFull
-- sraRunAccession
+- insdcRawReadsAccession
keycloak_token_url: http://localhost:8083/realms/loculus/protocol/openid-connect/token
nextclade_dataset_name: nextstrain/cchfv/linked
nextclade_dataset_server: https://raw.githubusercontent.com/nextstrain/nextclade_data/cornelius-cchfv/data_output
@@ -23,7 +23,7 @@ rename:
ncbiHostTaxId: hostTaxonId
ncbiIsLabHost: isLabHost
ncbiIsolateName: specimenCollectorSampleId
-ncbiSraAccessions: sraRunAccession
+ncbiSraAccessions: insdcRawReadsAccession
ncbiSubmitterAffiliation: authorAffiliations
ncbiSubmitterNames: authors
taxon_id: 3052518
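
The `rename` block in this config maps NCBI field names to Loculus field names, and this commit changes the target for `ncbiSraAccessions` from `sraRunAccession` to `insdcRawReadsAccession`. As a rough illustration of what such a mapping does to a single metadata record, here is a minimal Python sketch. The helper `apply_rename` and the sample record are hypothetical and are not taken from the Loculus ingest code; only the mapping entries come from the config shown in this diff.

```python
# Mapping entries copied from the `rename` section of the config in this diff.
RENAME = {
    "ncbiIsolateName": "specimenCollectorSampleId",
    "ncbiSraAccessions": "insdcRawReadsAccession",
    "ncbiSubmitterAffiliation": "authorAffiliations",
    "ncbiSubmitterNames": "authors",
}


def apply_rename(record: dict, rename: dict) -> dict:
    """Hypothetical helper: return a copy of `record` with keys renamed.

    Keys that are not listed in `rename` are passed through unchanged.
    """
    return {rename.get(key, key): value for key, value in record.items()}


# Illustrative input record (values borrowed from the test fixtures in this diff).
record = {"ncbiIsolateName": "K229_194", "ncbiSraAccessions": "", "segment": "L"}
renamed = apply_rename(record, RENAME)
```

Under these assumptions, the renamed record carries `insdcRawReadsAccession` rather than `sraRunAccession`, which is consistent with the updated expected-output fixtures in this commit.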
12 changes: 6 additions & 6 deletions ingest/tests/expected_output_cchf/metadata_post_prepare.json
@@ -21,7 +21,7 @@
"sampleCollectionDate": "1989",
"segment": "L",
"specimenCollectorSampleId": "K229_194",
-"sraRunAccession": "",
+"insdcRawReadsAccession": "",
"submissionId": "KX013462.1"
},
"KX013463.1": {
@@ -46,7 +46,7 @@
"sampleCollectionDate": "1989",
"segment": "M",
"specimenCollectorSampleId": "K229_194",
-"sraRunAccession": "",
+"insdcRawReadsAccession": "",
"submissionId": "KX013463.1"
},
"KX013464.1": {
@@ -71,7 +71,7 @@
"sampleCollectionDate": "1989",
"segment": "S",
"specimenCollectorSampleId": "K229_194",
-"sraRunAccession": "",
+"insdcRawReadsAccession": "",
"submissionId": "KX013464.1"
},
"KX013483.1": {
@@ -96,7 +96,7 @@
"sampleCollectionDate": "1958",
"segment": "L",
"specimenCollectorSampleId": "Nakiwogo",
-"sraRunAccession": "",
+"insdcRawReadsAccession": "",
"submissionId": "KX013483.1"
},
"KX013485.1": {
@@ -121,7 +121,7 @@
"sampleCollectionDate": "1958",
"segment": "S",
"specimenCollectorSampleId": "Nakiwogo",
-"sraRunAccession": "",
+"insdcRawReadsAccession": "",
"submissionId": "KX013485.1"
},
"KX096703.1": {
@@ -146,7 +146,7 @@
"sampleCollectionDate": "2015",
"segment": "S",
"specimenCollectorSampleId": "tick pool #134",
-"sraRunAccession": "",
+"insdcRawReadsAccession": "",
"submissionId": "KX096703.1"
}
}
22 changes: 19 additions & 3 deletions kubernetes/loculus/values.yaml
@@ -486,6 +486,7 @@ defaultOrganismConfig: &defaultOrganismConfig
- Zaire
- name: geoLocAdmin1
displayName: Collection subdivision level 1
+desired: true
generateIndex: true
autocomplete: true
initiallyVisible: true
@@ -494,11 +495,13 @@ defaultOrganismConfig: &defaultOrganismConfig
ingest: division
- name: geoLocAdmin2
displayName: Collection subdivision level 2
+desired: true
generateIndex: true
autocomplete: true
header: Sample details
- name: geoLocCity
displayName: Collection city
+desired: true
generateIndex: true
autocomplete: true
header: Sample details
@@ -511,13 +514,15 @@ defaultOrganismConfig: &defaultOrganismConfig
header: Sample details
- name: specimenCollectorSampleId
displayName: Isolate name
+desired: true
header: Sample details
ingest: ncbiIsolateName
enableSubstringSearch: true
- name: authors
displayName: Authors
type: authors
header: Authors
+desired: true
enableSubstringSearch: true
order: 40
truncateColumnDisplayTo: 25
@@ -529,6 +534,7 @@ defaultOrganismConfig: &defaultOrganismConfig
columnWidth: 140
- name: authorAffiliations
displayName: Author affiliations
+desired: true
enableSubstringSearch: true
truncateColumnDisplayTo: 15
header: Authors
@@ -587,13 +593,15 @@ defaultOrganismConfig: &defaultOrganismConfig
oneHeader: true
- name: cultureId
displayName: Culture ID
+desired: true
header: Sample details
- name: sampleReceivedDate
ontology_id: GENEPIO:0001177
definition: The date on which the sample was received by the laboratory.
guidance: Alternative if "sampleCollectionDate" is not available. Record the date the sample was received by the laboratory. Required granularity includes year, month and day. Before sharing this data, ensure this date is not considered identifiable information. If this date is considered identifiable, it is acceptable to add "jitter" to the received date by adding or subtracting calendar days. Do not change the received date in your original records. Alternatively, collection_date may be used as a substitute in the data you share. The date should be provided in ISO 8601 standard format "YYYY-MM-DD".
example: '2020-03-20'
displayName: Sample received date
+desired: true
type: date
preprocessing:
function: parse_and_assert_past_date
@@ -659,13 +667,15 @@ defaultOrganismConfig: &defaultOrganismConfig
example: Swab [GENEPIO:0100027]
displayName: Collection device
header: Sampling
+desired: true
- name: collectionMethod
ontology_id: GENEPIO:0001241
definition: The process used to collect the sample e.g. phlebotomy, necropsy.
guidance: 'Provide a descriptor if a collection method was used for sampling. Use the pick list provided in the template. If a desired term is missing from the pick list, use this look-up service to identify a standardized term: https://www.ebi.ac.uk/ols/ontologies/obi. If not applicable, leave blank.'
example: Bronchoalveolar lavage (BAL) [GENEPIO:0100032]
displayName: Collection method
header: Sampling
+desired: true
- name: foodProduct
ontology_id: GENEPIO:0100444
definition: A material consumed and digested for nutritional value or enjoyment.
@@ -833,6 +843,7 @@ defaultOrganismConfig: &defaultOrganismConfig
inputs:
date: sequencingDate
header: Sequencing
+desired: true
- name: ampliconPcrPrimerScheme
ontology_id: GENEPIO:0001456
definition: The specifications of the primers (primer sequences, binding positions, fragment size generated etc) used to generate the amplicons to be sequenced.
@@ -854,13 +865,15 @@ defaultOrganismConfig: &defaultOrganismConfig
example: Oxford Nanopore MinION [GENEPIO:0100142]
displayName: Sequencing instrument
header: Sequencing
+desired: true
- name: sequencingProtocol
ontology_id: GENEPIO:0001454
definition: The protocol used to generate the sequence.
guidance: 'Provide a free text description of the methods and materials used to generate the sequence. Suggested text, fill in information where indicated.: "Viral sequencing was performed following a tiling amplicon strategy using the <fill in> primer scheme. Sequencing was performed using a <fill in> sequencing instrument. Libraries were prepared using <fill in> library kit. "'
example: Genomes were generated through amplicon sequencing of 1200 bp amplicons with Freed schema primers. Libraries were created using Illumina DNA Prep kits, and sequence data was produced using Miseq Micro v2 (500 cycles) sequencing kits.
displayName: Sequencing protocol
header: Sequencing
+desired: true
- name: sequencingAssayType
ontology_id: GENEPIO:0100997
definition: The overarching sequencing methodology that was used to determine the sequence of a biomaterial.
@@ -932,6 +945,7 @@ defaultOrganismConfig: &defaultOrganismConfig
displayName: Depth of coverage
type: int
header: Sequencing
+desired: true
- name: breadthOfCoverage
ontology_id: GENEPIO:0001475
definition: The threshold used as a cut-off for the depth of coverage.
@@ -1003,6 +1017,7 @@ defaultOrganismConfig: &defaultOrganismConfig
autocomplete: true
header: "Host"
ingest: ncbiHostName
+desired: true
- name: hostNameCommon
generateIndex: true
autocomplete: true
@@ -1016,6 +1031,7 @@ defaultOrganismConfig: &defaultOrganismConfig
url: "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=info&id=__value__"
header: "Host"
ingest: ncbiHostTaxId
+desired: true
- name: isLabHost
type: boolean
autocomplete: true
@@ -1053,8 +1069,8 @@ defaultOrganismConfig: &defaultOrganismConfig
hideOnSequenceDetailsPage: true
noInput: true
header: "INSDC"
-- name: sraRunAccession
-displayName: SRA run accession
+- name: insdcRawReadsAccession
+displayName: Raw reads accession
customDisplay:
type: link
url: "https://www.ncbi.nlm.nih.gov/sra/?term=__value__"
@@ -1596,7 +1612,7 @@ welcomeMessageHTML: null
additionalHeadHTML: ""
images:
lapisSilo: "ghcr.io/genspectrum/lapis-silo:0.5.0"
-lapis: "ghcr.io/genspectrum/lapis:0.3.10"
+lapis: "ghcr.io/genspectrum/lapis:0.3.11"
secrets:
smtp-password:
type: raw