Documentation updates
jlumpe committed Sep 24, 2021
1 parent df2d7cd commit 46bd181
Showing 5 changed files with 168 additions and 26 deletions.
152 changes: 131 additions & 21 deletions docs/source/cli.rst
@@ -1,6 +1,7 @@
Command Line Interface
**********************


Root command group
==================

@@ -10,27 +11,29 @@ Root command group

gambit [OPTIONS] COMMAND [ARGS]...

Some top-level options are set at the root command group, and should be specified `before` the name
of the subcommand to run.

Options
-------

.. option:: -d DB_DIR

Path to directory containing GAMBIT database files. Must contain exactly one ``.db`` and one
``.h5`` file. Required by most subcommands. As an alternative you can specify the database
location with the :envvar:`GAMBIT_DB_PATH` environment variable.
.. option:: -d, --db DIR

Path to directory containing GAMBIT database files. Required by most subcommands.
As an alternative you can specify the database location with the :envvar:`GAMBIT_DB_PATH`
environment variable.

Environment
-----------
Environment variables
---------------------

.. envvar:: GAMBIT_DB_PATH

Alternative to :option:`-d` for specifying path to database.
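
The two ways of locating the database can be sketched as a small helper. This is only an illustration, not GAMBIT's actual code: the function name ``resolve_db_dir`` is hypothetical, and the precedence shown (an explicit ``-d`` value wins over :envvar:`GAMBIT_DB_PATH`) is an assumption that the text above does not state.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_db_dir(
    cli_value: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[Path]:
    """Return the database directory, or None if neither source is set.

    Hypothetical helper. Assumption: a value passed with -d/--db takes
    precedence over the GAMBIT_DB_PATH environment variable.
    """
    if cli_value:
        return Path(cli_value)
    env_value = environ.get("GAMBIT_DB_PATH")
    return Path(env_value) if env_value else None
```

Passing the environment as a mapping keeps the helper easy to exercise without modifying the real process environment.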


Commands
========

Querying the database
=====================

query
-----
@@ -39,36 +42,143 @@ query

::

gambit query [OPTIONS] FILES...
gambit query [OPTIONS] GENOMES...


Predict taxonomy of microbial samples from genome sequences.

Files must contain assembled genome sequences, but may have multiple contigs.
``GENOMES`` must contain assembled genome sequences, but may have multiple contigs. Alternatively,
a file containing pre-calculated signatures may be used with the ``--sigfile`` option. The
reference database must be specified from the root command group.


Options
.......

.. option:: -o, --output OUTFILE
.. option:: -o, --output FILE

File to write output to. If omitted, output is written to stdout.

.. option:: -s, --seqfmt {fasta}

Format of genome sequence files. Currently only FASTA is supported.

.. option:: -f, --outfmt {json|csv}
.. option:: -f, --outfmt {csv|json|archive}

Output format.
Results format (see next section).

.. option:: --sigfile FILE

Output Formats
==============
Path to file containing query signatures.
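
When ``gambit query`` is driven from a script, the ordering rule for root options matters: ``-d`` has to appear before the subcommand name. The sketch below assembles such an argument list using only the flags documented here. The helper ``build_query_command`` and all file paths are hypothetical, and it always passes ``--outfmt`` explicitly because the default format is not stated in this document.

```python
from typing import List, Optional, Sequence

# Output formats listed for -f/--outfmt.
OUTPUT_FORMATS = ("csv", "json", "archive")


def build_query_command(
    genomes: Sequence[str],
    db_dir: Optional[str] = None,
    output: Optional[str] = None,
    outfmt: str = "csv",
) -> List[str]:
    """Build an argument list for `gambit query` (hypothetical helper)."""
    if outfmt not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {outfmt!r}")
    cmd = ["gambit"]
    if db_dir is not None:
        # Root-level option: must come before the subcommand name.
        cmd += ["-d", db_dir]
    cmd.append("query")
    cmd += ["--outfmt", outfmt]
    if output is not None:
        cmd += ["--output", output]
    # Positional genome files come last.
    cmd += list(genomes)
    return cmd
```

The resulting list can be handed to ``subprocess.run(cmd, check=True)``. If ``db_dir`` is left as ``None``, the database would have to be supplied through :envvar:`GAMBIT_DB_PATH` instead.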

JSON
----

TODO
Result Formats
--------------

CSV
---
...

A .csv file with one row per query. Contains the following columns:

* ``query.name`` - Name of query.
* ``query.path`` - Path to query file, if any.
* ``predicted.name`` - Name of predicted taxon.
* ``predicted.rank`` - Rank of predicted taxon.
* ``predicted.ncbi_id`` - ID of taxon in NCBI taxonomy database.
* ``predicted.threshold`` - Distance threshold of predicted taxon.
* ``closest.distance`` - Distance to closest genome.
* ``closest.description`` - Description of closest genome.
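
As a rough illustration of consuming this format, the columns above can be read with Python's standard ``csv`` module. The sketch assumes the file starts with a header row carrying exactly these column names, which the list implies but does not state outright. The sample row is invented for the example and is not real GAMBIT output.

```python
import csv
import io
from typing import Dict, List, TextIO

# Column names as listed in the documentation above.
COLUMNS = [
    "query.name",
    "query.path",
    "predicted.name",
    "predicted.rank",
    "predicted.ncbi_id",
    "predicted.threshold",
    "closest.distance",
    "closest.description",
]


def read_results(stream: TextIO) -> List[Dict[str, str]]:
    """Read result rows, checking that every documented column is present."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [name for name in COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# Invented sample data, for illustration only.
SAMPLE = (
    ",".join(COLUMNS) + "\n"
    "genome1,genome1.fasta,Escherichia coli,species,562,0.5,0.1,example reference genome\n"
)

rows = read_results(io.StringIO(SAMPLE))
for row in rows:
    print(row["query.name"], row["predicted.name"], row["closest.distance"])
```

All values come back as strings, so numeric columns such as ``closest.distance`` need an explicit ``float()`` conversion before comparison.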


JSON
....

A machine-readable format meant to be used in pipelines.

.. todo::
Document schema


Archive
.......

A more verbose JSON-based format used for testing and development.



Generating and inspecting k-mer signatures
==========================================

signatures info
---------------

.. program:: gambit signatures info

::

gambit signatures info [OPTIONS] FILE


Print information about a GAMBIT signatures file. Defaults to a basic human-readable format.


Options
.......

.. option:: -j, --json

Print information in JSON format. Includes more information than standard output.

.. option:: -p, --pretty

Prettify JSON output to make it more human-readable.

.. option:: -i, --ids

Print IDs of all signatures in file.


signatures create
-----------------

.. program:: gambit signatures create

::

gambit signatures create [OPTIONS] GENOMES

Calculate GAMBIT signatures of ``GENOMES`` and write to file.

The ``-k`` and ``--prefix`` options may be omitted if a reference database is specified through the
root command group, in which case the parameters of the database will be used.


Options
.......

.. option:: -o, --output FILE

Path to write file to (required).

.. option:: -k INTEGER

Length of k-mers to find (does not include length of prefix).

.. option:: -p, --prefix STRING

K-mer prefix to match, a non-empty string of DNA nucleotide codes.

.. option:: -s, --seqfmt {fasta}

Format of genome sequence files. Currently only FASTA is supported.

.. option:: -i, --ids FILE

File containing IDs to assign to signatures in file metadata. Should contain one ID per line.

.. option:: -m, --meta-json FILE

JSON file containing metadata to attach to file.

(Not yet implemented)
.. todo::
Document schema
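
The input constraints described for ``--prefix`` and ``--ids`` can be checked before invoking the command. The following is a minimal sketch under stated assumptions, not GAMBIT's own validation: both function names are hypothetical, "DNA nucleotide codes" is taken to mean only ``A``, ``C``, ``G`` and ``T``, lowercase input is normalized to uppercase, and blank lines in the IDs file are skipped.

```python
from typing import Iterable, List

# Assumption: only the four unambiguous nucleotide codes are allowed.
DNA_CODES = frozenset("ACGT")


def check_prefix(prefix: str) -> str:
    """Return the prefix in uppercase, or raise if it is empty or invalid."""
    normalized = prefix.upper()
    if not normalized:
        raise ValueError("prefix must be non-empty")
    if not set(normalized) <= DNA_CODES:
        raise ValueError(f"prefix contains non-nucleotide characters: {prefix!r}")
    return normalized


def parse_ids(lines: Iterable[str]) -> List[str]:
    """Collect one ID per line, ignoring surrounding whitespace and blank lines."""
    return [line.strip() for line in lines if line.strip()]
```

``parse_ids`` accepts any iterable of lines, so an open file object can be passed directly, for example ``parse_ids(open("ids.txt"))`` with a hypothetical ``ids.txt``.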
7 changes: 6 additions & 1 deletion docs/source/conf.py
@@ -34,7 +34,7 @@
'sphinx.ext.autodoc',
# 'sphinx.ext.doctest',
'sphinx.ext.intersphinx',
# 'sphinx.ext.todo',
'sphinx.ext.todo',
# 'sphinx.ext.coverage',
# 'sphinx.ext.mathjax',
# 'sphinx.ext.viewcode',
@@ -61,3 +61,8 @@
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']


# -- Additional options ------------------------------------------------------

todo_include_todos = True
7 changes: 5 additions & 2 deletions docs/source/index.rst
@@ -1,11 +1,14 @@
GAMBIT Documentation
********************

Contents
========

.. toctree::
:maxdepth: 2
:hidden:
:maxdepth: 1

install
quickstart
cli
api/api

23 changes: 21 additions & 2 deletions docs/source/install.rst
@@ -2,13 +2,32 @@ Installation and Setup
======================


Install from bioconda
---------------------

The recommended way to install the tool is through the conda package manager (available
`here <https://docs.conda.io/en/latest/miniconda.html>`_)::

conda install -c bioconda hesslab-gambit


Install from source
-------------------

TODO
Installing from source requires that the ``cython`` package and a C compiler be installed on your
system. Clone the repository, navigate to its directory, and then run::

pip install .

Or do an editable development install with::

pip install -e .


Database files
--------------

TODO
Download the following files and place them in a directory of your choice:

* `gambit-genomes-1.0b1-210719.db <https://storage.googleapis.com/hesslab-gambit-public/databases/refseq-curated/1.0-beta/gambit-genomes-1.0b1-210719.db>`_
* `gambit-signatures-1.0b1-210719.h5 <https://storage.googleapis.com/hesslab-gambit-public/databases/refseq-curated/1.0-beta/gambit-signatures-1.0b1-210719.h5>`_
5 changes: 5 additions & 0 deletions docs/source/quickstart.rst
@@ -0,0 +1,5 @@
Quick Start
***********

.. todo::
\
