# Tests
This project has two test suites: unit tests and integration tests.

## Unit tests

The unit tests are located in `Analysis-Driver/tests` and are run with Pytest:
```
$ path/to/python/bin/pip install -r Analysis-Driver/requirements.txt
$ path/to/python/bin/py.test Analysis-Driver/tests
```
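For orientation, a unit test here is an ordinary Pytest test. The sketch below is purely illustrative: `clean_up_name` is a hypothetical function, not part of Analysis-Driver.

```python
# A minimal, hypothetical unit test in the Pytest style. 'clean_up_name'
# stands in for any pure function in the codebase you might want to cover.

def clean_up_name(name):
    return name.strip().replace(' ', '_')

def test_clean_up_name():
    assert clean_up_name(' a sample id ') == 'a_sample_id'
```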
## Integration tests

The integration tests are end-to-end tests: the pipeline is run on test input data and the observed outputs are compared with expected values. They are also run through the Pytest runner, but some additional infrastructure needs to be set up first.

The test environment should be set up identically to a production environment, with the exception of the Lims. It should include:
- input/output data directories for runs and samples
- any desired logging and notifications
- configurations for executor, Fastq-Filterer, ncbi_cache and pipeline QC
- locations of third-party software dependencies
- reference data
For more information, see `Analysis-Driver/etc/example_analysisdriver.yaml`. No Lims instance or related configuration is required, because the integration tests patch all of the pipeline's interactions with `egcg_core.clarity`.
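As an illustration of what this patching looks like, here is a minimal sketch using `unittest.mock`. The patch target is illustrative; see `integration_test.py` for the patches actually applied.

```python
from unittest.mock import patch

def pipeline_step():
    # Stand-in for pipeline code that would normally query the Lims
    from egcg_core import clarity
    return clarity.get_user_sample_name('a_sample_id')

# Replace the Lims lookup with a canned value for the duration of the test.
# The patched function here is an assumption, not necessarily what the
# integration tests patch.
with patch('egcg_core.clarity.get_user_sample_name', return_value='a_user_sample_id'):
    assert pipeline_step() == 'a_user_sample_id'
```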
Output data will always depend on input data, so ideally we would have some standard input data for these tests. Unfortunately, our test data is too big to publish (even our trimmed-down run input data is over 6 GB). To generate test data, we recommend the following:
- take a BCL dataset from a HiSeqX run and trim it down to contain only the first tile for each cycle/lane (see this gist; a rough sketch of this kind of filtering follows this list)
- run the pipeline's demultiplexing component on this data to generate fastqs
- ensure that project, sample and run IDs in sample sheets and directory names match what the integration tests expect
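The gist linked above is not reproduced here, but the idea behind the trimming step is simple: keep only the files belonging to the first tile (1101 on a HiSeqX) in each lane. Below is a rough sketch, assuming HiSeqX-style `s_<lane>_<tile>.*` file naming; the real gist may differ.

```python
import os
import re
import shutil

# Copy a run folder, keeping only files for the first tile of each lane.
# Assumes HiSeqX-style naming (s_<lane>_<tile>.<ext>); files that don't
# match the pattern (run info, sample sheets, etc.) are copied as-is.
TILE_RE = re.compile(r's_\d+_(\d+)\.')

def trim_run(src_run, dest_run, tile='1101'):
    for dirpath, _, filenames in os.walk(src_run):
        rel = os.path.relpath(dirpath, src_run)
        for f in filenames:
            m = TILE_RE.match(f)
            if m and m.group(1) != tile:
                continue  # skip every tile except the first
            os.makedirs(os.path.join(dest_run, rel), exist_ok=True)
            shutil.copy2(os.path.join(dirpath, f), os.path.join(dest_run, rel, f))

trim_run('/path/to/full_run', '/path/to/trimmed_run')
```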
As with running in testing/production, the integration tests require an instance of our Rest API. Since the Rest API is Dockerised, the tests just need to be able to set up a Docker container: to make this possible, build the image on your system with the name `egcg_reporting_app`. The integration test defines `setUp` and `tearDown` methods which start a Docker container, find its local IP address and patch `egcg_core.rest_communication` accordingly, then stop and remove the container after the test.
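A rough sketch of this kind of `setUp`/`tearDown`, using the Docker Python SDK (the `docker` package); the real `integration_test.py` may differ in the details:

```python
import docker
from unittest import TestCase

class IntegrationTest(TestCase):
    def setUp(self):
        # Start a container from the locally-built image
        self.client = docker.from_env()
        self.container = self.client.containers.run('egcg_reporting_app', detach=True)
        # Refresh the container's attributes so its IP address is populated
        self.container.reload()
        self.api_ip = self.container.attrs['NetworkSettings']['IPAddress']
        # The real test then patches egcg_core.rest_communication so that
        # Rest API calls go to this address (the patch target is not shown here)

    def tearDown(self):
        self.container.stop()
        self.container.remove()
```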
Now that the test's running environment is ready, we just need to configure the test itself with the data it should expect and how it should report its results:
```yaml
# integration_test.yaml
notification:
    mailhost: <hostname>
    port: <port>
    sender: <sender_email>
    recipients: [<recipient_email>]
    email_template: /path/to/report.html

demultiplexing:
    md5s:
        sample_1_R1.fastq.gz: <md5sum>
        # ...

bcbio:
    md5s:
        filename.txt: <md5sum>
        # ...

qc:
    rest_api_field: <bam_reads>
    # ...
```
For more information on writing the config, see references to `integration_cfg` in `Analysis-Driver/integration_tests/integration_test.py`.
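The `md5s` sections drive checks along these lines (a minimal sketch; the helper name and hashing details are assumptions, see `integration_test.py` for the real checks):

```python
import hashlib

def md5_of(path, chunk_size=1024 * 1024):
    # Hash the file in chunks so large fastqs don't have to fit in memory
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# Expected md5s as they would appear in integration_test.yaml
expected = {'sample_1_R1.fastq.gz': '<md5sum>'}
for filename, md5 in expected.items():
    observed = md5_of('/path/to/output/' + filename)
    assert observed == md5, '%s: expected %s, got %s' % (filename, md5, observed)
```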
The tests can now be run with:
```
PYTHONPATH=path/to/Analysis-Driver ANALYSISDRIVERCONFIG=path/to/analysisdriver.yaml INTEGRATIONCONFIG=path/to/integration_test.yaml path/to/python Analysis-Driver/integration_tests/integration_test.py
```
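As a sketch of how a script like `integration_test.py` might pick up these environment variables (assumes PyYAML; this is not necessarily the script's actual implementation):

```python
import os
import sys
import pytest  # the integration tests are run through the Pytest runner
import yaml    # assumes PyYAML is installed

# Hypothetical bootstrap: load the integration config named by the
# environment variable, then hand this file's tests to Pytest.
with open(os.environ['INTEGRATIONCONFIG']) as f:
    integration_cfg = yaml.safe_load(f)

sys.exit(pytest.main([os.path.abspath(__file__)]))
```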
This will patch the pipeline's Lims component, run each test, check for expected output data and produce an email report:
```
Analysis Driver integration test

Pipeline test finished. Start time: 04/01/2017 12:23:16, finish time: 04/01/2017 16:05:53. Pytest output:

============================= test session starts ==============================
platform linux -- Python 3.4.4, pytest-3.0.5, py-1.4.31, pluggy-0.4.0
rootdir: integration_tests, inifile:
collected 2 items

Analysis-Driver/integration_tests/integration_test.py ....

========================= 4 passed in 13356.41 seconds =========================
```
Possible improvements:

- We should make project, sample and run IDs configurable
- We should also be able to configure the name of the Docker container (currently hard-coded to `egcg_reporting_app`)