The Biostar Engine is a Python and Django based scientific data analysis oriented application server that can execute scripts over the web while providing a graphical user interface for selecting the parameters of these scripts. In addition the software has data storage and project management support, and may be used as web based file management software. An actively maintained deployment of the software can be accessed at:
- https://www.bioinformatics.recipes (may occasionally be offline during the beta-test)
The scripts that the software executes may be written in bash
, may be a Makefile
, may be R
commands in a file or just about any code that could be executed from command line.
We call the scripts that the engine can execute recipes. Recipes with bioinformatics focus are maintained separately in the biostar-recipes repository.
In summary the Biostar Engine is able to:
- Generate a graphical user interface for a script.
- Manage the input data needed for the script.
- Manage the resulting files generated by running the script.
In a nutshell, a recipe is created via an interface specification file and a script template. The site generates the web interface from the interface specification file. Users can make selections in the web interface, these selections are then passed down into the script.
The Biostar Engine was designed for processing large projects composed of tens/hundreds/thousands of files. We have made an effort to represent data in a simplified way. For example all files of a sequencing run may be represented as just one single data entry. Read more about the subject in the Concepts and definitions page.
More details on how the site works at:
This software should be considered a beta-test, work-in-progress. A deployment-ready release is planned in the first half of 2018.
Our installation instructions rely on conda though other alternatives are equally viable. Users may use virtualenv
, pipenv
, homebrew
, apt-get
etc, or they may opt to not using any environment management tool. We use conda primarily since it allows us to also manage bioinformatics tools.
conda create -y --name engine python=3.6
source activate engine
There are different repositories for the engine and the recipes.
# This repository contains the biostar-engine software that can run recipes.
git clone https://github.com/biostars/biostar-engine.git
# This repository stores the various data analysis recipes.
git clone https://github.com/biostars/biostar-recipes.git
To run the server you will need to install the dependencies:
# Switch to the biostar-engine directory.
cd biostar-engine
# Install server dependencies.
pip install -r conf/python_requirements.txt
At this point the installation is complete.
All commands run through make
. To initialize and run the test site use:
make reset serve
Visit http://localhost:8000 to see your site running.
The default admin email/password combination is: admin@localhost/password
. Use these to log into the test site as an admin user.
To run bioinformatics tools the environment that the jobs are run in needs to be set up appropriately. The instructions makes use of bioconda to install tools into the current environment. Make sure that you have enabled bioconda prior to running the following:
# Activate the environment.
source activate engine
# Switch to the engine directory.
cd biostar-recipes
# Install the conda dependencies.
conda install --file conf/conda_requirements.txt
# Add the recipes to the python path.
python setup.py develop
The Makefile included with the engine contains additional commands.
Test the software:
make test
Re-initialize the database:
make reset
Serve the current site:
make serve
Initialize the example recipes from the biostar-recipe
repository.
make recipes
Run all tests:
make test
The site is built with Django therefore the official Django documentation applies to maintaining and deploying the site:
A recipe submitted for execution is called a job.
When the job is run the recipe parameters are applied onto recipe template to produce the script that gets executed. This transformation takes place right before executing the job.
Jobs can be executed as commands. See the job
command for details:
python manage.py job --help
The command has number of parameters that facilitate job management and recipe development. For example:
python manage.py job --list
will list all the jobs in the system. Other flags that allow users to investigate and override the behaviors.
python manage.py job --id 4 --show_script
will print the script for job 4 that is to be executed to the command line. Other flags such as -use_template
and -use_json
allows users to override the data or template loaded into the job.
This can be useful when developing new recipes.
Another handy command:
python manage.py job --next
will execute the next queued job. The job runner may be run periodically with cron.
The Biostar Engine supports uwsgi
. When deployed through
uwsgi
jobs are queued and run automatically through the uwsgi
spooler. See the uwsgi
documentation for details on how to control and customize that process.
[uwsgi]: <https://uwsgi-docs.readthedocs.io/en/latest/
Bioinformatics related recipes are stored and distributed from a separate repository:
Note: The site is designed to execute scripts on a remote server. In addition the site allows users with moderator rights may change the content of these scripts.
It is extremely important to monitor, restrict and guard access to all accounts with moderator privileges!