This repository has been archived by the owner on Jul 16, 2024. It is now read-only.
Commit
Merge pull request #42 from moka-guys/remove_osbolete_scripts
Remove obsolete scripts - backup runfolder now in automated_scripts, … (#42)
Showing 7 changed files with 51 additions and 579 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,2 @@ | ||
*.pyc | ||
wscleaner/wscleaner/config.json | ||
wscleaner/test/test_dir*.txt | ||
wscleaner/test/data |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,75 +1,70 @@ | ||
# Workstation Housekeeping v1.11 | ||
|
||
Scripts to manage data on the NGS workstation | ||
|
||
--- | ||
|
||
## backup_runfolder.py | ||
|
||
Uploads an Illumina runfolder to DNANexus. | ||
|
||
### Usage | ||
|
||
```bash | ||
backup_runfolder.py [-h] -i RUNFOLDER [-a AUTH_TOKEN] [--ignore IGNORE] [-p PROJECT] [--logpath LOGPATH] | ||
``` | ||
|
||
### What are the dependencies for this script? | ||
## Workstation Cleaner (wscleaner) | ||
|
||
This tool requires the DNAnexus utilities `ua` (upload agent) and `dx` (DNAnexus toolkit) to be available in the system PATH. Python3 is required, and this tool uses packages from the standard library. | ||
Workstation Cleaner (wscleaner) deletes local directories that have been uploaded to the DNAnexus cloud storage service. | ||
|
||
### How does this tool work? | ||
When executed, runfolders in the input (root) directory are deleted if they meet the following criteria: | ||
|
||
* The script parses the input parameters, asserting that the given runfolder exists. | ||
* If the `-p` option is given, the script attempts to find a matching DNAnexus project. Otherwise, it looks for a single project matching the runfolder name. If more or less than 1 project matches, the script logs an error and exits. | ||
* The runfolder is traversed and a list of files in each folder is obtained. If any of the comma-separated strings passed to the `--ignore` argument is present within the filepath or filename, the file is excluded. | ||
* A single DNAnexus project is found matching the runfolder name | ||
* All local FASTQ files are uploaded and in a 'closed' state | ||
* X logfiles are present in the DNAnexus project /Logfiles directory (NB: X can be set with the `--logfile-count` command line argument; the default is 5) | ||
|
||
* The DNAnexus `ua` utility is used to upload files in batches of 100 at a time. The number of upload tries is set to 100 with the `--tries` flag. | ||
* Orthogonal tests are performed to: | ||
* A count of files that should be uploaded (using the ignore terms if provided) | ||
* A count of files in the DNA Nexus project | ||
* (If relevant) A count of files in the DNA Nexus project containing a pattern to be ignored. NB this may not be accurate if the ignore term is found in the result of dx find data (eg present in project name) | ||
* Logs from this and the script are written to a logfile, named "runfolder_backup_runfolder.log". A destination for this file can be passed to the `--logpath` flag. | ||
or if the run is identified as a TSO500 run, based on: | ||
* the bcl2fastq2_output.log file created by the automated scripts | ||
AND | ||
* Presence of `_TSO` in the human readable DNANexus project name | ||
|
||
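The deletion criteria listed above for wscleaner can be read as a single boolean decision. The following is a minimal sketch of that reading, not the wscleaner implementation: `RunfolderStatus`, `is_tso500` and `is_deletable` are hypothetical names, the logfile check is assumed to mean "at least X", and the TSO500 evidence from `bcl2fastq2_output.log` is modelled as a flag supplied by the caller because the README does not say how the file is inspected.

```python
from dataclasses import dataclass


@dataclass
class RunfolderStatus:
    """Hypothetical summary of the checks made for one runfolder."""
    matching_projects: int               # DNAnexus projects matching the runfolder name
    fastqs_uploaded_and_closed: bool     # all local FASTQs uploaded and in a 'closed' state
    logfile_count: int                   # files found in the project's /Logfiles directory
    bcl2fastq_log_indicates_tso: bool = False  # assumption: derived from bcl2fastq2_output.log
    project_name: str = ""               # human-readable DNAnexus project name


def is_tso500(status):
    """A run counts as TSO500 only when both indicators agree (the 'AND' in the list)."""
    return status.bcl2fastq_log_indicates_tso and "_TSO" in status.project_name


def is_deletable(status, required_logfiles=5):
    """Return True if the runfolder meets the standard criteria or is a TSO500 run."""
    standard = (
        status.matching_projects == 1
        and status.fastqs_uploaded_and_closed
        # Assumption: 'X logfiles are present' is treated as 'at least X'.
        and status.logfile_count >= required_logfiles
    )
    return standard or is_tso500(status)
```

Whether a TSO500 run must also satisfy the single-project check is not stated in the text, so the sketch treats the TSO500 branch as a standalone alternative.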
--- | ||
A DNAnexus API key must be cached locally using the `--set-key` option. | ||
|
||
## findfastqs.sh | ||
## Workstation Environment | ||
The directory `env/` in this repository contains conda environment scripts for the workstation. These remove conflicts in the PYTHONPATH environment variable by editing the variable when conda is activated. The conda documentation describes where to place these scripts under ['saving environment variables'](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#macos-and-linux). | ||
|
||
Report the number of gzipped fastq files in an Illumina runfolder. | ||
## Install | ||
As described above, two environments exist on the workstation: wscleaner and wscleaner_test (for development work). | ||
Activate one of these environments before installing with pip (as below). | ||
|
||
### Usage | ||
|
||
```bash | ||
$ findfastqs.sh RUNFOLDER | ||
>>> RUNFOLDER has 156 demultiplexed fastq files with 2 undetermined. Total: 158 | ||
git clone https://github.com/moka-guys/workstation_housekeeping.git | ||
pip install workstation_housekeeping/wscleaner | ||
wscleaner --version # Print version number | ||
``` | ||
|
||
--- | ||
|
||
## Workstation Cleaner (wscleaner) | ||
|
||
Delete local directories that have been uploaded to the DNAnexus cloud storage service. | ||
See wscleaner readme for more info | ||
|
||
## ngrok_start.sh | ||
|
||
Allow SSH access to the system by running ngrok as a background process. As of v1.11, a dockerised ngrok instance is also supported. | ||
|
||
### Installation | ||
|
||
See knowledge base article for ngrok installation. | ||
## Automated usage | ||
The script `wscleaner_command.sh` is called by the crontab. It activates the environment and passes the logfile path (and any other non-default arguments). | ||
A development command script, `wscleaner_command_dev.sh`, can be used to call the test environment and provide testing arguments, e.g. `--dry-run`. | ||
|
||
### Usage | ||
|
||
Non-dockerised ngrok: | ||
## Manual Usage | ||
|
||
`sudo bash ngrok_start.sh` | ||
|
||
Dockerised ngrok: | ||
``` | ||
usage: wscleaner [-h] [--auth AUTH] [--dry-run] [--logfile LOGFILE] | ||
[--min-age MIN_AGE] [--logfile-count LOGFILE_COUNT] | ||
[--version] | ||
root | ||
positional arguments: | ||
root A directory containing runfolders to process | ||
optional arguments: | ||
-h, --help show this help message and exit | ||
--auth AUTH A text file containing the DNANexus authentication | ||
token | ||
--dry-run Perform a dry run without deleting files | ||
--logfile LOGFILE A path for the application logfile | ||
--min-age MIN_AGE The age (days) a runfolder must be to be deleted | ||
--logfile-count LOGFILE_COUNT | ||
The number of logfiles a runfolder must have in | ||
/Logfiles | ||
--version Print version | ||
``` | ||
|
||
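The usage text above has the shape of `argparse` help output. As an illustration only, a parser producing a similar interface might be declared as below. This is a sketch, not the wscleaner source: `build_parser` is a hypothetical name, the integer types are assumptions, the version string is a placeholder, and only the `--logfile-count` default of 5 comes from the README.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options shown in the usage text (hypothetical)."""
    parser = argparse.ArgumentParser(prog="wscleaner")
    parser.add_argument("root", help="A directory containing runfolders to process")
    parser.add_argument(
        "--auth", help="A text file containing the DNANexus authentication token"
    )
    parser.add_argument(
        "--dry-run", action="store_true", help="Perform a dry run without deleting files"
    )
    parser.add_argument("--logfile", help="A path for the application logfile")
    # Assumption: the age is a whole number of days; no default is stated in the README.
    parser.add_argument(
        "--min-age", type=int, help="The age (days) a runfolder must be to be deleted"
    )
    # The default of 5 is taken from the criteria list in the README.
    parser.add_argument(
        "--logfile-count",
        type=int,
        default=5,
        help="The number of logfiles a runfolder must have in /Logfiles",
    )
    # Placeholder version string; the real version number is not given here.
    parser.add_argument(
        "--version", action="version", version="%(prog)s (version placeholder)"
    )
    return parser


# Example: parse an explicit argument list rather than sys.argv.
args = build_parser().parse_args(["/path/to/runfolders", "--dry-run"])
print(args.root, args.dry_run, args.logfile_count)
```

Passing `--dry-run` first, as the development command script does, is a low-risk way to check which runfolders would be selected before anything is deleted.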
`sudo bash ngrok_start.sh docker` | ||
## Test | ||
|
||
### output | ||
```bash | ||
# Run from the cloned repo directory after installation | ||
pytest . --auth_token DNA_NEXUS_KEY | ||
``` | ||
|
||
The script will output the ngrok connection details | ||
## License | ||
|
||
Developed by Synnovis Genome Informatics |