NGSPipeDb: NGS Pipelines & Databases
NGSPipeDb is an automated pipeline for parallel processing of large next-generation sequencing (NGS) datasets and for generating databases from the results. It is built on Snakemake workflows, which makes it easy to use, fast, and highly modular, so experienced users can extend and customize the code.
Quick start
Required
Although this section includes step-by-step instructions, it assumes that the user has a basic understanding of the *nix command line interface. Basic knowledge of Snakemake, conda, and best-practice RNA-Seq analysis is helpful but not required. You can also find some easy-to-learn materials on our "Learning materials" page, for example Linux & shell and RNA-Seq background for beginners.
To get some of the required software packages, we will use the command line tools wget and git. wget is a popular tool for downloading files from the internet. git is a distributed version control system, which we will use to check out the NGSPipeDb code.
Note
These tools are pre-installed on most systems. If you are unsure whether you have wget, enter wget; if the response is wget: command not found, you will have to install wget. Do likewise for git.
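The check described in this note can also be scripted. The following is a minimal sketch, not part of NGSPipeDb: the function name missing_tools is hypothetical, and it relies only on the standard shell built-in command -v to test whether a program is on PATH.

```shell
# Print the name of every given tool that is NOT found on PATH.
# (missing_tools is a hypothetical helper, not an NGSPipeDb command.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check the two tools this guide needs.
missing=$(missing_tools wget git)
if [ -n "$missing" ]; then
    # Unquoted on purpose so several names are printed on one line.
    echo "Please install:" $missing
else
    echo "wget and git are both available"
fi
```

If a tool is reported as missing, install it with your system's package manager before continuing.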
NGSPipeDb relies on the conda package manager for installation and dependency resolution, so you will need to install conda first.
We will use the Miniconda3 distribution of conda to manage all of the software packages that NGSPipeDb depends on.
Use the following commands to retrieve and then run the Miniconda3 installation script:
1. Download Miniconda3 (the first command is for Linux, the second for macOS):
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
2. Install Miniconda3 (run the line that matches the installer you downloaded):
bash Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-MacOSX-x86_64.sh
Important
While running the installation script, follow the prompts on screen and press the Enter key to scroll. Make sure to answer yes when asked if you want to prepend Miniconda3 to PATH. After that, close your terminal and open a new one; conda should now work. Alternatively, run source ~/.bashrc to initialize conda in the current terminal.
3. Test that conda is ready to work by entering conda update conda. Press y to confirm the conda updates.
4. Finally, install mamba: conda install mamba -c conda-forge
Note
Mamba is a fast alternative to conda: a reimplementation of the conda package manager in C++. Mamba is recommended but not required.
Info
You only have to install Miniconda3 once. If you run into any problems with conda, please learn more about it.
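Choosing between the Linux and macOS installers in steps 1 and 2 can be automated. The following is a minimal sketch under stated assumptions: the helper installer_for_os is hypothetical, the machine is assumed to be x86_64 (as in the file names above), and the repo.anaconda.com host from the macOS command is used for both systems. The script only prints the commands so you can review them before running anything.

```shell
# Map an OS name, as printed by `uname -s`, to the installer file name
# used in steps 1 and 2. (installer_for_os is a hypothetical helper.)
installer_for_os() {
    case "$1" in
        Linux)  echo "Miniconda3-latest-Linux-x86_64.sh" ;;
        Darwin) echo "Miniconda3-latest-MacOSX-x86_64.sh" ;;
        *)      return 1 ;;  # no installer listed in this guide
    esac
}

# Dry run: print the download and install commands for this machine.
if installer=$(installer_for_os "$(uname -s)"); then
    echo "wget https://repo.anaconda.com/miniconda/$installer"
    echo "bash $installer"
else
    echo "No installer listed in this guide for $(uname -s)" >&2
fi
```

Copy the two printed lines into your terminal to perform the actual download and installation.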
Installation
- Install from PyPI
pip3 install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple ngspipedb
- Install from conda
conda install
Command line interface
ngspipedb -h
Usage: ngspipedb [OPTIONS] COMMAND [ARGS]...
ngspipedb is a snakemake-based tool for reproducible next generation
sequencing (NGS) data analysis and interactive web application auto-build.
Example:
ngspipedb env create -n ngspipe-rnaseq-basic
ngspipedb download -n ngspipe-rnaseq-basic -t testdata
ngspipedb startproject myprojectname -n ngspipe-rnaseq-basic
ngspipedb runpipe myprojectname ngspipe-rnaseq-basic --report -db
ngspipedb rundb serve test_pipeline/ngspipe-rnaseq-basic/result_Sep-06-2021/ngsdb_code/manage.py -up 0.0.0.0:8000
A more detailed tutorial of how to use this toolkit can be found here:
https://xuanblo.github.io/NGSPipeDb/
Options:
--version Show the version and exit.
-h, --help Show this message and exit.
Commands:
download data retrive related commands
env ngspipedb environment related commands
rundb generate a database related commands
runpipe run ngspipe
startproject Creates a ngspipedb project directory structure for the given
project name in the current directory or optionally in the
given directory.
Usage example
Run an RNA-Seq analysis, generate a report, and build an RNA-Seq database
Step 1. Download and unpack the test data:
ngspipedb download -n ngspipe-rnaseq-basic -t testdata && tar -zxvf testdata-ngspipe-rnaseq-basic.tar.gz
Step 2. Run the RNA-Seq analysis on the test data:
ngspipedb runpipe mouse_rnaseq_analysis -n ngspipe-rnaseq-basic --genomeFasta testdata-ngspipe-rnaseq-basic/genome/chr19.fa --genomeAnno testdata-ngspipe-rnaseq-basic/genome/GRCm38.83.chr19.gtf --samplefile testdata-ngspipe-rnaseq-basic/rawdata/sample.csv --conditionfile testdata-ngspipe-rnaseq-basic/rawdata/condition.csv --rawreadsdir testdata-ngspipe-rnaseq-basic/rawdata -j 10 --report -db
Step 3. Start the ngsdb server:
ngspipedb rundb serve -m mouse_rnaseq_analysis/result/ngsdb_code/manage.py -up 127.0.0.1:8000
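The step 2 command is long because every input path repeats the test data directory. The following is a minimal sketch of one way to keep it readable: the helper build_runpipe_cmd and the variable names are hypothetical, while the flags and paths are exactly those shown in step 2. The script only prints the assembled command and does not run ngspipedb.

```shell
# Directory created by unpacking the test data archive in step 1.
data=testdata-ngspipe-rnaseq-basic

# Assemble the step 2 command from a project name and the value passed
# to -j. (build_runpipe_cmd is a hypothetical helper; all flags are
# copied from step 2.)
build_runpipe_cmd() {
    project=$1
    jobs=$2
    echo ngspipedb runpipe "$project" -n ngspipe-rnaseq-basic \
        --genomeFasta "$data/genome/chr19.fa" \
        --genomeAnno "$data/genome/GRCm38.83.chr19.gtf" \
        --samplefile "$data/rawdata/sample.csv" \
        --conditionfile "$data/rawdata/condition.csv" \
        --rawreadsdir "$data/rawdata" \
        -j "$jobs" --report -db
}

# Dry run: print the command for the example project.
build_runpipe_cmd mouse_rnaseq_analysis 10
```

Changing the data variable is then enough to point the same command at a different input directory with the same layout.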