NGSPipeDb: NGS Pipelines & Databases

NGSPipeDb is an automated pipeline for processing large next-generation sequencing (NGS) datasets in parallel and generating databases from the results. It is built on the snakemake workflow system, which makes it easy to use, fast, and highly modular, so experienced users can extend and customize it.

Quick start

Required

Although this section includes step-by-step instructions, it assumes that the user has a basic understanding of the *nix command line interface. Basic knowledge of snakemake, conda, and best-practice RNA-Seq analysis is helpful but not required. You can also find some easy-to-learn materials on our "Learning materials" page, for example Linux & shell and RNA-Seq background for beginners.

Install wget and git

To get some of the required software packages, we will use the command line tools wget and git. wget is a popular tool for downloading files from the internet. git is a distributed version control system, which we will use to check out the NGSPipeDb code.

Note

These tools are pre-installed on most systems. If you are unsure whether you have wget, enter wget at the prompt; if the response is wget: command not found, you will need to install it. Do the same for git.
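The check described in the note can also be scripted. The following is a minimal sketch that uses the shell built-in command -v to test for both tools at once; the function name check_tool is illustrative and not part of NGSPipeDb.

```shell
# Report whether a given tool is available on PATH.
# check_tool is a hypothetical helper name used only for this example.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: command not found"
    fi
}

# Check both tools needed for the installation steps.
for tool in wget git; do
    check_tool "$tool"
done
```

Any tool reported as not found has to be installed with your system's package manager before you continue.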

Install Miniconda3

NGSPipeDb relies on the conda package manager for installation and dependency resolution, so you will need to install conda first.

We will use Miniconda3, a minimal installer for conda, to manage all of the software packages that NGSPipe depends on.

Use the following commands to retrieve and then run the Miniconda3 installation script:

1. Download Miniconda3

Linux & WSL

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

MacOSX

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh

2. Install Miniconda3

Linux & WSL

bash Miniconda3-latest-Linux-x86_64.sh

MacOSX

bash Miniconda3-latest-MacOSX-x86_64.sh

Important

While running the installation script, follow the prompts on screen and press the Enter key to scroll. Make sure to answer yes when asked if you want to prepend Miniconda3 to PATH. Then close your terminal and open a new one; conda should now be available. Alternatively, run source ~/.bashrc to initialize conda in the current terminal.

3. Test that conda is ready to work by entering: conda update conda. Press y to confirm the conda updates.

4. Finally, install mamba: conda install mamba -c conda-forge.

Note

Mamba is a fast alternative to conda: a reimplementation of the conda package manager in C++. Mamba is recommended but not required.
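Because mamba is optional, a setup script can fall back to conda when mamba is absent. This is a minimal sketch; the function name pick_conda_frontend and the variable CONDA_FRONTEND are illustrative assumptions, not part of NGSPipeDb.

```shell
# Prefer mamba when it is installed; otherwise fall back to conda.
# pick_conda_frontend is a hypothetical helper name used only for this example.
pick_conda_frontend() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    else
        echo conda
    fi
}

CONDA_FRONTEND="$(pick_conda_frontend)"
echo "Using: $CONDA_FRONTEND"
```

Later package installations in the same script could then call "$CONDA_FRONTEND" install instead of hard-coding either tool.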

Info

You only have to install Miniconda3 once. If you run into any problems with conda, please consult the conda documentation to learn more.
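Since the installation is needed only once, a setup script can first check for an existing conda and otherwise select the installer that matches the platform. The sketch below assumes an x86_64 machine, as the commands above do, and relies on uname -s reporting Linux (including under WSL) or Darwin (macOS). The function name installer_for is illustrative, and the download commands are left commented out so that the script only reports what it would do.

```shell
# Map the kernel name reported by `uname -s` to the installer file name.
# installer_for is a hypothetical helper name used only for this example.
installer_for() {
    case "$1" in
        Linux)  echo "Miniconda3-latest-Linux-x86_64.sh" ;;
        Darwin) echo "Miniconda3-latest-MacOSX-x86_64.sh" ;;
        *)      echo "unsupported" ;;
    esac
}

if command -v conda >/dev/null 2>&1; then
    echo "conda is already installed; nothing to do"
else
    installer="$(installer_for "$(uname -s)")"
    echo "conda not found; installer for this platform: $installer"
    # To actually download and install, uncomment the next two lines:
    # wget "https://repo.anaconda.com/miniconda/$installer"
    # bash "$installer"
fi
```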

Installation

Install from PyPI

pip3 install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple ngspipedb

Install from conda
```
conda install
```

Command line interface

ngspipedb -h

Usage: ngspipedb [OPTIONS] COMMAND [ARGS]...

  ngspipedb is a snakemake-based tool for reproducible next generation
  sequencing (NGS) data analysis and interactive web application auto-build.

  Example:
  ngspipedb env create -n ngspipe-rnaseq-basic
  ngspipedb download -n ngspipe-rnaseq-basic -t testdata
  ngspipedb startproject myprojectname -n ngspipe-rnaseq-basic
  ngspipedb runpipe myprojectname ngspipe-rnaseq-basic --report -db
  ngspipedb rundb serve test_pipeline/ngspipe-rnaseq-basic/result_Sep-06-2021/ngsdb_code/manage.py -up 0.0.0.0:8000

  A more detailed tutorial of how to use this toolkit can be found here:
  https://xuanblo.github.io/NGSPipeDb/

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  download      data retrive related commands
  env           ngspipedb environment related commands
  rundb         generate a database related commands
  runpipe       run ngspipe
  startproject  Creates a ngspipedb project directory structure for the given
                project name in the current directory or optionally in the
                given directory.
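To preview a sequence of ngspipedb commands before running them, one option is a small dry-run wrapper. This is only a sketch: the helper run and the variable DRY_RUN are hypothetical names introduced for illustration, and the three commands are copied from the Example section of the help text above.

```shell
# Hypothetical helper: print a command instead of executing it,
# unless DRY_RUN is set to 0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# The first three commands from the Example section of the help text.
run ngspipedb env create -n ngspipe-rnaseq-basic
run ngspipedb download -n ngspipe-rnaseq-basic -t testdata
run ngspipedb startproject myprojectname -n ngspipe-rnaseq-basic
```

By default the script only prints each command prefixed with +; setting DRY_RUN=0 in the environment executes them instead.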

Usage example

Run an RNA-Seq analysis, generate a report, and build an RNA-Seq database

Step 1. Download the test data:

ngspipedb download -n ngspipe-rnaseq-basic -t testdata && tar -zxvf testdata-ngspipe-rnaseq-basic.tar.gz

Step 2. Run the RNA-Seq analysis on the test data:

ngspipedb runpipe mouse_rnaseq_analysis -n ngspipe-rnaseq-basic --genomeFasta testdata-ngspipe-rnaseq-basic/genome/chr19.fa --genomeAnno testdata-ngspipe-rnaseq-basic/genome/GRCm38.83.chr19.gtf --samplefile testdata-ngspipe-rnaseq-basic/rawdata/sample.csv --conditionfile testdata-ngspipe-rnaseq-basic/rawdata/condition.csv --rawreadsdir testdata-ngspipe-rnaseq-basic/rawdata -j 10 --report -db

Step 3. Start the ngsdb server:

ngspipedb rundb serve -m mouse_rnaseq_analysis/result/ngsdb_code/manage.py -up 127.0.0.1:8000
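The runpipe command in the second step references several files from the extracted test data, so it can help to confirm that they exist before starting a long run. The following is a minimal sketch of such a check; the function name missing_inputs and the variable DATA are illustrative, and the file paths are taken from the runpipe command above.

```shell
# Directory created by extracting the test data archive.
DATA=testdata-ngspipe-rnaseq-basic

# Print each expected input file that is missing under the given directory.
# missing_inputs is a hypothetical helper name used only for this example.
missing_inputs() {
    for path in genome/chr19.fa genome/GRCm38.83.chr19.gtf \
                rawdata/sample.csv rawdata/condition.csv; do
        [ -e "$1/$path" ] || echo "$1/$path"
    done
}

missing="$(missing_inputs "$DATA")"
if [ -n "$missing" ]; then
    echo "Missing inputs (has the test data been downloaded and extracted?):"
    echo "$missing"
else
    echo "All inputs present; ready to run the analysis."
fi
```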