.. _pipelines:

Pipelines
##############

Sequana ships many pipelines related to NGS. Some are very simple (fastqc,
demultiplex, downsampling), others are real-life NGS pipelines used in
production.

.. warning:: Since v0.8.0, pipelines are independent of **Sequana**. They
   must be installed separately, and their dependencies must also be
   installed by the user/developer.

Quick Start
===========

If **Sequana** is installed, installing a pipeline is straightforward. For
example, to install the variant calling pipeline::

    pip install sequana_variant_calling --upgrade

Since version 0.8.1, you can check whether the required dependencies are
available. If some are missing, an error message will appear anyway::

    sequana_variant_calling --deps

Once those dependencies are available, you can run the pipeline::

    sequana_variant_calling --help

Overview
========

In **Sequana** parlance, a pipeline is an application based on Snakemake that
consists of a Snakefile and a configuration file. Since v0.8.0, a pipeline may
also include optional files such as a schema to check the config file, a logo,
a DAG image representing the pipeline, a requirements file listing the
external dependencies, and so on.

All pipelines are based on Snakemake. For a tutorial, see the Snakemake page
or online tutorials (e.g. http://slowkow.com/notes/snakemake-tutorial/).

.. note:: **Pipeline naming convention**

   A pipeline is named **sequana_pipelines_name**, where name is to be
   replaced by the pipeline name. The name can contain underscores. For
   instance, the **variant_calling** pipeline is called
   **sequana_pipelines_variant_calling**.

   We also provide aliases: pipelines usually have a short name where
   **_pipelines** is dropped. So you can refer to a pipeline as
   **sequana_pipelines_variant_calling** or **sequana_variant_calling**.

   The reason for having both long and short versions is to avoid name
   conflicts with Sequana standalones. For instance, the
   **sequana_coverage** tool exists. It is a standalone that studies the
   coverage of a single sample. We also have a pipeline to analyse several
   samples in parallel. Therefore the **sequana_pipelines_coverage** pipeline
   has no alias.

   Future versions will use the short version only.

Installation
============

Given its name, and provided you have installed Sequana, you can install the
pipeline **name** using::

    pip install sequana_name

where name is replaced by the pipeline name. For instance::

    pip install sequana_fastqc

To stay up to date, add the ``--upgrade`` argument::

    pip install sequana_fastqc --upgrade

Usage
======

Each pipeline is different. We recommend reading the complementary section
:ref:`facilitator`. Generally speaking, the ``--help`` argument should be
sufficient to run most of the pipelines::

    sequana_name --help

The input arguments ``--input-directory``, ``--input-pattern`` and
``--input-readtag`` help you select the input data for the pipeline. Beyond
that, consult the help and the documentation of each pipeline. Each pipeline
has its own repository and living documentation, available through the links
below.

List pipelines
==============

This is a non-exhaustive list of pipelines:

.. toctree::
    :maxdepth: 1

    pipeline_demultiplex.rst
    pipeline_fastqc.rst
    pipeline_mapper.rst
    pipeline_ribofinder.rst
    pipeline_pacbio_qc.rst
    pipeline_quality_control.rst
    pipeline_rnaseq.rst
    pipeline_vc.rst

Please see the https://github.com/sequana organisation to get the full list.
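
The naming convention described in the Overview can be sketched in a few lines
of Python. This is only an illustration: ``pipeline_names`` and ``NO_ALIAS``
are hypothetical names that do not exist in Sequana, and the exception set
holds only the coverage case mentioned in the note.

```python
from typing import List

# Hypothetical: pipelines whose short alias would clash with a Sequana
# standalone. Only "coverage" is mentioned in this page.
NO_ALIAS = {"coverage"}


def pipeline_names(name: str) -> List[str]:
    """Return the names a pipeline can be referred to by.

    The long name always exists; the short alias drops "_pipelines"
    and is omitted when it would clash with a standalone tool.
    """
    long_name = f"sequana_pipelines_{name}"
    if name in NO_ALIAS:
        return [long_name]
    return [long_name, f"sequana_{name}"]


# Print the candidate names for two pipelines discussed above.
for pipeline in ("variant_calling", "coverage"):
    print(pipeline, "->", ", ".join(pipeline_names(pipeline)))
```

The short name, when it exists, is also the one used in the ``pip install``
and command-line examples of this page.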