10. Rules

As of August 2017, Sequana provides about 80 rules; the full list is available in the source code. The rules follow strict conventions, which are explained in the Developer guide section.

Each rule is documented, and a Sphinx extension automatically inserts its docstring into this documentation. For example, the documentation of the rule fastq_sampling looks like this:

A sample from raw FastQ files

Required input:
  • __fastq_sampling__input_fastq: list of fastq.gz files
Required output:
  • __fastq_sampling__output_fastq: list of fastq.gz files
Required configuration:
fastq_sampling:
    N: # number of reads to select

Uses sequana_fastq_head utility.

In order to use a Sequana rule in your pipeline, add this code:

from sequana import snaketools as sm
include: sm.modules["fastq_sampling"]

This statement takes care of locating the rule file on disk. You then need to read the rule documentation and define the required variables in your pipeline. In the example above, the documentation lists two variables that you need to define:

__fastq_sampling__input_fastq
__fastq_sampling__output_fastq

and have a configuration file with:

fastq_sampling:
    N: 1000
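
Forgetting one of these variables or the configuration entry is an easy mistake to make. The following is a minimal sketch in plain Python of a pre-flight check for it. It is not part of Sequana: the helper missing_requirements, the two dictionaries and the file names are hypothetical and used for illustration only, and the variable names are assumed to follow the __<rule>__<name> pattern shown in the rule documentation above.

```python
def missing_requirements(namespace, config, rule, variables, config_keys):
    """Return a list of problems; the list is empty when everything is defined.

    Hypothetical helper, not a Sequana function.
    """
    problems = []
    # Variables are assumed to be named __<rule>__<name>, as in the docstring.
    for suffix in variables:
        name = f"__{rule}__{suffix}"
        if name not in namespace:
            problems.append(f"variable {name} is not defined")
    # The configuration is expected to hold one section per rule.
    section = config.get(rule)
    if not isinstance(section, dict):
        problems.append(f"config section '{rule}' is missing")
    else:
        for key in config_keys:
            if section.get(key) is None:
                problems.append(f"config key '{rule}:{key}' is missing")
    return problems


# Values a pipeline author would define before the include statement
# (file names are made up for this example).
pipeline_namespace = {
    "__fastq_sampling__input_fastq": ["sample_R1.fastq.gz"],
    "__fastq_sampling__output_fastq": ["sampled/sample_R1.fastq.gz"],
}
pipeline_config = {"fastq_sampling": {"N": 1000}}

problems = missing_requirements(
    pipeline_namespace,
    pipeline_config,
    "fastq_sampling",
    variables=["input_fastq", "output_fastq"],
    config_keys=["N"],
)
print(problems or "all requirements defined")
```

Inside a Snakefile, a check of this kind could be called with globals() and the config dictionary just before the include line, so that a missing definition is reported before the rule is loaded.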

Many rules, but not all, are used inside the Sequana pipelines. For instance, the codecs rules (e.g. gz_to_bzip) are used in the standalone applications.

Please see the Pipelines section for the documentation of other rules (e.g. bwa, fastqc, …).