16. Glossary

BAI

The index file for a file generated in the BAM format. (This is a non-standard file type.)

BAM

Binary version of the Sequence Alignment Map (SAM) format.

BED

Format that defines the data lines displayed in an annotation track.
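
As an illustration, a BED data line is tab-separated and starts with three required fields: chromosome name, start position (0-based), and end position (exclusive). The sketch below reads those three fields from one line. The names `BedInterval` and `parse_bed_line` are hypothetical helpers made up for this example and are not part of Sequana.

```python
from typing import NamedTuple


class BedInterval(NamedTuple):
    """The three required BED fields (hypothetical helper type)."""
    chrom: str
    start: int  # 0-based start position
    end: int    # end position, not included in the interval


def parse_bed_line(line: str) -> BedInterval:
    """Parse the required fields of one tab-separated BED data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED data line needs at least 3 fields")
    # Any optional fields after the third (name, score, ...) are ignored here.
    return BedInterval(fields[0], int(fields[1]), int(fields[2]))


iv = parse_bed_line("chr1\t100\t250\tfeatureA")
length = iv.end - iv.start  # half-open interval, so no +1 is needed
```

Because the start is 0-based and the end is exclusive, the feature length is simply `end - start`.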

DSRC

A compression tool dedicated to FASTQ files.

FASTA

FASTA-formatted sequence files contain either nucleic acid sequences (such as DNA) or protein sequences. A single FASTA file can store multiple sequences.
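
To show how several sequences share one file, here is a minimal sketch of a FASTA reader. It assumes the usual layout: each record starts with a header line beginning with `>`, followed by one or more sequence lines. The function name `parse_fasta` is hypothetical and is not a Sequana function. For real work, a dedicated library is a better choice.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping of sequence identifier to sequence (sketch only)."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The identifier is the first word after '>'; the rest is a description.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks[name] = []
        elif name is not None:
            # A sequence may be wrapped over several lines.
            chunks[name].append(line)
    return {key: "".join(value) for key, value in chunks.items()}


example = """>seq1 first example record
ACGTAC
GTTA
>seq2
MKV
""".splitlines()

records = parse_fasta(example)
```

In this example `records` holds two entries, one nucleotide sequence and one protein sequence, read from the same input.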

GFF

General Feature Format, used to describe genes and other features associated with DNA, RNA, and protein sequences.

JSON

A human-readable data serialization language commonly used in configuration files. See https://en.wikipedia.org/wiki/JSON

Module

A directory that contains a Snakemake rule and an associated README file. This is especially relevant for the Sequana pipelines. See the Developer guide.

SAM

Sequence Alignment Map, a generic nucleotide alignment format that describes the alignment of query sequences or sequencing reads to a reference sequence or assembly.

Snakefile

A file that contains one or several Snakemake rules.

VCF

Variant Call Format, used by the variant calling pipeline.

YAML

A human-readable data serialization language commonly used in configuration files. See https://en.wikipedia.org/wiki/YAML