Pipeline fastqc.py
- Author:
Stephen Sansom
- Release:
$Id$
- Date:
Apr 28, 2026
- Tags:
Python
Overview
The pipeline runs the FastQC quality control tool and post-processes the output for downstream visualisation.
Configuration
The pipeline requires a configured pipeline_fastqc.yml file.
A default configuration file can be generated by executing:
txseq fastqc config
Inputs
The pipeline requires the following inputs:
samples.tsv: see Configuration files
libraries.tsv: see Configuration files
Requirements
The following software is required:
FastQC
Output files
The pipeline produces the following outputs:
FastQC results: one set per FASTQ file, written to the “fastqc.dir” sub-folder
An SQLite database: a file named “csvdb” which contains summary tables of the FastQC results, e.g. for plotting in R.
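The summary tables can also be inspected from Python before plotting. The following is a minimal sketch that uses only the standard-library sqlite3 module. The helper name list_tables is hypothetical and not part of txseq, and no table names are assumed, because the tables stored in “csvdb” are not listed here; the function simply reports whatever tables it finds.

```python
import os
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in an SQLite database, sorted by name.

    Hypothetical helper for illustration; it is not part of txseq.
    """
    # sqlite3.connect() silently creates a missing file, so check first.
    if not os.path.exists(db_path):
        raise FileNotFoundError(db_path)

    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()

    # Each row is a 1-tuple holding the table name.
    return [name for (name,) in rows]
```

Calling list_tables("csvdb") from the pipeline's working directory should return the available summary tables, which can then be queried with ordinary SQL.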
Note
For quick visualisations of the FastQC results, it is recommended to use MultiQC.
Code
- txseq.pipeline_fastqc.buildFastQCSummaryStatus(infiles, outfile)
Load FastQC status summaries into a single table.
- txseq.pipeline_fastqc.loadFastQC(infile, outfile)
Load FastQC stats into the database.
- txseq.pipeline_fastqc.loadMetadata(infile, outfile)
Load the sample and FASTQ table into the database.
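The idea behind combining per-file status summaries into a single table and loading it into a database can be sketched as follows. This is an illustration under stated assumptions, not the code used by txseq: it assumes the standard FastQC summary.txt layout (three tab-separated fields per line: status, module name, file name), produces a long-format table with one row per module per file, and the function names and the table name fastqc_status_example are hypothetical.

```python
import sqlite3


def parse_summary(text):
    """Parse the contents of one FastQC summary.txt.

    Returns a list of (status, module, filename) tuples. Lines that do not
    have exactly three tab-separated fields are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) == 3:
            rows.append(tuple(fields))
    return rows


def build_status_table(summary_texts):
    """Concatenate the parsed rows of several summaries into one table."""
    rows = []
    for text in summary_texts:
        rows.extend(parse_summary(text))
    return rows


def load_status(rows, con, table="fastqc_status_example"):
    """Insert the rows into an SQLite table (hypothetical table name)."""
    con.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(status TEXT, module TEXT, filename TEXT)"
    )
    con.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    con.commit()
```

A long-format table is used here because it is straightforward to filter and to reshape for plotting; the layout of the tables actually written by the pipeline may differ.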