Pipeline fastqc.py

Author:

Stephen Sansom

Release:

$Id$

Date:

Apr 28, 2026

Tags:

Python

Overview

The pipeline runs the FastQC quality control tool and post-processes the output for downstream visualisation.

Configuration

The pipeline requires a configured pipeline_fastqc.yml file.

A default configuration file can be generated by executing:

txseq fastqc config

Inputs

The pipeline requires the following inputs:

  1. samples.tsv: see Configuration files

  2. libraries.tsv: see Configuration files

Requirements

The following software is required:

  1. FastQC

Output files

The pipeline produces the following outputs:

  1. FastQC results: one set of results per FASTQ file, in the “fastqc.dir” sub-folder

  2. An SQLite database: a file named “csvdb” which contains summary tables of the FastQC results, e.g. for plotting in R.
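
The names of the summary tables are not listed here, so a reasonable first step is to ask the database itself. The following is a minimal sketch using only the Python standard library; the helper name list_tables is hypothetical and not part of txseq, and the path assumes the pipeline was run in the current directory.

```python
import sqlite3


def list_tables(db_path="csvdb"):
    """Return the names of all tables in an SQLite database, sorted by name.

    Hypothetical helper for inspecting the pipeline output; not part of txseq.
    Note that sqlite3.connect() creates an empty database if db_path does
    not exist, in which case an empty list is returned.
    """
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```

Calling list_tables() in the pipeline run directory should show which tables are available before querying them from Python or R.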

Note

For quick visualisation of the FastQC results, MultiQC is recommended.

Code

txseq.pipeline_fastqc.buildFastQCSummaryStatus(infiles, outfile)

Load FastQC status summaries into a single table.
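
To illustrate the idea of combining status summaries, the sketch below merges the text of several FastQC summary.txt files (tab-separated lines of status, module name and file name) into one wide table. It is not the txseq implementation: the function names, the "track" column and the "NA" placeholder are assumptions made for this example, and the layout the pipeline actually writes may differ.

```python
import csv
import io


def parse_summary(text):
    """Yield (status, module) pairs from the text of a FastQC summary.txt."""
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2:  # skip blank or malformed lines
            yield fields[0], fields[1]


def build_status_table(summaries):
    """Combine per-file summaries into a single tab-separated table.

    summaries maps a track name (hypothetical label for one FASTQ file)
    to the text of its summary.txt. The result has one row per track and
    one column per module; a module missing for a track is written as "NA".
    """
    modules = []  # column order follows first appearance
    table = {}
    for track, text in summaries.items():
        table[track] = {}
        for status, module in parse_summary(text):
            if module not in modules:
                modules.append(module)
            table[track][module] = status

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["track"] + modules)
    for track, statuses in table.items():
        writer.writerow([track] + [statuses.get(m, "NA") for m in modules])
    return out.getvalue()
```

A wide layout of this kind makes it easy to spot modules that fail across many samples, which is the sort of overview a single status table is useful for.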

txseq.pipeline_fastqc.loadFastQC(infile, outfile)

Load FastQC stats into the database.

txseq.pipeline_fastqc.loadMetadata(infile, outfile)

Load the sample and fastq table into the database.
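
How txseq performs these loads internally is not described here. As a rough illustration of what loading a tab-separated metadata table into SQLite involves, the sketch below uses only the standard library; load_tsv and quote_ident are hypothetical helpers, and the column names are taken from whatever header the file has rather than from any txseq schema.

```python
import csv
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def load_tsv(con, tsv_path, table):
    """Load a tab-separated file with a header row into an SQLite table.

    Any existing table of the same name is replaced. All columns are
    created without a declared type. Returns the number of rows loaded.
    """
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        rows = [row for row in reader if row]  # drop empty lines

    columns = ", ".join(quote_ident(c) for c in header)
    placeholders = ", ".join("?" for _ in header)
    target = quote_ident(table)

    con.execute(f"DROP TABLE IF EXISTS {target}")
    con.execute(f"CREATE TABLE {target} ({columns})")
    con.executemany(f"INSERT INTO {target} VALUES ({placeholders})", rows)
    con.commit()
    return len(rows)
```

For example, load_tsv(sqlite3.connect("csvdb"), "samples.tsv", "samples") would place the sample table alongside the FastQC summary tables, assuming that table name is free to use.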