Pipeline hisat.py
Overview
This pipeline quantifies gene expression from FASTQ files using Hisat2.
Configuration
The pipeline requires a configured pipeline_hisat.yml file.
Default configuration files can be generated by executing:
python <srcdir>/pipeline_hisat.py config
Inputs
The pipeline requires the following inputs:
samples.tsv: see Configuration files
libraries.tsv: see Configuration files
txseq annotations: the location where pipeline_ensembl.py was run to prepare the annotations.
Hisat index: a hisat2 index built with pipeline_hisat_index.py.
Requirements
The following software is required:
Hisat2
Output files
The pipeline produces the following outputs:
BAM files: these are written to the hisat.dir sub-folder and are named by sample_id.
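As a rough illustration of how the outputs might be collected for downstream use, the sketch below maps each sample_id to its BAM file. The helper name bam_files_by_sample is hypothetical, and the exact file naming (sample_id followed directly by .bam) is an assumption rather than something the pipeline guarantees.

```python
from pathlib import Path


def bam_files_by_sample(run_dir="."):
    """Map sample_id -> BAM path for files found in <run_dir>/hisat.dir.

    Assumption: each BAM file is named "<sample_id>.bam". Adjust the
    suffix handling if the pipeline uses a longer suffix.
    """
    hisat_dir = Path(run_dir) / "hisat.dir"
    suffix = ".bam"
    return {
        path.name[: -len(suffix)]: path
        for path in sorted(hisat_dir.glob("*" + suffix))
    }
```

Calling bam_files_by_sample() from the directory where the pipeline was run would return a dictionary keyed by sample_id, or an empty dictionary if hisat.dir contains no BAM files.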
Code
- txseq.pipeline_hisat.firstPass(infile, sentinel)
Run a first hisat pass to identify novel splice sites.
- txseq.pipeline_hisat.novelSpliceSites(infiles, sentinel)
Collect the novel splice sites into a single file.
- txseq.pipeline_hisat.secondPass(infile, sentinel)
Align reads using HISAT with known and novel junctions.
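The three tasks above describe a two-pass alignment. The following is a minimal sketch of that idea, not the pipeline's actual implementation: the helper names, the single-end input (-U), the decision to discard first-pass alignments, and the lexical sort order are all assumptions made for illustration. The hisat2 options used (-x, -U, -S, --novel-splicesite-outfile, --novel-splicesite-infile, --known-splicesite-infile) are standard HISAT2 command-line options.

```python
def first_pass_cmd(index, fastq, novel_out):
    """Build a hisat2 command that records novel splice sites for one sample."""
    return [
        "hisat2",
        "-x", index,
        "-U", fastq,  # assumption: single-end reads
        "--novel-splicesite-outfile", novel_out,
        "-S", "/dev/null",  # assumption: first-pass alignments are discarded
    ]


def merge_splice_sites(infiles, outfile):
    """Write the de-duplicated union of per-sample splice-site files.

    Lines are compared as whole strings and sorted lexically.
    """
    sites = set()
    for path in infiles:
        with open(path) as handle:
            for line in handle:
                line = line.rstrip("\n")
                if line:
                    sites.add(line)
    ordered = sorted(sites)
    with open(outfile, "w") as out:
        for site in ordered:
            out.write(site + "\n")
    return ordered


def second_pass_cmd(index, fastq, known_sites, novel_sites, sam_out):
    """Build a hisat2 command that aligns with known and novel junctions."""
    return [
        "hisat2",
        "-x", index,
        "-U", fastq,  # assumption: single-end reads
        "--known-splicesite-infile", known_sites,
        "--novel-splicesite-infile", novel_sites,
        "-S", sam_out,
    ]
```

Each command list can be passed to subprocess.run(cmd, check=True). Merging the per-sample files before the second pass means every sample is aligned against the same set of junctions, including those first observed in other samples.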