Building transcriptome indexes ============================== Introduction ------------ Txseq uses `Hisat2 `_ for alignment-based gene expression quantification and `Salmon `_ for quasi-alignment based gene expression quantification. To use these tools, it is first necessary to build method-specific transcriptomes indexes. As with preparation of the sanitised genome sequences and gene annotations, it is recommended to build transcriptome indexes in a central location for use in multiple projects. Txseq has dedicated pipelines for building Salmon and Hisat2 indexes which can be used as follows. .. note:: If you are using the KIR BMRC workspace, Salmon and Hisat index built with txseq can be found in the "/well/kir/projects/mirror/txseq/" directory. Building a Salmon Index ----------------------- In a suitably named directory, obtain the :doc:`pipeline_salmon_index.py ` configuration file with the following command:: mkdir salmon.index.dir cd salmon.index.dir txseq salmon_index config emacs pipeline_salmon_index.yml After editing the yaml to point to the location of the "txseq.genome.fa.gz" and "txseq.transcript.fa.gz" files (see the "Genomes and Annotations" section), and configuring the parameters as desired, the pipeline can be executed with the following command:: txseq salmon_index make full -v5 -p20 Building a Hisat2 Index ----------------------- In a suitably named directory, obtain the :doc:`pipeline_hisat_index.py ` configuration file with the following command:: mkdir hisat.index.dir cd hisat.index.dir txseq hisat_index config emacs pipeline_hisat_index.yml After editing the yaml to point to the location of the "txseq.genome.fa.gz" and "txseq.geneset.gtf.gz" files (see the "Genomes and Annotations" section), and configuring the parameters as desired, the pipeline can be executed with the following command:: txseq hisat_index make full -v5 -p20