Genome Analysis Unit

About Salmon Scatter-Process-Gather Workflow …


Salmon Scatter-Process-Gather Workflow (salmon_spg_wf) is a DNAnexus applet that processes a batch of paired-end FASTQ read files and runs Salmon to produce expression count files.

 

Required Input Files


  • FASTQ Gzip Compressed Paired-end Files – A batch of paired-end sample read files named sample_name_R1.fastq.gz and sample_name_R2.fastq.gz, where sample_name is a unique sample name containing alphanumeric characters and no spaces.
  • Salmon Index tar.gz File – A Salmon-indexed genome archive named genome_name_salmon_idx.tar.gz, where genome_name is a unique genome name containing alphanumeric characters and no spaces. This file is generated using salmon_indexer.
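The naming convention above can be checked before a batch is submitted. The following is a minimal sketch, not part of salmon_spg_wf itself: the function name pair_fastq_files is hypothetical, and allowing underscores inside sample_name is an assumption, since the text only says "alphanumeric characters and no spaces".

```python
import re
from collections import defaultdict

# Assumed pattern: a sample name of letters, digits, and underscores,
# followed by _R1 or _R2 and the .fastq.gz extension.
FASTQ_PATTERN = re.compile(
    r"^(?P<sample>[A-Za-z0-9_]+)_R(?P<read>[12])\.fastq\.gz$"
)


def pair_fastq_files(filenames):
    """Group file names into {sample_name: (R1_file, R2_file)}.

    Raises ValueError if a name does not follow the convention or if a
    sample lacks one of its two mate files.
    """
    mates = defaultdict(dict)
    for name in filenames:
        match = FASTQ_PATTERN.match(name)
        if match is None:
            raise ValueError(f"Unexpected FASTQ file name: {name!r}")
        mates[match["sample"]][match["read"]] = name

    pairs = {}
    for sample, reads in sorted(mates.items()):
        if set(reads) != {"1", "2"}:
            raise ValueError(f"Sample {sample!r} is missing a mate file")
        pairs[sample] = (reads["1"], reads["2"])
    return pairs
```

Calling pair_fastq_files on the list of file names in a batch returns one (R1, R2) tuple per sample, or raises an error naming the first file or sample that breaks the convention.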

 

Output Files


  • Salmon Results Directory tar.gz File – A file named sample_name_salmon.tar.gz. This is a tar.gz-compressed directory that must be expanded with the command tar -xzf tarfile. It is provided in case you wish to do custom analysis; otherwise, it can be ignored.
  • Salmon’s quant.sf File – A file named sample_name_quant.sf. This file contains the expression counts.
  • Kallisto’s abundance.h5 File – A file named sample_name_abundance.h5. This file is converted from the sample_name_quant.sf file into Kallisto’s Hierarchical Data Format (HDF) file format.
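For custom analysis, the outputs above can also be handled from Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the internal layout of the results archive is not described here and is left untouched, and the quant.sf reader relies on Salmon's standard tab-separated header (Name, Length, EffectiveLength, TPM, NumReads).

```python
import csv
import tarfile


def extract_results(tar_path, dest_dir="."):
    """Unpack a results archive, similar to `tar -xzf tar_path -C dest_dir`."""
    with tarfile.open(tar_path, "r:gz") as archive:
        archive.extractall(dest_dir)


def read_num_reads(quant_path):
    """Return {transcript_name: NumReads} from a Salmon quant.sf file.

    Assumes the standard tab-separated quant.sf header with "Name" and
    "NumReads" columns.
    """
    with open(quant_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["Name"]: float(row["NumReads"]) for row in reader}
```

The NumReads values are Salmon's estimated read counts per transcript, so they are read as floats rather than integers.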

 

For NCI Members


 

Developed by GAU