

Sargasso disambiguates mixed-species RNA-seq data.

Sargasso* is a tool to separate mixed-species RNA-seq reads according to their species of origin.

RNA-sequencing has become an important technique in cellular biology for characterising and quantifying the transcriptomes of particular species, and for analysing the differential expression of genes and transcripts. However, a number of recent experimental techniques produce RNA-seq data originating from a mixture of species. Researchers have previously developed ad hoc solutions to separate such RNA-seq reads into species-specific sets; Sargasso is an efficient, reliable tool to perform this task, achieving high specificity and sensitivity even for closely related species, while requiring minimal setup and intervention by the user.

Given an input set of FASTQ files for a number of samples, each of which contains mixed-species RNA-seq read data, Sargasso separates reads according to their true species of origin, and outputs per-sample BAM files describing the mapping of these reads to the respective genomes. While the tool allows the user fine control at each stage, in normal usage the whole pipeline can be executed automatically with a single command.
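As a rough illustration of what "separating reads by species of origin" means, the toy sketch below bins reads by comparing how well each one aligns to each candidate genome. It is not Sargasso's actual algorithm or API: the function names, the `ReadHits` structure, and the rule "fewest mismatches wins, ties are ambiguous" are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical per-read summary (an assumption for this sketch):
# species name -> number of mismatches in the read's best alignment
# to that species' genome. A species is absent if the read did not map to it.
ReadHits = Dict[str, int]


def assign_species(hits: ReadHits) -> Optional[str]:
    """Return the species with strictly fewest mismatches, or None if unclear."""
    if not hits:
        return None  # the read mapped to no genome
    ranked = sorted(hits.items(), key=lambda item: item[1])
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two best species: treat as ambiguous
    return ranked[0][0]


def separate_reads(reads: Dict[str, ReadHits]) -> Dict[str, List[str]]:
    """Group read IDs by assigned species; unclear reads go under 'ambiguous'."""
    bins: Dict[str, List[str]] = {}
    for read_id, hits in reads.items():
        species = assign_species(hits)
        bins.setdefault(species or "ambiguous", []).append(read_id)
    return bins


if __name__ == "__main__":
    example = {
        "read1": {"mouse": 0, "rat": 3},  # clearly better in mouse
        "read2": {"mouse": 2, "rat": 2},  # equally good in both
        "read3": {"rat": 1},              # maps to rat only
    }
    print(separate_reads(example))
```

In this sketch, ambiguous reads are set aside rather than forced into one species, which favours specificity over sensitivity; a real tool would work from aligner output (for example BAM records) and may apply quite different criteria.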

For further details, please see below:

  1. Installation
  2. Example usage
  3. Pipeline description
  4. Usage reference
  5. Support scripts
  6. References

* Sargasso Assigns Reads to Genomes According to Species-Specific Origin