Description
When processing SRA RNA-seq FASTQ files with Fastool as part of the Trinity package, Fastool appends /H to the end of each sequence ID, which then causes errors downstream in Trinity.
Here are the first few lines of an SRA file: https://gist.github.com/tsackton/8c5508a4b60a1e33f6f2
When I run fastool --to-fasta --illumina-trinity sra_test.fq > sra_test.1.fa, the output headers look like this:
SRR488565.1/H
SRR488565.2/H
SRR488565.3/H
SRR488565.4/H
SRR488565.5/H
SRR488565.6/H
If I remove everything after the first space in each header of the SRA example (with seqtk seq -C), the output is normal:
SRR488565.1
SRR488565.2
SRR488565.3
SRR488565.4
SRR488565.5
SRR488565.6
The files with /H headers do not work with Trinity, while the files processed with seqtk seq -C do.
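For anyone without seqtk at hand, the same workaround (dropping everything after the first space in each header) can be sketched in Python. This is only a rough sketch, not part of Fastool or Trinity: the function names `strip_header_comments` and `rewrite_file` are made up for illustration, and it assumes strict four-line FASTQ records (no wrapped sequence or quality lines). The `+` separator line is passed through unchanged.

```python
from typing import Iterable, Iterator


def strip_header_comments(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTQ lines with the comment part of each header removed.

    Assumes strict 4-line records, so the header is every 4th line.
    The line position is used rather than a leading '@', because
    quality strings may also start with '@'.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            # Keep only the text before the first whitespace character.
            fields = line.split(None, 1)
            if fields:
                line = fields[0]
        yield line


def rewrite_file(in_path: str, out_path: str) -> None:
    """Write a copy of in_path with header comments stripped (hypothetical helper)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in strip_header_comments(src):
            dst.write(line + "\n")
```

Calling `rewrite_file("sra_test.fq", "sra_test.clean.fq")` before running fastool should give input comparable to the seqtk seq -C result, though I have only checked that against the header behaviour described here (the output file name is just an example).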
This was tested with the latest version of Fastool, compiled on CentOS 6 with gcc 4.8.2.