I am pretty desperate, so if anyone has any advice, I will be forever grateful.
Basically, I am using the following command:
htseq-count -s no -a 0 FourA.sam hg19.gtf > FourA.count
and getting zero counts in the output. I have tried -s as both yes and no (I am almost sure it should be "yes" based on the site I got my files from), and -a between 0 and 10. No matter which parameters I use, I get zero counts.
This is for a quick project, so I am not going for accuracy, I just want any results (of any quality)!
Here is the background of my data:
1) Downloaded .sra file.
2) Used fastq-dump to convert the .sra file to a .fastq file.
3) Only kept 1 in 10 sequences from the .fastq file (it was too large).
4) On Galaxy: Ran the "Fastq Groomer" tool to remove the leading quality score for the adaptor base (input 'Color Space Sanger', default for the rest).
5) On Galaxy: Ran "Manipulation Reads" to convert to base-space "0123. -> ATCGN" and remove the adaptor itself. This generated a fastqcssanger file.
6) On Galaxy: Ran the "Map with BWA for SOLiD" tool on the fastqcssanger file and the mrna.fa file to get the SAM file.
I downloaded the mrna.fa file from: (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/)
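One thing that may be worth checking, given the pipeline above: htseq-count pairs alignments with features by sequence name, so if the RNAME column of the SAM file (here presumably mRNA accession IDs from mrna.fa) never matches the first column of the GTF (usually chromosome names such as chr1), every read would end up without a feature. The sketch below is only a diagnostic under that assumption; the function names are made up for illustration and the file names are the ones from the command above.

```python
def sam_reference_names(sam_lines):
    """Collect the RNAME values (column 3) from SAM alignment lines."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


def gtf_sequence_names(gtf_lines):
    """Collect the sequence names (column 1) from GTF lines."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():  # skip comments and blanks
            continue
        names.add(line.split("\t", 1)[0])
    return names


def shared_names(sam_lines, gtf_lines):
    """Names that appear both as SAM RNAME and as GTF sequence name."""
    return sam_reference_names(sam_lines) & gtf_sequence_names(gtf_lines)


def check_files(sam_path, gtf_path):
    """Hypothetical helper: run the comparison on two files on disk."""
    with open(sam_path) as sam, open(gtf_path) as gtf:
        return shared_names(sam, gtf)

# Example call (assumes both files are in the working directory):
# print(check_files("FourA.sam", "hg19.gtf"))
```

If the printed set is empty, the SAM file and the GTF do not refer to the same sequences, and changing -s or -a would not be expected to help.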
Any advice is greatly appreciated. I don't think I have time to rerun/remap. If there is any faster solution for where I went wrong in this pipeline, I would need to do that. I just need something to present, regardless of the quality, and will admit that to my class!
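A possible shortcut that avoids remapping, assuming the reads really were aligned to mRNA sequences: each RNAME in the SAM file would then already name a transcript, so tallying mapped reads per RNAME gives rough per-transcript counts without any GTF. This is a minimal sketch, not a substitute for htseq-count; the function name is hypothetical, and it ignores multi-mapping and strand.

```python
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Tally primary, mapped alignments per reference name (SAM column 3)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment line
            continue
        flag = int(fields[1])
        if flag & 0x4:  # read is flagged as unmapped
            continue
        if flag & 0x100:  # secondary alignment, skip to avoid double counting
            continue
        counts[fields[2]] += 1
    return counts

# Example use (assumes FourA.sam is in the working directory):
# with open("FourA.sam") as sam:
#     for name, n in count_reads_per_reference(sam).most_common(20):
#         print(name, n, sep="\t")
```

The output is keyed by whatever identifiers mrna.fa uses, so they would still need to be mapped to gene names for presentation.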
(Also, I am getting a warning, "Warning: Malformed SAM line: RNAME != '*' although flag bit &0x0004 set", although I read in other forums that this is usually ignorable.)
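To judge how much that warning matters, it may help to count how many alignment lines have the unmapped bit (0x4) set while still carrying a reference name. The sketch below does only that; the function name is hypothetical, and whether the affected reads are safe to ignore is an assumption that would depend on how many there are.

```python
def count_unmapped_with_rname(sam_lines):
    """Return (flagged, total): lines with flag bit 0x4 set and RNAME != '*',
    and the total number of alignment lines seen."""
    flagged = 0
    total = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:  # too short to hold FLAG and RNAME
            continue
        total += 1
        if int(fields[1]) & 0x4 and fields[2] != "*":
            flagged += 1
    return flagged, total

# Example use (assumes FourA.sam is in the working directory):
# with open("FourA.sam") as sam:
#     print(count_unmapped_with_rname(sam))
```

A small fraction of flagged lines would be consistent with the warning being harmless; a large fraction would suggest that most reads did not map at all.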