SEQanswers

narges 09-06-2012 05:18 AM

HTseq
 
Hi all,
I have a BAM file from paired-end TopHat output, and I wanted to use HTSeq to count the mapped reads. Because the data are paired-end and in BAM format, I ran the following steps, but I still get an error:

samtools sort accepted_hits.bam accepted_hits.sorted

samtools view accepted_hits.sorted.bam | htseq-count - /hg19/Annotation/Genes/genes.gtf > count.txt

And this is the error:
Warning: Read ERR009097.6031922 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)

Actually, I got the same error before sorting, which is why I sorted the file, but the error is still there.
According to the FastQC analysis, the left and right FASTQ files contain the same number of sequences.

Thanks for the help

billstevens 09-06-2012 05:49 AM

You didn't actually convert it to .sam. You should have written

samtools view accepted_hits.bam accepted_hits.sam

And then sort the accepted_hits.sam file

narges 09-06-2012 07:02 AM

Quote:

Originally Posted by billstevens (Post 83316)
You didn't actually convert it to .sam. You should have written

samtools view accepted_hits.bam accepted_hits.sam

And then sort the accepted_hits.sam file

Thank you, but I have already used the same kind of command, like this:

samtools view accepted_hits.bam | htseq-count - /hg19/Annotation/Genes/genes.gtf > count.txt

for the single-end dataset, which did not require sorting, and it worked.
I tried your advice anyway, and this time there were more errors, like:

[bam_index_load] fail to load BAM index.
[main_samview] random alignment retrieval only works for indexed BAM files.
Then I indexed the file: samtools index accepted_hits.bam

And then:

$ samtools view accepted_hits.bam.bai accepted_hits.sam
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "accepted_hits.bam.bai".

The BAM file I am using is the TopHat output, and I have not run into this kind of indexing problem with such a file before :(

dpryan 09-06-2012 07:03 AM

@narges: the problem was that you sorted by coordinate rather than by read name; sorting by read name is required for paired-end reads.
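
For what it's worth, a quick way to see which order a file claims is the SO: tag on the @HD header line, which in the SAM format takes values such as coordinate or queryname. Below is a minimal sketch of that check. The name sort_order is a hypothetical helper, not a samtools or HTSeq command, and the tag is only as reliable as the tool that wrote the header, so treat the result as a hint rather than proof.

```shell
# sort_order: hypothetical helper, not part of samtools or HTSeq.
# Reads SAM header text on stdin and prints the value of the SO: tag
# from the @HD line, or "unknown" if no such tag is present.
sort_order() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
            }
        }
        END { if (!found) print "unknown" }
    '
}

# Demo on an inline header (made-up example data); prints: coordinate
printf '@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n' | sort_order

# With a real file (assumes samtools is installed):
#   samtools view -H accepted_hits.sorted.bam | sort_order
```

If this reports coordinate (or unknown) for a paired-end file, that points to the sorting step as the thing to change.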

Try instead:
Code:

samtools sort -n accepted_hits.bam accepted_hits.sorted
samtools view accepted_hits.sorted.bam | htseq-count - /hg19/Annotation/Genes/genes.gtf > count.txt
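
As a rough sanity check after a name sort, you can confirm that records with the same read name sit next to each other, which is what the "aligned mate which could not be found" warning is about. The sketch below is only an illustration: check_name_grouped is a hypothetical helper (not part of samtools or HTSeq), it keeps every read name in memory, and it only tests grouping, not the exact sort order, so it is best run on a subset of the file.

```shell
# check_name_grouped: hypothetical helper, not part of samtools or HTSeq.
# Reads SAM text on stdin and checks that all records sharing a read name
# (column 1, QNAME) form one contiguous run. Returns non-zero otherwise.
check_name_grouped() {
    awk -F '\t' '
        /^@/ { next }                    # skip header lines
        $1 != prev {                     # a new run of read names starts here
            if ($1 in seen) bad++        # name seen in an earlier run: mates are split
            seen[$1] = 1
            prev = $1
        }
        END {
            if (bad > 0) { print "not grouped by read name"; exit 1 }
            print "grouped by read name"
        }
    '
}

# Demo on made-up records where both mates of each pair are adjacent:
printf 'r1\t99\tchr1\t100\nr1\t147\tchr1\t300\nr2\t99\tchr1\t200\nr2\t147\tchr1\t400\n' \
    | check_name_grouped

# With a real file (assumes samtools is installed), on the first records only:
#   samtools view accepted_hits.sorted.bam | head -n 100000 | check_name_grouped
```

On a coordinate-sorted paired-end file, this would be expected to fail as soon as two mates map far enough apart for another read to land between them.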


narges 09-06-2012 07:46 AM

Quote:

Originally Posted by dpryan (Post 83328)
@narges: the problem was that you sorted by coordinate rather than by read name; sorting by read name is required for paired-end reads.

Try instead:
Code:

samtools sort -n accepted_hits.bam accepted_hits.sorted
samtools view accepted_hits.sorted.bam | htseq-count - /hg19/Annotation/Genes/genes.gtf > count.txt


Thank you so much, now it works.


All times are GMT -8.