HTseq
Hi all,
I have a BAM file from paired-end TopHat output and I want to use HTSeq to count the mapped reads. Because the data are paired-end and in BAM format, I ran the following steps, but I still get an error:
samtools sort accepted_hits.bam accepted_hits.sorted
samtools view accepted_hits.sorted.bam | htseq-count - /hg19/Annotation/Genes/genes.gtf > count.txt
This is the error:
Warning: Read ERR009097.6031922 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
I got the same error before sorting, which is why I sorted the file, but it still occurs. According to the FastQC analysis, the left and right FASTQ files contain the same number of sequences. Thanks for the help. |
You didn't actually convert it to .sam. You should have written
samtools view accepted_hits.bam accepted_hits.sam
and then sorted the accepted_hits.sam file. |
For the single-end dataset I used
samtools view accepted_hits.bam | htseq-count - /hg19/Annotation/Genes/genes.gtf > count.txt
which did not ask for sorting, and it worked. I tried your advice anyway, and there were more errors:
[bam_index_load] fail to load BAM index.
[main_samview] random alignment retrieval only works for indexed BAM files.
Then I indexed the file:
samtools index accepted_hits.bam
and then ran:
$ samtools view accepted_hits.bam.bai accepted_hits.sam
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "accepted_hits.bam.bai".
The BAM file I am using is the TopHat output, and I have not run into indexing problems like this with this kind of output before :( |
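A note on the errors above: `samtools view` treats every argument after the input file as a region to retrieve, not as an output file. That is why `accepted_hits.sam` triggered the index lookup, and why passing `accepted_hits.bam.bai` as the input failed (the index is not itself a BAM file). Here is a minimal sketch of a BAM-to-SAM conversion, assuming samtools is on the PATH. The function name `bam_to_sam` is made up for illustration.

```shell
# Hypothetical helper: convert a BAM file to SAM text.
# -h keeps the header lines; -o names the output file explicitly, so
# nothing after the input BAM can be mistaken for a region.
bam_to_sam() {
    in_bam=$1     # e.g. accepted_hits.bam
    out_sam=$2    # e.g. accepted_hits.sam
    samtools view -h -o "$out_sam" "$in_bam"
}

# Usage (not run here):
#   bam_to_sam accepted_hits.bam accepted_hits.sam
```

Plain shell redirection (`samtools view -h accepted_hits.bam > accepted_hits.sam`) should give the same result.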
@narges: the problem is that you sorted by coordinate rather than by read name, and htseq-count requires name-sorted input for paired-end reads.
Try instead: Code:
samtools sort -n accepted_hits.bam accepted_hits.sorted |
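Putting the two steps together, the name-sort followed by counting might look like the sketch below. It assumes the legacy samtools syntax used in this thread, where the last argument of `samtools sort` is an output prefix and `.bam` is appended to it. The function name `count_paired` and the `.namesorted` suffix are made up for illustration.

```shell
# Hypothetical helper: name-sort a paired-end BAM, then count reads.
# Assumes legacy samtools (0.1.x) "sort" syntax: <in.bam> <out.prefix>.
count_paired() {
    bam=$1                          # e.g. accepted_hits.bam from TopHat
    gtf=$2                          # e.g. genes.gtf
    out=$3                          # e.g. count.txt
    prefix=${bam%.bam}.namesorted   # sort writes "$prefix.bam"

    # -n sorts by read name, so the two mates of a pair end up adjacent.
    samtools sort -n "$bam" "$prefix" || return 1

    # Stream the sorted BAM as SAM text into htseq-count ("-" = stdin).
    samtools view "$prefix.bam" | htseq-count - "$gtf" > "$out"
}

# Usage (not run here):
#   count_paired accepted_hits.bam /hg19/Annotation/Genes/genes.gtf count.txt
```

With samtools 1.x the sort step takes an explicit output file instead of a prefix, for example `samtools sort -n -o accepted_hits.namesorted.bam accepted_hits.bam`.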