SEQanswers

Old 03-02-2015, 02:12 PM   #1
capricy
Senior Member
 
Location: 63130

Join Date: Apr 2012
Posts: 125
htseq-count: reduced number of reads in the output SAM file

Hello, there,

My htseq-count version: HTSeq-0.5.4p1

My input is a name-sorted SAM file:
===
HWI-ST514:143982632:C37PRACXX:7:2316:9998:87373 73 A1.0_Cont232 361849 50 47M56N6M * 0 0 GTATTGACCGTTTGACCATGATCCTCACTGACAGTAACAATATCAAGGACGTT IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII MD:Z:53 XG:i:0 NH:i:1 NM:i:0 XM:i:0 XO:i:0 AS:i:0 XS:A:+
HWI-ST514:143982632:C37PRACXX:7:2316:9998:94380 73 A1.0_Cont5 87069 50 47M66N18M * 0 0 TCCACAGCATGGCTGGAGCTGTTCAAGCGAAGTCTTACAGCCCTTTGATGGGCTTGCTCTACGTC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII MD:Z:30C34 XG:i:0 NH:i:1 NM:i:1 XM:i:1 XO:i:0 AS:i:-6 XS:A:+
HWI-ST514:143982632:C37PRACXX:7:2316:9999:32498 153 A1.0_Cont103 405380 50 28M477N30M * 0 0 CAGAAAGCACAGTGTTGGCGTAGAGGTCCTTTCTGATATCAACGTCGCACTTCATGAT IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII MD:Z:58 XG:i:0 NH:i:1 NM:i:0 XM:i:0 XO:i:0 AS:i:0 XS:A:-
===

The command line is:

htseq-count -s no -a 10 -t coding_exon -i Parent -o nameSorted.accepted_hits.bam.sam.htseq.exon.sam nameSorted.accepted_hits.bam.sam a.gff

Then I counted the reads in both nameSorted.accepted_hits.bam.sam and nameSorted.accepted_hits.bam.sam.htseq.exon.sam, grouped by column 5, and found that they did not tally:

grep "HWI" nameSorted.accepted_hits.bam.sam.htseq.exon.sam|cut -f5|sort|uniq -c
24579 0
165936 1
344984 3
2533416 50
grep "HWI" nameSorted.accepted_hits.bam.sam|cut -f5|sort|uniq -c
24579 0
165936 1
344984 3
2605593 50
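
For anyone who wants to repeat the tally without grep, here is a minimal Python sketch of the same count. It assumes tab-delimited SAM text with the mapping quality in column 5; the function names mapq_counts and mapq_difference are made up for illustration. Unlike the grep on "HWI", it counts every non-header line.

```python
from collections import Counter


def mapq_counts(lines):
    """Tally column 5 (MAPQ) over SAM alignment lines, skipping '@' header lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM alignment line has 11 mandatory fields
            continue
        counts[int(fields[4])] += 1
    return counts


def mapq_difference(input_lines, output_lines):
    """Return {MAPQ: input count minus output count} for values whose counts differ."""
    before = mapq_counts(input_lines)
    after = mapq_counts(output_lines)
    return {
        q: before[q] - after[q]
        for q in sorted(set(before) | set(after))
        if before[q] != after[q]
    }
```

Called with the two open file handles (input SAM first, htseq-count output second), mapq_difference should report only the MAPQ values whose counts disagree.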

It looks like the new SAM file, nameSorted.accepted_hits.bam.sam.htseq.exon.sam, lost 72,177 reads (2,605,593 vs. 2,533,416), all with a mapping quality (MAPQ, column 5) of 50.
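
To see which records went missing, something like the following Python sketch could be used. It assumes that the file written with -o keeps the first four SAM fields (read name, flag, reference, position) unchanged and only appends an extra tag to each line, which I have not verified for this HTSeq version; record_key and missing_records are hypothetical helper names.

```python
from collections import Counter


def record_key(line):
    """Identify an alignment by (read name, flag, reference name, position)."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], int(fields[1]), fields[2], int(fields[3]))


def missing_records(input_lines, output_lines):
    """Return a Counter of alignment keys that occur more often in the input than in the output."""

    def keys(lines):
        return Counter(
            record_key(line)
            for line in lines
            if line.strip()
            and not line.startswith("@")  # skip SAM header lines
            and len(line.split("\t")) >= 11  # skip anything that is not an alignment line
        )

    # Counter subtraction keeps only the keys with a positive remainder
    return keys(input_lines) - keys(output_lines)
```

Looking at the flags and CIGAR strings of the records it returns might show what the dropped reads have in common.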

Could anyone tell me why?

Thank you very much for your time!