I have a paired-end SAM file generated by Novoalign. The two ends of each pair share the same read ID, but HTSeq reports all of them as "alignment_not_unique" and ignores them. Does anybody know how to fix this? I used the following command:
htseq-count --stranded=no --mode=intersection-nonempty -t exon -i gene_id inputFile.sam Homo_sapiens.GRCh37.67.chr.gtf > inputFile.htseq.count
The following are 4 example reads.
HWI-ST1189:59:C1305ACXX:5:1101:10000:108281 99 chrM 7910 3 51M = 8156 0 CGAGTACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCC CCCFFFFFHHHHHJJJJJIJJJJIIJJIJJJJJJJIIHFHHHHFFFFFFED PG:Z:novoalignMPI NH:i:2 HI:i:1 AM:i:70 NM:i:0 SM:i:70 GN:Z:585 TN:Z:ENST00000361739 ZN:i:2 PQ:i:6 UQ:i:6 AS:i:6 ZS:Z:R
HWI-ST1189:59:C1305ACXX:5:1101:10000:108281 147 chrM 8156 3 51M = 7910 0 GGTATACTACGGTCAATGCTCTGAAATCTGTGGAGCAAACCACAGTTTCAT IJJJHIJJIJJJJIJJJJJJJJJJJIIJJJJJJJJIIIHHHHHFFFFFCCC PG:Z:novoalignMPI NH:i:2 HI:i:1 AM:i:70 NM:i:0 SM:i:70 GN:Z:585 TN:Z:ENST00000361739 ZN:i:2 PQ:i:6 UQ:i:0 AS:i:0 ZS:Z:R
HWI-ST1189:59:C1305ACXX:5:1101:10000:12609 163 chr6 74229694 1 51M = 74229740 0 CCCGAATCTACGTGTCCAATGACGACAATGTTGATATGAGTCTTTTCCTTT CCCFFFFFHHHHHIIJJJJIJJJJJJJJIJIIJJJJJJJJHIJJJJJIJJF PG:Z:novoalignMPI NH:i:3 HI:i:1 AM:i:70 NM:i:0 SM:i:70 GN:Z:1151 TN:Z:ENST00000309268 ZN:i:3 PQ:i:9 UQ:i:0 AS:i:0 ZS:Z:R
HWI-ST1189:59:C1305ACXX:5:1101:10000:12609 83 chr6 74229740 1 40M943N11M = 74229694 0 CCTTTCCCATTTTGGCTTTTAGGGGTAGTTTTCACGACACCTGTGTTCTGG IJJJJJJJJJJJJJJJJJJJJJJJJJIJJJJJJIJIIHDHHHHFFFFFCCC PG:Z:novoalignMPI NH:i:3 HI:i:1 AM:i:70 NM:i:0 SM:i:70 GN:Z:1151 TN:Z:ENST00000309268 ZN:i:3 PQ:i:9 UQ:i:9 AS:i:9 ZS:Z:R XS:A:-
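All four example records carry an NH tag greater than 1 (NH:i:2 and NH:i:3). As far as I understand the htseq-count documentation, "alignment_not_unique" reads are recognized through the NH optional field, so it may help to check how NH is distributed across the whole file. Below is a rough sketch for that check, not anything HTSeq or Novoalign provides: the function names `nh_value` and `tally_nh` are made up for illustration, the records in the usage example are abbreviated stand-ins rather than real reads, and the code assumes no optional field contains whitespace.

```python
from collections import Counter

def nh_value(sam_line):
    """Return the NH:i value of one SAM record, or None if the tag is absent."""
    # Split on any whitespace so that both tab-delimited files and
    # space-separated pasted records work (assumes no tag value has spaces).
    fields = sam_line.split()
    # Optional fields start after the 11 mandatory SAM columns.
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            return int(tag[len("NH:i:"):])
    return None

def tally_nh(lines):
    """Count alignment records per NH value, skipping headers and blank lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        counts[nh_value(line)] += 1
    return counts

if __name__ == "__main__":
    # Abbreviated, made-up records that only mimic the layout of the examples.
    demo = [
        "@HD\tVN:1.0",
        "r1\t99\tchrM\t7910\t3\t4M\t=\t8156\t0\tACGT\tFFFF\tNH:i:2\tHI:i:1",
        "r1\t147\tchrM\t8156\t3\t4M\t=\t7910\t0\tACGT\tFFFF\tNH:i:2\tHI:i:1",
        "r2\t163\tchr6\t100\t1\t4M\t=\t150\t0\tACGT\tFFFF\tNH:i:3\tHI:i:1",
        "r3\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tFFFF",
    ]
    for nh, n in sorted(tally_nh(demo).items(), key=lambda kv: str(kv[0])):
        print(nh, n)
```

To run it on a real file, something like `tally_nh(open("inputFile.sam"))` should work. If every record turns out to have NH greater than 1, that would be consistent with all reads ending up in "alignment_not_unique".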
Thanks in advance!