Hi there,
Apologies if I'm missing something obvious here...
In the SAM output of tophat (accepted_hits.sam) I found read pairs where the two reads mapped to different chromosomes (RNAME, 3rd column). However, the MRNM column (mate reference sequence name, 7th column) contains '=', which means the mate mapped to the same reference.
Here's an example:
Code:
EBRI093151_0001:8:100:268:457#NNNNNN 145 10 65535404 255 35M = 85418059 0 ATTTGCACTATTACACTTAAATTGTTATCCTTTTT _aaa`b^baaaa_b_baabaaababababbbabba NM:i:2
EBRI093151_0001:8:100:268:457#NNNNNN 97 5 85418059 255 35M = 65535404 0 ATTTTTACATCAGTCCCTTTAACACAAATCCATAT aabaabaabaaaa^a_a`a`^``a_a_aa_^]`aa NM:i:0
I discovered this while running HTSeq to count reads mapping to GTF features. I get the warning:
Code:
Warning: Incorrect 'proper_pair' flag value for read pair EBRI093151_0001:8:100:269:1653#NNNNNN
For info, this is the tophat command I used:
Code:
tophat \
    --output-dir /exports/.../20100122_RNAseq_CTRL/ \
    --mate-inner-dist 130 \
    --mate-std-dev 30 \
    --solexa1.3-quals \
    --GFF /exports/.../Sus_scrofa.Sscrofa9.56.gff3 \
    /exports/.../bowtie/current/indexes/Sscrofa9.56_plus_Human_ribosomal_DNA_complete_repeating_unit \
    /exports/.../RNAseq_CTRL_1_35bp.fastq \
    /exports/.../RNAseq_CTRL_2_35bp.fastq
Is this a bug in tophat (tophat-1.0.13)? Does anyone know whether such pairs are reliable?
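In case it helps anyone reproduce this, here is a minimal Python sketch (my own illustration, not part of tophat or HTSeq) that decodes the FLAG values of the example pair and checks whether the mate reference each record claims agrees with where its mate actually mapped. The helper names (decode_flag, parse_record, mate_reference, mates_consistent) are hypothetical. The sketch assumes whitespace-separated fields as pasted in this post, whereas a real SAM file is tab-delimited, and it keeps only the first nine columns of each record.

```python
# Sketch only: helper names are hypothetical, not part of tophat or HTSeq.
# FLAG bit meanings follow the SAM specification.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
]

# First nine columns of the two example records (SEQ, QUAL and tags omitted).
RECORDS = [
    "EBRI093151_0001:8:100:268:457#NNNNNN 145 10 65535404 255 35M = 85418059 0",
    "EBRI093151_0001:8:100:268:457#NNNNNN 97 5 85418059 255 35M = 65535404 0",
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]


def parse_record(line):
    """Pick out the columns needed for the mate check.

    Assumes whitespace-separated fields; real SAM is tab-delimited.
    """
    fields = line.split()
    return {
        "qname": fields[0],
        "flag": int(fields[1]),
        "rname": fields[2],
        "pos": int(fields[3]),
        "mrnm": fields[6],
        "mpos": int(fields[7]),
    }


def mate_reference(rec):
    """Resolve MRNM: '=' means the mate is on the same reference as this read."""
    return rec["rname"] if rec["mrnm"] == "=" else rec["mrnm"]


def mates_consistent(a, b):
    """True if each record's claimed mate reference matches its mate's RNAME."""
    return mate_reference(a) == b["rname"] and mate_reference(b) == a["rname"]


first, second = (parse_record(r) for r in RECORDS)
for rec in (first, second):
    print(rec["flag"], decode_flag(rec["flag"]), "mate on", mate_reference(rec))
print("consistent:", mates_consistent(first, second))
```

For this pair, flag 145 decodes to paired, reverse, second in pair, and flag 97 to paired, mate reverse, first in pair, so neither read has the proper-pair bit (0x2) set. The consistency check fails because both records say '=' while the reads sit on references 10 and 5.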
Many thanks
Dario