Hello,
this is a problem I have been dealing with now for a long time. I try to map SOLID PE reads using TopHat v 1.1.4 using this command:
All 4 input files are sorted and the reads are named like this:
I have also tried using identical names for the reads (without /1 and /2).
Everything seems to work ok:
The problem arises when I look at the output: The reads are not mapped in pairs, but in single reads of 50 and 25 bp each.
samtools view reveals:
All flags are either 16 or 0, which cannot be true for a paired read. Do you have any idea what is going on?
All the best,
Seb
this is a problem I have been dealing with now for a long time. I try to map SOLID PE reads using TopHat v 1.1.4 using this command:
Code:
tophat --color --quals -r 155 -p 4 -o /data/genetics/ -G /home/schaefer/tophat/Rattus_norvegicus.RGSC3.4.56.tophat.gtf rn4_c /data/genetics/1sort_1.csfasta,/data/genetics/1sort_2.csfasta /data/genetics/analyses/1sort_1.qual,/data/genetics/1sort_2.qual
Code:
1sort_1.csfasta >1000_1000_1070/1 T20010011233323011122110001120331100111211132111111 >1000_1000_1179/1 T31033102020100001322200230302001331230303321323200 >1000_1000_1947/1 T03313223130010310121222000210123010111002233323320 1sort_2.csfasta >1000_1000_1070/2 G1011001131233111011133111 >1000_1000_1179/2 G2222132322022032303223131 >1000_1000_1947/2 G0100011002000120101320013 1sort_1.qual >1000_1000_1070/1 32 24 32 29 22 6 11 29 10 25 14 20 31 20 14 33 32 26 6 5 6 6 32 22 8 33 10 28 21 14 8 8 23 25 14 22 19 26 10 8 14 17 23 23 32 4 32 24 8 32 >1000_1000_1179/1 31 33 32 31 33 27 33 33 31 33 33 27 32 32 33 33 32 33 31 32 32 31 33 28 32 33 27 30 31 32 33 22 31 32 29 26 31 22 29 24 32 33 20 16 33 25 33 12 31 29 >1000_1000_1947/1 33 33 33 30 33 33 28 33 31 33 26 32 26 24 33 29 33 31 20 8 33 33 30 33 29 31 32 33 29 31 29 32 24 14 17 33 30 28 27 27 33 31 28 33 33 30 31 8 24 12 1sort_2.qual >1000_1000_1070/2 32 25 33 33 8 31 11 10 6 32 33 33 6 8 8 32 28 22 30 20 32 8 12 10 14 >1000_1000_1179/2 30 29 30 33 25 16 29 33 33 26 31 30 23 29 30 25 32 31 25 10 27 30 24 31 29 >1000_1000_1947/2 33 33 32 27 32 33 33 23 24 32 33 33 19 16 30 31 27 22 14 31 19 17 20 6 29 >1000_1000_254/2 15 16 18 25 31 30 8 31 8 4 10 20 28 32 31 30 4 31 31 12 10 4 17 19 9 >1000_1000_358/2 5 32 12 11 7 8 18 9 7 8 4 27 4 6 4 4 4 4 5 6 4 21 4 4 4
Everything seems to work ok:
Code:
[Mon Dec 13 11:52:28 2010] Beginning TopHat run (v1.1.4) ----------------------------------------------- [Mon Dec 13 11:52:29 2010] Preparing output location /data/genetics// [Mon Dec 13 11:52:29 2010] Checking for Bowtie index files [Mon Dec 13 11:52:29 2010] Checking for reference FASTA file [Mon Dec 13 11:52:29 2010] Checking for Bowtie Bowtie version: 0.12.3.0 [Mon Dec 13 11:52:29 2010] Checking for Samtools Samtools version: 0.1.8.0 [Mon Dec 13 11:52:36 2010] Checking reads min read length: 25bp, max read length: 50bp format: fasta [Mon Dec 13 11:58:39 2010] Reading known junctions from GTF file [Mon Dec 13 12:13:59 2010] Mapping reads against rn4_c with Bowtie [Mon Dec 13 15:08:47 2010] Joining segment hits [Mon Dec 13 15:35:40 2010] Mapping reads against rn4_c with Bowtie(1/2) [Mon Dec 13 17:16:18 2010] Mapping reads against rn4_c with Bowtie(2/2) [Mon Dec 13 18:29:19 2010] Searching for junctions via segment mapping [Mon Dec 13 20:05:16 2010] Retrieving sequences for splices [Mon Dec 13 20:07:55 2010] Indexing splices Warning: Encountered reference sequence with only gaps Warning: Encountered reference sequence with only gaps [Mon Dec 13 20:23:43 2010] Mapping reads against segment_juncs with Bowtie [Mon Dec 13 21:20:19 2010] Mapping reads against segment_juncs with Bowtie [Mon Dec 13 22:10:15 2010] Joining segment hits [Mon Dec 13 22:19:17 2010] Reporting output tracks ----------------------------------------------- Run complete [10:55:26 elapsed]
samtools view reveals:
Code:
33_119_59 16 chr1 226 1 25M * 0 0 GATTTGAGTGCAGAGAGCTGTGTGA !!#+))))!!)))))))))****FF NM:i:0 NH:i:3 CC:Z:= CP:i:70416 478_1485_267 16 chr1 3060 0 25M * 0 0 AGACATGGAGAAAGAACACTCCTCC //6.(BI:<;07A<51!.JNJ:::: NM:i:0 NH:i:5 CC:Z:= CP:i:73804 980_603_1244 16 chr1 3884 0 50M * 0 0 TGTATATGTATATATATATGCATATATATATATATATATATACATACGTA !!!+/2&!!)-/+++))*"&&!)))**),41**)!!))!!++)))))))) NM:i:0 NH:i:16 CC:Z:= CP:i:61895914
All the best,
Seb
Comment