Hey!
We have to map some Illumina mate-pair reads, and we are having problems with the results we get from different mappers. The insert size of the mate-pair library is 2000 bp.
We have tried BWA, Ssaha2 and the brand new razerS3. So far we have results from Ssaha2 and BWA. We ran BWA on both the original (normal) reads and the reverse-complemented reads, in order to determine whether the mapper works correctly with mate pairs.
We are aware of the possible paired-end contamination. In total we have 80 million reads. With BWA, roughly 8 million reads (normal reads) and 7.7 million reads (reverse-complemented reads) could be identified as mate pairs.
Ssaha2, on the other hand, detects 48 million reads with the normal reads (the reverse-complement run has not finished yet).
Here are some quick statistics:
BWA
Genome RC vs Genome N
# N mapped reads: 8,262,260
# RC mapped reads: 7,765,615
# similar reads (name, diff of distances <= 5): 7,178,290 (92% of RC reads, 87% of N reads)

Ssaha vs BWA
Genome_ssaha vs Genome_bwa, normal reads
# mapped reads in BWA: 8,262,260
# mapped reads in Ssaha: 48,340,873
# similar reads (name, diff of distances <= 5): 7,402,903
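
The "similar reads" comparison above could be sketched roughly like this in Python. This is only an illustration, not the actual script: the input format (a dict from read name to distance between the two mates in bp), the function names and the toy data are all assumptions.

```python
def count_similar(pairs_a, pairs_b, tolerance=5):
    """Count reads found in both mappings whose mate distances agree.

    pairs_a, pairs_b: dict of read name -> distance between mates in bp
    (hypothetical input format, e.g. parsed from each mapper's output).
    A read counts as "similar" if its name occurs in both mappings and
    the two distances differ by at most `tolerance` bp.
    """
    shared_names = pairs_a.keys() & pairs_b.keys()
    return sum(
        1 for name in shared_names
        if abs(pairs_a[name] - pairs_b[name]) <= tolerance
    )


def percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Toy data with made-up read names and distances.
bwa_normal = {"read1": 2000, "read2": 1980, "read3": 2100}
bwa_revcomp = {"read1": 2003, "read2": 1900, "read4": 2050}

similar = count_similar(bwa_normal, bwa_revcomp)
print("similar reads:", similar)

# Percentages for the BWA numbers quoted above.
print("share of RC reads: %.0f%%" % percent(7178290, 7765615))
print("share of N reads:  %.0f%%" % percent(7178290, 8262260))
```

Matching on read name first and comparing distances second keeps the check independent of how each mapper orders its output.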
What do you guys think of the results? Can BWA work with mate-pair reads? How can we verify our results? Is the number of mate pairs mapped by Ssaha2 believable?
Thank you in advance