**luc** · 09-29-2021, 10:22 AM

Sorry, I have no idea what could be happening here. Have you checked the quality of both the forward and the reverse reads with FastQC, fastp, or a similar tool?

**rachpetersen** · 10-11-2021, 11:39 AM

Thanks for your response, luc!

I tried mapping the R1 and R2 files separately in single-end mode and, interestingly, got really high mapping percentages. For the same sample whose paired-end mapping output I gave in my original post, here is the bamtools stats output for the R1 and R2 files mapped separately:

R1 single end mapping
**********************************************
Stats for BAM file(s):
**********************************************

Total reads: 99555409
Mapped reads: 96735523 (97.1675%)
Forward strand: 51635582 (51.8662%)
Reverse strand: 47919827 (48.1338%)
Failed QC: 0 (0%)
Duplicates: 0 (0%)
Paired-end reads: 0 (0%)

R2 single end mapping
**********************************************
Stats for BAM file(s):
**********************************************

Total reads: 99786078
Mapped reads: 97634672 (97.844%)
Forward strand: 50328230 (50.4361%)
Reverse strand: 49457848 (49.5639%)
Failed QC: 0 (0%)
Duplicates: 0 (0%)
Paired-end reads: 0 (0%)

I'm a little confused by this, because when I check the R1 and R2 files, they contain the same number of sequences, yet in this output the two files show different numbers of "Total reads". I also tried sorting the sequences by name in each file before running the paired-end mapping again, but that didn't fix the low mapping rate or the excess of forward-strand over reverse-strand reads.

Any advice would be greatly appreciated! Thanks in advance for your help!

**HESmith** · 10-12-2021, 06:17 AM

Your reads have been processed in some manner (e.g., filtered by quality) so that R1 and R2 are no longer properly paired. For unprocessed reads, the number of R1 and R2 should be identical, and you should have zero singletons: the stats in your first post indicate otherwise. Sorting by name does not help because, as soon as the first singleton is encountered, all subsequent reads are out of register/mispaired.
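One way to confirm that the files are out of register is to walk both FASTQ files in parallel and report the first record whose names disagree. The following is a minimal Python sketch, not part of any aligner or of BBTools. It assumes plain four-line FASTQ records, read names that are identical between mates apart from an optional `/1` or `/2` suffix, and hypothetical function names (`read_names`, `first_mispaired`).

```python
import gzip
from itertools import zip_longest


def read_names(path):
    """Yield the read name of every 4-line FASTQ record in `path`."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:  # header line of each record
                # Drop the leading '@' and anything after the first whitespace.
                name = line[1:].split()[0]
                # Drop an old-style mate suffix (assumption: '/1' or '/2').
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                yield name


def first_mispaired(r1_path, r2_path):
    """Return (record_index, r1_name, r2_name) for the first record whose
    names differ, or None if the two files stay in register throughout.
    A name of None means that file ran out of records first."""
    pairs = zip_longest(read_names(r1_path), read_names(r2_path))
    for index, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return index, name1, name2
    return None
```

Calling `first_mispaired("sample_R1.fastq.gz", "sample_R2.fastq.gz")` (hypothetical file names) returns `None` for intact pairs; any other result gives the 1-based record at which the files first diverge.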

Apparently, your aligner constrains the R2 search space based on R1 alignment; since R1 and R2 are mispaired, it is unable to find a match for R2 in that space.

The solution is to fix pairing with BBTools Repair, then repeat the alignment. Report back whether or not that solved your problem.
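To illustrate what fixing the pairing means, here is a rough Python sketch of re-pairing by read name. It is not how BBTools (whose repair tool is the `repair.sh` script) implements it, and it is not a substitute for it at this scale: the sketch holds every R2 record in memory, which is impractical for about 100 million reads. The function names and the `/1`, `/2` suffix handling are assumptions made for the example.

```python
import gzip


def parse_fastq(path):
    """Yield each FASTQ record as a list of its four lines."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0]:  # end of file
                break
            yield record


def pair_key(header):
    """Read name without '@', trailing comment, or '/1' '/2' suffix."""
    name = header[1:].split()[0]
    return name[:-2] if name.endswith(("/1", "/2")) else name


def repair_pairs(r1_in, r2_in, r1_out, r2_out, singles_out):
    """Write mates with matching names to r1_out/r2_out in the same order,
    and reads without a mate to singles_out. Returns (pairs, singletons)."""
    # Assumption: R2 fits in memory; real tools avoid this limitation.
    mates = {pair_key(rec[0]): rec for rec in parse_fastq(r2_in)}
    n_pairs = n_singles = 0
    with open(r1_out, "w") as out1, open(r2_out, "w") as out2, \
            open(singles_out, "w") as outs:
        for rec in parse_fastq(r1_in):
            mate = mates.pop(pair_key(rec[0]), None)
            if mate is None:
                outs.writelines(rec)  # R1 read whose mate was filtered out
                n_singles += 1
            else:
                out1.writelines(rec)
                out2.writelines(mate)
                n_pairs += 1
        for rec in mates.values():  # R2 reads never claimed by an R1 read
            outs.writelines(rec)
            n_singles += 1
    return n_pairs, n_singles
```

After a step like this, the two output files contain the same reads in the same order, which is what a paired-end aligner expects.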

**luc** · 10-13-2021, 06:01 PM

I very much agree. Thanks HESmith!

Paired-end RNA-seq: large discrepancy in number of forward versus reverse reads
