-
I had a very similar problem which was very helpfully fixed using Trimmomatic and TrimGalore as detailed in this thread:
http://seqanswers.com/forums/showthread.php?t=19874
The author of TrimGalore was particularly accommodating in modifying the script to allow different trimming of R1 and R2.
Comment
-
Does discarding the size estimate affect anything with the read data, the quality, or any potential variant calls?
I am trying to determine if I should use the -A option for all of my data or if there is a way to dynamically determine that sampe will take forever and the -A option should be used.
Thanks.
Comment
-
Originally posted by rskr:
I have seen it when one read of a pair was quality filtered but the other was not. The missing read then gets replaced with whatever was next in the file, so the pair no longer matches.
1.1 1.2
2.1 2.2
3.1 3.2
4.1 5.2 <--4.2 was omitted, they are no longer in parity.
5.1 6.2
I have a question about using the -A option in the case above. If the reads are out of sync, as is the case between 4.1 and 5.2, bwa will not perform SW on the unmapped mate. What happens after that? Will 5.1 and 6.2 be thrown away as well because they do not match, and so on? I guess what I am asking is: is it dangerous to use -A and force bwa to throw away unmatched pairs? Are we losing important data by doing this? And does the mismatch carry on to all the reads after it?
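To make the last question concrete, here is a minimal sketch of the example above. It only models pairing by file position, using the read numbers from the quoted post. It says nothing about what bwa does internally with such pairs. The function names are made up for illustration.

```python
# Read numbers from the quoted example: 4.2 was removed by quality filtering,
# so the second file is one record short from that point on.
r1_ids = ["1", "2", "3", "4", "5"]
r2_ids = ["1", "2", "3", "5", "6"]


def positional_pairs(ids1, ids2):
    """Pair reads purely by their position in the two files."""
    return list(zip(ids1, ids2))


def count_mismatched(pairs):
    """Count pairs whose two members come from different fragments."""
    return sum(1 for a, b in pairs if a != b)


pairs = positional_pairs(r1_ids, r2_ids)
for a, b in pairs:
    status = "ok" if a == b else "MISMATCH"
    print(f"{a}.1  {b}.2  {status}")
```

With position-based pairing, a single dropped mate shifts every later record, so every pair after the gap is mismatched, not just the one where the read went missing.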
Comment
-
-A should really only be used if you know that your files are lined up right, and you know that the insert sizes won't properly match what bwa is expecting.
Fix your fastqs. You can pull out the singletons, align them separately, then combine the bams.
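A rough sketch of what "pull out the singletons" could look like, assuming plain 4-line FASTQ records and read names that differ between mates only by a trailing /1 or /2 (or by a comment after the first space). This is not the script from any particular tool, and all function names are hypothetical. It holds the whole second file in memory, so it will be RAM-hungry on large runs.

```python
import io


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield tuple(s.rstrip("\n") for s in (header, seq, plus, qual))


def read_key(header):
    """Mate-independent key: drop '@', any comment, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def split_pairs(r1_handle, r2_handle):
    """Return (paired R1, paired R2, singletons), with pairs in R1 file order."""
    r2_by_key = {read_key(rec[0]): rec for rec in read_fastq(r2_handle)}
    paired1, paired2, singles = [], [], []
    for rec in read_fastq(r1_handle):
        mate = r2_by_key.pop(read_key(rec[0]), None)
        if mate is None:
            singles.append(rec)  # R1 read whose mate was filtered out
        else:
            paired1.append(rec)
            paired2.append(mate)
    singles.extend(r2_by_key.values())  # R2 reads whose mate was filtered out
    return paired1, paired2, singles


def make_fastq(ids, mate):
    """Build a tiny FASTQ string for the demo below."""
    return "".join(f"@read{i}/{mate}\nACGT\n+\nIIII\n" for i in ids)


paired1, paired2, singles = split_pairs(
    io.StringIO(make_fastq([1, 2, 3, 4, 5], 1)),
    io.StringIO(make_fastq([1, 2, 3, 5, 6], 2)),
)
print([rec[0] for rec in singles])  # ['@read4/1', '@read6/2']
```

The two paired lists would then be written back out as synchronized fastqs for bwa sampe, the singletons written to a third file for bwa samse, and the resulting bams combined, for example with samtools merge.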
Comment
-
Originally posted by swbarnes2:
-A should really only be used if you know that your files are lined up right, and you know that the insert sizes won't properly match what bwa is expecting.
Fix your fastqs. You can pull out the singletons, align them separately, then combine the bams.
I am looking for more details on this but haven't found any yet. If anyone can confirm that only the true singletons are ignored, then I guess -A would be a good solution. In the meantime, I think swbarnes2's advice is the safest.
Last edited by dGho; 07-11-2013, 05:43 AM.
Comment
-
Answering my own question: if anyone else is looking for a way to remove singletons, check out this thread. I am trying this out now. azneto shared his script for making sure that two fastqs are in sync. It seems to use a whole lot of RAM, though.
http://seqanswers.com/forums/showthread.php?t=17974
Comment
-
I just wanted to confirm that azneto's script worked well. It removed singletons and ordered the two fastq files so the reads were synchronized. Running bwa sampe on the resulting fastqs produced no errors and had runtimes that fell within the expected range.
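If you want a quick sanity check before running bwa sampe, something like the sketch below can confirm that two fastqs are in sync without loading either file into memory. It is not azneto's script. It assumes 4-line FASTQ records and mates that differ only by a trailing /1 or /2 or a comment, and the function names are made up.

```python
import io
from itertools import zip_longest


def headers(handle):
    """Yield the header line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            yield line.rstrip("\n")


def read_key(header):
    """Mate-independent key: drop '@', any comment, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def first_desync(r1_handle, r2_handle):
    """Return the 1-based index of the first mismatched record, or None if in sync."""
    paired = zip_longest(headers(r1_handle), headers(r2_handle))
    for n, (h1, h2) in enumerate(paired, start=1):
        # A missing header means one file ran out of records before the other.
        if h1 is None or h2 is None or read_key(h1) != read_key(h2):
            return n
    return None


def make_fastq(ids, mate):
    """Build a tiny FASTQ string for the demo below."""
    return "".join(f"@read{i}/{mate}\nACGT\n+\nIIII\n" for i in ids)


in_sync = first_desync(
    io.StringIO(make_fastq([1, 2, 3, 5], 1)),
    io.StringIO(make_fastq([1, 2, 3, 5], 2)),
)
out_of_sync = first_desync(
    io.StringIO(make_fastq([1, 2, 3, 4, 5], 1)),
    io.StringIO(make_fastq([1, 2, 3, 5, 6], 2)),
)
print(in_sync, out_of_sync)  # None 4
```

On real data you would pass open file handles instead of the StringIO objects. The check streams both files once, so memory use stays flat regardless of file size.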
Comment