SEQanswers

05-27-2010, 12:57 AM   #1
aiden
Junior Member
 
Location: Melbourne, Australia

Join Date: May 2010
Posts: 3
SAM flag field and removing unmapped reads from BFAST output

Hi there,

I'm using BFAST to align Solexa reads to a very small portion of a genome (~3kb), and have been considering the best way to remove unmapped reads from the output, since these unnecessarily bulk up the output .sam file. I know that samtools view can filter an incoming .sam file using the -F option. However, I've read some documentation on the SAM flag format and must admit I find it pretty confusing. Within the flag field I know there are bits for both "the mate is unmapped" and "the query sequence itself is unmapped", but for non-paired-end Solexa reads, can either of these be used for removing unmapped reads? Furthermore, what would be the integer or string used with the -F option?

Alternatively, there is the option in samtools view to filter by map quality (MAPQ). Would setting map quality filter to e.g. 1 remove all unmapped reads without affecting the filtered alignment from BFAST postprocess?

Alternatively again, dbamfilter within the DNAA package has the capacity to remove unmapped reads, but if samtools can do the job I'd like to minimise the number of apps employed.

What are thoughts on the best strategy?
Aiden
05-27-2010, 01:57 AM   #2
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78

Hi Aiden,

I hope I'm not stating the obvious here, but are you familiar with Picard? It is a set of Java-based command-line tools for manipulating SAM files, and one of them may help you: ViewSam.jar. It basically prints a SAM or BAM file to the screen, but you can set an option to report all reads, just the aligned reads, or just the unaligned reads.
Take a look: http://picard.sourceforge.net/

Cheers,
Wil
05-27-2010, 07:04 AM   #3
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

Quote:
Originally Posted by aiden
Hi there,

I'm using BFAST to align Solexa reads to a very small portion of a genome
(~3kb), and have been considering the best way to remove unmapped reads from
the output since these unnecessarily bulk up the output .sam file. I know that
samtools can filter an incoming .sam file using the -F command. However, I've
read some documentation on the SAM flag format and must admit I find it pretty
confusing. Within the flag field I know there are fields for both "the mate is
unmapped" and "the query sequence itself is unmapped", but for non-paired-end
Solexa reads can either of these be used for removing unmapped reads?

Look at the SAM spec, section 2.2.2 (Notes):

Code:
1. Flag 0x02, 0x08, 0x20, 0x40 and 0x80 are only meaningful when flag 0x01 is present.
Assuming you are using fragment (single-end) data, you want to filter on the 0x0004 flag ("the query sequence itself is unmapped").
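
If the flag field still looks opaque, a minimal Python sketch of how the bits decode may help. The bit values are the ones listed in the SAM spec; the names `FLAG_BITS`, `PAIRED_ONLY` and `decode_flag` are made up for illustration and are not part of samtools.

```python
# Bit values as listed in the SAM spec; the descriptions are illustrative labels.
FLAG_BITS = {
    0x1: "read is paired",
    0x2: "mapped in a proper pair",
    0x4: "query sequence itself is unmapped",
    0x8: "mate is unmapped",
    0x10: "query on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first read in pair",
    0x80: "second read in pair",
}

# Bits that only carry meaning when 0x1 (paired) is set.
PAIRED_ONLY = 0x2 | 0x8 | 0x20 | 0x40 | 0x80


def decode_flag(flag):
    """Return descriptions of the meaningful bits set in a SAM FLAG value."""
    if not flag & 0x1:
        # Fragment (single-end) read: ignore the pair-only bits.
        flag &= ~PAIRED_ONLY
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


print(decode_flag(4))   # unmapped fragment read
print(decode_flag(16))  # mapped fragment read on the reverse strand
```

For single-end reads, then, only 0x4 says anything about whether the read itself mapped; the "mate is unmapped" bit (0x8) should not be relied on.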

Quote:
Furthermore, what would be the integer or string used in the -F command?
Code:
$ samtools view -F 4 ./foo.bam # display mapped reads only
$ samtools view -f 4 ./foo.bam # display unmapped reads only
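
As a rough illustration of what those two options test (a sketch of the filtering rule, not samtools' actual code): `-F INT` skips a record if any bit of INT is set in its flag, while `-f INT` keeps a record only if all bits of INT are set. The helper name `keep_record` and the sample records are hypothetical.

```python
def keep_record(flag, require=0, exclude=0):
    """Mimic the samtools view -f (require) and -F (exclude) flag tests."""
    return (flag & require) == require and (flag & exclude) == 0


# Hypothetical (read name, FLAG) pairs for single-end reads.
records = [("read1", 0), ("read2", 4), ("read3", 16)]

# Equivalent of -F 4: drop anything with the unmapped bit set.
mapped = [name for name, flag in records if keep_record(flag, exclude=4)]
# Equivalent of -f 4: keep only records with the unmapped bit set.
unmapped = [name for name, flag in records if keep_record(flag, require=4)]

print(mapped)    # ['read1', 'read3']
print(unmapped)  # ['read2']
```

If your input is a .sam file rather than a .bam, samtools releases of this era also need -S to say the input is SAM, and -h if you want to keep the header in the output.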
Quote:
Alternatively, there is the option in samtools view to filter by map quality
(MAPQ). Would setting map quality filter to e.g. 1 remove all unmapped reads
without affecting the filtered alignment from BFAST postprocess?
Go for samtools as suggested.
You need to run the postprocessing step before you can filter your reads anyway.
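
On the MAPQ idea, a small sketch of why a quality threshold is not the same test as the unmapped bit: `samtools view -q 1` skips every alignment with MAPQ below 1, so a read that is mapped but has MAPQ 0 would be dropped as well. The records below are invented for illustration; whether BFAST postprocess emits mapped reads with MAPQ 0 in your data is something to check rather than assume.

```python
# Invented (name, FLAG, MAPQ) tuples; not real BFAST output.
records = [
    ("mapped_unique", 0, 60),
    ("mapped_mapq0", 16, 0),
    ("unmapped", 4, 0),
]


def kept_by_flag(recs):
    """Keep records whose unmapped bit (0x4) is clear, like -F 4."""
    return [name for name, flag, mapq in recs if not flag & 0x4]


def kept_by_mapq(recs, min_mapq=1):
    """Keep records with MAPQ >= min_mapq, like -q 1."""
    return [name for name, flag, mapq in recs if mapq >= min_mapq]


print(kept_by_flag(records))  # ['mapped_unique', 'mapped_mapq0']
print(kept_by_mapq(records))  # ['mapped_unique']
```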

Quote:
Alternatively again, dbamfilter within the DNAA package has the capacity to
remove unmapped reads, but if samtools can do the job I'd like to minimise the number of apps employed.
samtools can do the job. But give dnaa a try too.
__________________
-drd
05-27-2010, 07:10 PM   #4
aiden
Junior Member
 
Location: Melbourne, Australia

Join Date: May 2010
Posts: 3

Thanks for the very helpful replies, much appreciated.