SEQanswers

Old 01-01-2011, 10:33 AM   #1
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64
How to extract unique hits from a SAM file

I aligned reads with bwa and I want to get the set of reads that mapped uniquely to the genome.
I understand from the samtools FAQ that they suggest looking for 'reliable' rather than 'unique' alignments, using:

samtools view -bq 1 aln.bam > aln-reliable.bam

http://sourceforge.net/apps/mediawik...?title=SAM_FAQ

However, I am interested in getting the subset of uniquely mapped reads, in order to do some calculations on it.

How can one do it?
Old 01-02-2011, 04:39 AM   #2
drio
Senior Member
 
Location: 4117'49"N / 24'42"E

Join Date: Oct 2008
Posts: 323

Take a look at this thread.

I'd suggest you follow bwa's FAQ advice and rely more on the MAPQ. Still, if you want to
filter for uniquely mapped reads:

Code:
$ samtools view bwa.bam | grep "XT:A:U"
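
A plain grep matches the string anywhere on the line and drops the header. A slightly stricter variant could look like the sketch below. It assumes tab-separated SAM text on stdin and an aligner that writes bwa's XT:A:U tag; the function name filter_xt_unique is made up for illustration.

```shell
# Hypothetical helper: keep SAM header lines plus alignments whose
# optional fields (column 12 onward) contain the exact tag XT:A:U.
filter_xt_unique() {
  awk -F'\t' '
    /^@/ { print; next }                  # pass header lines through
    {
      for (i = 12; i <= NF; i++)          # only look at optional tags
        if ($i == "XT:A:U") { print; next }
    }'
}

# Usage (assumes samtools is installed and bwa.bam exists):
#   samtools view -h bwa.bam | filter_xt_unique > bwa.unique.sam
```

Keeping the header (samtools view -h) means the result can still be converted back to BAM afterwards.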
__________________
-drd
Old 01-02-2011, 10:13 AM   #3
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64

Thanks a lot!
Old 01-23-2011, 08:34 PM   #4
seq_GA
Senior Member
 
Location: Asiana

Join Date: Feb 2009
Posts: 124

Hi,
I am also looking for such a solution. But in my BAM file, I don't see any
Code:
XT:A:U
Instead, it is a Solexa single-read run that I converted from export to SAM format, as below.



Code:
test_1:8:69:19633:9434       0       chr10   5423197 4       45M     *       0       0       CACACAACCCCCACACCAAACACACACCCCCCACACACAACAAAC      0.2B90+)*=@8@################################   XD:Z:6C11C15G4CACT2     SM:i:4
test_1:8:56:11474:20981      0       chr10   7323903 6       45M     *       0       0       ATCAAGCGATCCTCCCACCTCATCCCCCTAAGTACCTGTGACTAA      757@54;3;1@@@@@22@###########################   XD:Z:22G11G3G5C SM:i:6
Is there any generic way of picking unique hits from a SAM file? Thanks.
Old 01-23-2011, 09:48 PM   #5
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64

I am not an expert, but I don't think you can extract this from the _export file. I think XT:A:U is specific to bwa (maybe other aligners write it too?).
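
To check which optional tags an aligner actually wrote, something like the following might help. It is only a rough sketch: count_sam_tags is a made-up name, and it assumes plain tab-separated SAM text on stdin.

```shell
# Hypothetical helper: count how often each optional tag (e.g. XT, XD, SM)
# occurs in SAM text read from stdin, ignoring header lines.
count_sam_tags() {
  awk -F'\t' '
    /^@/ { next }                                        # skip header lines
    { for (i = 12; i <= NF; i++) n[substr($i, 1, 2)]++ } # tag name = first 2 chars
    END { for (t in n) print t "\t" n[t] }' | sort
}

# Usage (assumes samtools is installed and aln.bam exists):
#   samtools view aln.bam | head -n 10000 | count_sam_tags
```

If XT never shows up in the output, filtering on XT:A:U cannot work for that file.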

Maybe you would like to use the _sorted file from the Illumina pipeline.
The description of the _sorted file, from the CASAVA 1.7 manual, p. 73:
"This output file is similar to s_N_export.txt, except it contains
only entries for reads which pass purity filtering and have a
unique alignment in the reference. These are sorted by order
of their alignment position, which is meant to facilitate the
extraction of ranges of reads for purposes of visualization or
SNP calling.
These files are only produced if the flag WITH_SORTED is
used."

Alternatively, you can take your reads and align them with bwa (or with another aligner that gives you this info).
Old 01-25-2014, 03:55 AM   #6
emp
Member
 
Location: india

Join Date: Jan 2014
Posts: 11

Hello all,
I have Illumina data, which I mapped with Bowtie2 to get a SAM file.

Now I need to extract the reads that align uniquely, i.e. exactly one time.

Kindly advise whether something can be done for that.

I have tried a few commands for this, but failed to get those reads separated.

Kindly guide ASAP.
Old 01-25-2014, 06:06 PM   #7
Wallysb01
Senior Member
 
Location: San Francisco, CA

Join Date: Feb 2011
Posts: 286

Quote:
Originally Posted by emp
Hello all,
I have Illumina data, which I mapped with Bowtie2 to get a SAM file.

Now I need to extract the reads that align uniquely, i.e. exactly one time.

Kindly advise whether something can be done for that.

I have tried a few commands for this, but failed to get those reads separated.

Kindly guide ASAP.

How about just using bowtie with the -m 1 option set?
Old 01-26-2014, 10:13 PM   #8
emp
Member
 
Location: india

Join Date: Jan 2014
Posts: 11

Hello Wallysb01,

After searching a lot for a way to get uniquely matched reads out of Bowtie2, I think bowtie 1 is the only way out.

Thank you for the suggestion.
Old 01-27-2014, 12:49 AM   #9
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Unique alignments in bowtie2 have MAPQ>=2, so you can just filter the results by that.
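
As a rough sketch of that filter on SAM text, assuming tab-separated records on stdin with MAPQ in column 5 (filter_min_mapq is a made-up name, and the default threshold of 2 simply follows the rule of thumb above):

```shell
# Hypothetical helper: keep header lines and alignments whose MAPQ
# (column 5) is at or above a threshold (first argument, default 2).
filter_min_mapq() {
  awk -F'\t' -v min="${1:-2}" '
    /^@/ { print; next }                  # pass header lines through
    $5 + 0 >= min + 0 { print }           # numeric comparison on MAPQ
  '
}

# Usage (assumes samtools is installed and aln.bam exists):
#   samtools view -h aln.bam | filter_min_mapq 2 > aln.mapq2.sam
# The same threshold with samtools alone:
#   samtools view -b -q 2 aln.bam > aln.mapq2.bam
```

The samtools -q form is the simpler choice when BAM output is wanted; the awk version is mainly useful when the data is already in a text pipeline.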
Old 07-07-2019, 07:24 AM   #10
brojee
Member
 
Location: Bhopal

Join Date: Jul 2019
Posts: 19

Maybe you would like to use the _sorted file from the Illumina pipeline.

The description of the _sorted file, from the CASAVA 1.7 manual, p. 73:

"This output file is similar to s_N_export.txt, except it contains only entries for reads which pass purity filtering and have a unique alignment in the reference. These are sorted by order of their alignment position, which is meant to facilitate the extraction of ranges of reads for purposes of visualization or SNP calling.

These files are only produced if the flag WITH_SORTED is used."

Alternatively, you can take your reads and align them with bwa (or with another aligner that gives you this info).