SEQanswers

07-08-2010, 04:06 AM   #1
genelab
Member
 
Location: honston

Join Date: Nov 2009
Posts: 27
How to quickly index SAM records by read name?

Assume I have a read named "afaNma_1", and this read has a record in a SAM format file.

I want to index the file and retrieve the SAM record for read "afaNma_1" quickly. Can anyone tell me how I should do this?


Thanks
07-08-2010, 10:28 AM   #2
Bio.X2Y
Member
 
Location: Europe

Join Date: Apr 2010
Posts: 46

I'm not aware of anything convenient for doing this, but someone else might be able to shed light.

If you are comfortable with programming, I'd sort the file by name, leaving the header records on top (samtools probably has something for this). Then write a program to pluck out your target record using a binary search algorithm (http://en.wikipedia.org/wiki/Binary_search_algorithm). Java has a RandomAccessFile class for quickly accessing arbitrary file bytes, though I'm sure other languages have their equivalents. The tricky part will be finding the start of the record that contains an arbitrary byte: you will have to work backwards to find a newline or the start-of-file.

I know this isn't creating an index, but it should be lightning fast for practical purposes.
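
For illustration, here is a minimal Python sketch of the binary search described above. It rests on some assumptions that are not from the thread: the file is an uncompressed text SAM, the alignment lines are sorted by read name in plain byte order (for example with `LC_ALL=C sort`), and the function names are made up for this example. As far as I know, `samtools sort -n` uses a "natural" ordering for names rather than plain byte order, so its output may not satisfy this assumption. Instead of scanning backwards for a newline, the sketch seeks to the byte just before the probe position and reads forward to the next line start, which gives the same result.

```python
import os


def qname(line):
    """Return the read name (first tab-separated field) of a raw SAM line."""
    return line.split(b"\t", 1)[0].rstrip(b"\r\n")


def next_line_start(fh, pos, body_start):
    """Return the offset of the first line that starts at or after byte pos."""
    if pos <= body_start:
        return body_start
    fh.seek(pos - 1)
    fh.readline()  # consume up to and including the next newline
    return fh.tell()


def find_by_name(path, target):
    """Return every alignment line whose read name equals target.

    Assumes alignment lines are sorted by read name in plain byte order.
    """
    target = target.encode()
    with open(path, "rb") as fh:
        # Header lines start with '@'; remember where the alignments begin.
        body_start = 0
        while True:
            line = fh.readline()
            if not line.startswith(b"@"):
                break
            body_start = fh.tell()

        # Binary search over byte offsets for the first line with name >= target.
        lo, hi = body_start, os.fstat(fh.fileno()).st_size
        while lo < hi:
            mid = (lo + hi) // 2
            pos = next_line_start(fh, mid, body_start)
            fh.seek(pos)
            line = fh.readline()
            if not line or qname(line) >= target:
                hi = mid
            else:
                lo = pos + len(line)

        # Collect the run of matching lines (paired reads share one name).
        fh.seek(next_line_start(fh, lo, body_start))
        hits = []
        for line in iter(fh.readline, b""):
            if qname(line) != target:
                break
            hits.append(line.decode().rstrip("\r\n"))
        return hits
```

Calling `find_by_name("sorted.sam", "afaNma_1")` (the file name is a placeholder) would return a list of matching lines, or an empty list if the read is absent. Each probe costs one seek and one or two short reads, so the number of probes grows with the logarithm of the file size.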
07-09-2010, 07:17 AM   #3
brentp
Member
 
Location: salt lake city, UT

Join Date: Apr 2010
Posts: 72

You can use this:
http://github.com/brentp/bio-playgro...ter/fileindex/
if you have Python and Tokyo Cabinet.
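
For readers without Tokyo Cabinet, the general idea of a name-to-offset file index can be sketched with Python's standard library alone. This is not the code at the link above; it swaps in `sqlite3` as the key-value store, and the function names, table name, and file names are all made up for this example. It assumes an uncompressed text SAM file.

```python
import sqlite3


def build_index(sam_path, index_path):
    """Store the byte offset of every alignment line, keyed by read name."""
    con = sqlite3.connect(index_path)
    con.execute("CREATE TABLE IF NOT EXISTS reads (name TEXT, offset INTEGER)")
    con.execute("DELETE FROM reads")  # rebuild from scratch on each call
    offset = 0
    with open(sam_path, "rb") as fh:
        for line in fh:
            # Skip header lines ('@' prefix) and blank lines.
            if line.strip() and not line.startswith(b"@"):
                name = line.split(b"\t", 1)[0].rstrip(b"\r\n").decode()
                con.execute("INSERT INTO reads VALUES (?, ?)", (name, offset))
            offset += len(line)
    con.execute("CREATE INDEX IF NOT EXISTS reads_name ON reads (name)")
    con.commit()
    con.close()


def fetch(sam_path, index_path, name):
    """Return all SAM lines for a read name by seeking to stored offsets."""
    con = sqlite3.connect(index_path)
    rows = con.execute(
        "SELECT offset FROM reads WHERE name = ? ORDER BY offset", (name,)
    ).fetchall()
    con.close()
    records = []
    with open(sam_path, "rb") as fh:
        for (offset,) in rows:
            fh.seek(offset)
            records.append(fh.readline().decode().rstrip("\r\n"))
    return records
```

The index is built once with something like `build_index("reads.sam", "reads.idx")`, after which `fetch("reads.sam", "reads.idx", "afaNma_1")` only touches the index and the matching lines. Unlike the binary search approach, this does not need the SAM file to be sorted, at the cost of an extra index file that has to be rebuilt whenever the SAM file changes.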
07-13-2010, 03:46 AM   #4
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

I have some still-experimental code for SAM/BAM with indexing by name for Biopython here: http://github.com/peterjc/biopython/...-sam-bam-index
All times are GMT -8.