How to quickly look up a SAM record by read name?
Assume I have a read named "afaNma_1", and this read has a record in a SAM format file. I want to index the file so that I can get the SAM record of read "afaNma_1" quickly. Can anyone tell me how I should do this? Thanks |
I'm not aware of anything convenient for doing this, but someone else might be able to shed light.
If you are comfortable with programming, I'd sort the file by name, leaving the header records on top (samtools probably has something for this). Then write a program to pluck out your target record using a binary search algorithm (http://en.wikipedia.org/wiki/Binary_search_algorithm). Java has a RandomAccessFile class for quickly accessing arbitrary file bytes, though I'm sure other languages have their equivalents. The tricky part will be finding the start of the record that contains an arbitrary byte: you will have to work backwards until you find a newline or the start of the file. I know this isn't creating an index, but it should be lightning fast for practical purposes. |
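For what it's worth, the binary search idea above might look roughly like this in Python. This is only a sketch under some assumptions: the function name `find_records` and the helper `_line_start` are made up for illustration, the header lines (starting with `@`) are all at the top, and the records are sorted by read name in plain byte order. A name-sorting tool may use a different ordering than plain byte order, so that is worth checking before relying on this.

```python
import os


def _line_start(f, pos, floor):
    """Walk backwards from pos to the first byte of the line containing it."""
    while pos > floor:
        f.seek(pos - 1)
        if f.read(1) == b"\n":
            break
        pos -= 1
    return pos


def find_records(path, name):
    """Return all records whose read name equals `name`.

    Assumes header lines come first and the records are sorted by
    read name in plain byte order (an assumption of this sketch).
    """
    target = name.encode()
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Skip the header block; data_start is the offset of the first record.
        data_start = 0
        while True:
            line = f.readline()
            if not line.startswith(b"@"):
                break
            data_start = f.tell()

        # Binary search for the first record whose name is >= target.
        # lo and hi always sit on a line start (or at end of file).
        lo, hi = data_start, size
        while lo < hi:
            mid = (lo + hi) // 2
            start = _line_start(f, mid, data_start)
            f.seek(start)
            line = f.readline()
            if line.split(b"\t", 1)[0] < target:
                lo = start + len(line)  # target is after this line
            else:
                hi = start  # target is this line or earlier

        # Collect every consecutive record with the target name
        # (paired reads share a name, so there may be more than one).
        hits = []
        f.seek(lo)
        for line in iter(f.readline, b""):
            if line.split(b"\t", 1)[0] != target:
                break
            hits.append(line.rstrip(b"\n").decode())
        return hits
```

Calling `find_records("sorted.sam", "afaNma_1")` (the file name is a placeholder) would return a list of matching lines, or an empty list if the read is not there. Stepping back one byte at a time is not the fastest way to find a line start, but SAM lines are short enough that it should not matter much in practice.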
You can use this: http://github.com/brentp/bio-playgro...ter/fileindex/ if you have Python and tokyo-cabinet |
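If installing tokyo-cabinet is a problem, the general idea of a name-to-offset index can be sketched with only the Python standard library. To be clear, this is not the linked code: it is a rough illustration that swaps in `sqlite3` as the key-value store, and the names `build_index`, `fetch`, and the table `reads` are invented for the example. It does not need the SAM file to be sorted.

```python
import sqlite3


def _name_offsets(sam_path):
    """Yield (read name, byte offset of its line) for every non-header line."""
    offset = 0
    with open(sam_path, "rb") as f:
        for line in f:
            if not line.startswith(b"@"):
                yield line.split(b"\t", 1)[0].decode(), offset
            offset += len(line)


def build_index(sam_path, db_path):
    """Scan the SAM file once and store name -> offset in a SQLite file."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS reads (name TEXT, offset INTEGER)")
    con.execute("DELETE FROM reads")  # rebuild from scratch
    con.executemany("INSERT INTO reads VALUES (?, ?)", _name_offsets(sam_path))
    con.execute("CREATE INDEX IF NOT EXISTS reads_name ON reads (name)")
    con.commit()
    con.close()


def fetch(sam_path, db_path, name):
    """Look up the offsets for `name`, then seek straight to each record."""
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT offset FROM reads WHERE name = ? ORDER BY offset", (name,)
    ).fetchall()
    con.close()
    records = []
    with open(sam_path, "rb") as f:
        for (offset,) in rows:
            f.seek(offset)
            records.append(f.readline().rstrip(b"\n").decode())
    return records
```

You would run `build_index` once per SAM file and then call `fetch` as often as needed. The index has to be rebuilt whenever the SAM file changes, since the stored offsets would no longer point at line starts.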
I have some still experimental code for SAM/BAM with indexing by name for Biopython here: http://github.com/peterjc/biopython/...-sam-bam-index
|