#1
Member
Location: Colorado Join Date: Feb 2013
Posts: 12

We have the FASTA files for the hg19 genome (obviously), and we used them to create a big Bowtie index.
I was hoping not to have to keep the FASTA files, and instead to look up sequences in the Bowtie index when I get chromosome locations. I know that when an alignment comes back, it tells me where the alignment occurs and which FASTA record (header) it came from. So all the info is there, but I can't figure out how to pull out a sequence given a location. Does anyone know if this is possible, or know much about the index format (perhaps I could write a little program to fish out a sequence)? Thanks
#2
Member
Location: US Join Date: Sep 2012
Posts: 91

Code:
    bowtie2-inspect
    No index name given!
    Bowtie 2 version 2.1.0 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
    Usage: bowtie2-inspect [options]* <bt2_base>
      <bt2_base>         bt2 filename minus trailing .1.bt2/.2.bt2

      By default, prints FASTA records of the indexed nucleotide sequences to
      standard out.  With -n, just prints names.  With -s, just prints a summary of
      the index parameters and sequences.  With -e, preserves colors if applicable.

    Options:
      -a/--across <int>  Number of characters across in FASTA output (default: 60)
      -n/--names         Print reference sequence names only
      -s/--summary       Print summary incl. ref names, lengths, index properties
      -e/--bt2-ref       Reconstruct reference from .bt2 (slow, preserves colors)
      -v/--verbose       Verbose output (for debugging)
      -h/--help          print detailed description of tool and its options
      --help             print this usage message
#3
Member
Location: Colorado Join Date: Feb 2013
Posts: 12

Just to clarify, I mean using the index: giving it a chromosome name (FASTA header) and location numbers, and getting back a sequence.
I don't want to run an alignment, just pull out the sequence, so no SAM output. For this I'm using bowtie, not bowtie2. But if bowtie2 can do this... Thanks
#4
Member
Location: Colorado Join Date: Feb 2013
Posts: 12

The bowtie-inspect thing does get all the info out, but that's 3 GB of info, since I can't select a location.
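One way around dumping the whole 3 GB is to scan the FASTA text as a stream and keep only the bases you asked for. Below is a minimal Python sketch of that idea. The function name `extract_region` and the 0-based, half-open coordinates are assumptions made for illustration, and the demo reads from an in-memory string rather than from real bowtie-inspect output.

```python
import io

def extract_region(lines, name, start, end):
    """Return bases [start, end) (0-based, half-open) of record `name`
    from an iterable of FASTA lines, without storing whole records."""
    pieces = []
    in_record = False
    pos = 0  # bases of the target record seen so far
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if in_record:
                break  # the target record has ended
            # Match on the first word of the header only (an assumption).
            header = line[1:].split()
            in_record = bool(header) and header[0] == name
            continue
        if not in_record:
            continue
        line_start, line_end = pos, pos + len(line)
        pos = line_end
        if line_end <= start:
            continue  # this line lies entirely before the region
        lo = max(start, line_start) - line_start
        hi = min(end, line_end) - line_start
        pieces.append(line[lo:hi])
        if line_end >= end:
            break  # region complete, stop reading
    return "".join(pieces)

# Toy input standing in for a FASTA stream.
fasta = io.StringIO(">chr1 demo\nACGTAC\nGTTTGA\n>chr2\nGGGGCC\n")
print(extract_region(fasta, "chr1", 4, 9))
```

This still reads every line up to the requested record, so it saves memory and disk space but not scan time. Passing `sys.stdin` as `lines` would let it sit at the end of a pipe.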
#5
Senior Member
Location: Boston Join Date: Feb 2008
Posts: 693

Although the Bowtie index essentially keeps the genome, I doubt it is optimized or designed for your purpose. Use faidx if you only want to retrieve a few regions.
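For anyone wondering why faidx-style lookup is cheap: a .fai index stores, per sequence, its length, the byte offset of its first base, the bases per line, and the bytes per line, so the byte position of any base is simple arithmetic. The Python sketch below illustrates that arithmetic. It is not the samtools implementation; `build_index` and `fetch` are hypothetical names, and coordinates are assumed to be 0-based and half-open.

```python
import os
import tempfile

def build_index(path):
    """Build a .fai-style index: name -> (length, offset, linebases, linewidth).
    Assumes all sequence lines of a record, except the last, have equal length."""
    index = {}
    name = None
    with open(path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                index[name] = [0, fh.tell(), 0, 0]  # offset = first byte after header
            elif name is not None:
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:
                    entry[2], entry[3] = bases, len(line)
                entry[0] += bases
    return {k: tuple(v) for k, v in index.items()}

def fetch(path, index, name, start, end):
    """Return bases [start, end) by seeking straight to the right bytes."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Byte position of base i: offset + (i // linebases) * linewidth + i % linebases
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + ((end - 1) // linebases) * linewidth + (end - 1) % linebases
    with open(path, "rb") as fh:
        fh.seek(first)
        raw = fh.read(last - first + 1)
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()

# Toy FASTA file for demonstration.
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(b">chr1\nACGTAC\nGTTTGA\n>chr2\nGGGGCC\n")
demo_index = build_index(tmp.name)
print(fetch(tmp.name, demo_index, "chr1", 4, 9))
os.remove(tmp.name)
```

Each query costs one seek and one short read, which is why this works well for a handful of regions.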
#6
Member
Location: Colorado Join Date: Feb 2013
Posts: 12

I want to retrieve lots of regions efficiently, but thanks for pointing me to faidx. I'll see how it works.
#7
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480

If you really have a LOT of positions, then it's best to read the genome into memory. samtools faidx is great for a smallish number of sites, but it grabs the sequence from disk, making it a bit slow for a large number of queries.
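A minimal Python sketch of the read-it-into-memory approach: load every record once into a dict, after which each lookup is a plain string slice. The name `load_fasta`, the use of the first header word as the key, and the 0-based, half-open coordinates are assumptions for illustration. The demo uses a tiny in-memory FASTA rather than hg19.

```python
import io

def load_fasta(lines):
    """Read FASTA lines into a dict: first word of header -> sequence string."""
    chunks = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    # Join once per record instead of concatenating line by line.
    return {k: "".join(v) for k, v in chunks.items()}

genome = load_fasta(io.StringIO(">chr1 demo\nACGTAC\nGTTTGA\n>chr2\nGGGGCC\n"))

# Many queries, each answered by slicing the in-memory string.
queries = [("chr1", 4, 9), ("chr2", 0, 4)]
for name, start, end in queries:
    print(name, start, end, genome[name][start:end])
```

The trade-off is the one discussed in this thread: startup time and roughly one byte of RAM per base, in exchange for no disk access per query.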
#8
Member
Location: Colorado Join Date: Feb 2013
Posts: 12

Yeah, I'm torn on whether to hold it in memory or not. I'll toy with different workflows.
#9
David Eccles (gringer)
Location: Wellington, New Zealand Join Date: May 2011
Posts: 838

http://genome.ucsc.edu/FAQ/FAQformat.html#format7
The code points to a way to retrieve ranges:
http://genome-source.cse.ucsc.edu/gi...oBit.h;hb=HEAD
Code:
    /* Parse a .2bit file and sequence spec into an object.
     * The spec is a string in the form:
     *
     *    file/path/input.2bit[:seqSpec1][,seqSpec2,...]
     *
     * where seqSpec is either
     *     seqName
     *  or
     *     seqName:start-end
edit: indeed, BLAT has such functions included. See here for a bit of discussion about 2bit retrieval using Perl:
http://www.perlmonks.org/?node_id=672251
Last edited by gringer; 10-31-2013 at 04:43 PM.
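To make the spec string from that header comment concrete, here is a rough Python sketch that splits `file/path/input.2bit[:seqSpec1][,seqSpec2,...]` into a file path and a list of regions. This is only one reading of the comment, not the UCSC parser. The function name `parse_two_bit_spec` is hypothetical, and the sketch assumes the path contains `.2bit` exactly once and that `start` and `end` are plain integers, without saying anything about how the UCSC code interprets them.

```python
def parse_two_bit_spec(spec):
    """Split 'path.2bit[:seqSpec1][,seqSpec2,...]' into
    (path, [(seq_name, start, end), ...]); start/end are None for a bare name."""
    marker = ".2bit"
    cut = spec.find(marker)
    if cut < 0:
        raise ValueError("no .2bit file name in spec: %r" % spec)
    cut += len(marker)
    path, rest = spec[:cut], spec[cut:]
    if not rest:
        return path, []  # file only, no sequence specs
    if not rest.startswith(":"):
        raise ValueError("expected ':' after file name in spec: %r" % spec)
    regions = []
    for item in rest[1:].split(","):
        name, sep, span = item.rpartition(":")
        if sep and "-" in span:
            # seqName:start-end
            start, end = span.split("-", 1)
            regions.append((name, int(start), int(end)))
        else:
            # bare seqName
            regions.append((item, None, None))
    return path, regions

print(parse_two_bit_spec("file/path/input.2bit:chr1:100-200,chr2"))
```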
#10
Member
Location: SE MN Join Date: Oct 2013
Posts: 44