**Wind** · 08-03-2009, 02:17 AM

I think Bowtie is a nice tool for short-read alignment. However, I ran into a problem with paired-end mapping. I produced 75 bp reads by simulating Illumina high-throughput sequencing and aligned them to the reference sequence, but only a few alignments, fewer than 10, are reported. Since 1,300,000 alignments are reported with non-paired-end mapping, I think something is going wrong in the paired-end mapping.
My options are "bowtie -p 8 -a -y -X 650 human -1 reads_1.fa -2 reads_2.fa output.map".

Can anybody tell me what the problem is?

**Ben Langmead** · 08-03-2009, 05:12 AM

Originally posted by Wind View Post

I think Bowtie is a nice tool for short-read alignment. However, I ran into a problem with paired-end mapping. I produced 75 bp reads by simulating Illumina high-throughput sequencing and aligned them to the reference sequence, but only a few alignments, fewer than 10, are reported. Since 1,300,000 alignments are reported with non-paired-end mapping, I think something is going wrong in the paired-end mapping.
My options are "bowtie -p 8 -a -y -X 650 human -1 reads_1.fa -2 reads_2.fa output.map".

This is probably due to the -I/--minins, -X/--maxins, and/or --fr/--rf/--ff options being set incorrectly. Please double-check the manual's description of those options and verify that your invocation matches the way you've simulated your reads. Also, make sure the simulated read files are formatted correctly, with all mates lining up properly.
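For the "mates lining up" check, a minimal Python sketch follows. It assumes FASTA input and that mates carry the same name apart from a trailing /1 or /2; that naming convention and the function names are assumptions for illustration, not something Bowtie requires.

```python
def fasta_names(lines):
    """Return record names (text after '>' up to the first whitespace), in order."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            names.append(fields[0] if fields else "")
    return names


def strip_mate_suffix(name):
    """Drop a trailing /1 or /2 (assumed mate-naming convention)."""
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2]
    return name


def mate_mismatches(lines_1, lines_2):
    """Return (index, name_1, name_2) for each position where the mates disagree.

    A name of None means one file ran out of records before the other.
    """
    names_1 = fasta_names(lines_1)
    names_2 = fasta_names(lines_2)
    problems = []
    for i in range(max(len(names_1), len(names_2))):
        n1 = names_1[i] if i < len(names_1) else None
        n2 = names_2[i] if i < len(names_2) else None
        if n1 is None or n2 is None or strip_mate_suffix(n1) != strip_mate_suffix(n2):
            problems.append((i, n1, n2))
    return problems
```

With the files from the command above, something like `mate_mismatches(open("reads_1.fa"), open("reads_2.fa"))` should return an empty list if every mate pair lines up.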

Thanks,
Ben

**Wind** · 08-03-2009, 05:26 PM

Thanks

Hi Ben,

Thanks for your advice. There were many 'N's in the simulated data, which may have interfered with the paired-end mapping. I'll try with other data sets. Thanks.

**tianell** · 08-04-2009, 11:32 PM

Ben, help me..

Hi Ben,
I have a question for you about the alignment result messages.
When I align short reads to a reference using Bowtie, can I get a result message for the unmatched case?

I could not find an option to get such a result message.

I want a report even when certain short reads do not align to the reference, so that I can use this information (not aligned!).

I will wait for your answer, Ben. Thank you so much.

**Ben Langmead** · 08-05-2009, 07:15 AM

Hi tianell,

Originally posted by tianell View Post

When I align short reads to a reference using Bowtie, can I get a result message for the unmatched case?

I could not find an option to get such a result message.

I want a report even when certain short reads do not align to the reference, so that I can use this information (not aligned!).

Sorry, no, there is no option to print such a message. I'll add this as a feature request. In the meantime, it's quite easy to deduce that number either by using the --un/--max options (and then counting), or by subtracting the reported number from the number of input reads.
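The subtraction approach could be scripted roughly as below. This is a sketch under two assumptions: that each line of the alignment output starts with the read name as its first tab-separated field (worth checking against the manual for your output format), and that you already know the number of input reads. The function name is hypothetical. Distinct names are counted so that a read reported more than once (for example with -a) is not double-counted.

```python
def count_unaligned(total_input_reads, alignment_lines):
    """Input reads minus the number of distinct read names in the alignment output."""
    aligned_names = set()
    for line in alignment_lines:
        line = line.rstrip("\n")
        if line:
            # Assumption: the read name is the first tab-separated field.
            aligned_names.add(line.split("\t", 1)[0])
    return total_input_reads - len(aligned_names)
```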

Thanks,
Ben

**joa_ds** · 08-05-2009, 07:17 AM

Isn't there a feature to export unmapped reads to a file?

I always run bowtie and export unmapped and repeats using

--unfq unaligned.fastq --maxfa duplicates.fastq

Comparing the size of both files with that of your original file gives you an approximate idea of the % unaligned/repeats.

**bioinfosm** · 08-06-2009, 11:32 AM

I wanted to discuss a use-case:
A collection of 172 million reads ranging from 36 to 76 bases long was mapped to a reference with bowtie.

$ ./bowtie --best --un leftover -p 4 -t reference reads mapped
$ grep -c '^@' leftover
154828705
$ wc -l mapped
16269083 mapped

The total of leftover and mapped is less than what we started with. Are the remaining reads mapping to multiple locations, and thus omitted in both these files?

**Ben Langmead** · 08-06-2009, 11:52 AM

Hi bioinfosm,

Originally posted by bioinfosm View Post

The total of leftover and mapped is less than what we started with. Are the remaining reads mapping to multiple locations, and thus omitted in both these files?

That shouldn't be the case. When only --un is used (as opposed to both --un and --max), both the unaligned reads and the reads with a number of alignments exceeding the -m limit will go into the --un file. But you're not using the -m option, so no reads should be suppressed due to multiple alignments.

How are you counting the number of reads in your input set? Note that grep -c '^@' isn't necessarily correct because quality strings can also start with @.
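To illustrate the pitfall, here is a small Python sketch that counts FASTQ records by line count instead of by leading '@'. It assumes the common four-lines-per-record layout with no line wrapping; the function names are just for illustration.

```python
def count_fastq_records(lines):
    """Count records assuming exactly four lines per record (blank lines ignored)."""
    n_lines = sum(1 for line in lines if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be wrapped or truncated")
    return n_lines // 4


def count_at_lines(lines):
    """What grep -c '^@' effectively counts: every line starting with '@'."""
    return sum(1 for line in lines if line.startswith("@"))


# Two reads; the first read's quality string happens to start with '@'.
example = [
    "@r1\n", "ACGT\n", "+\n", "@III\n",
    "@r2\n", "GGCC\n", "+\n", "IIII\n",
]
print(count_fastq_records(example))  # 2
print(count_at_lines(example))       # 3
```

The second count is too high because '@' is also a legal quality character, so any quality line that begins with it is counted as a header.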

Thanks,
Ben

**bioinfosm** · 08-07-2009, 11:33 AM

thanks Ben.. the light bulb just flashed on me!

**davisc** · 08-19-2009, 08:54 AM

Question about RepeatMasked hg18 index

I'm doing RNA-Seq on human samples. In many instances I am mapping using the -m1 -v2 --best criteria to the preassembled hg18.asm index available on the download site. I would like to know how Bowtie handles Ns in the indices. I am also wondering whether it is possible to cut down the mapping time by building and mapping against a repeatmasked version of the genome.

**Ben Langmead** · 08-19-2009, 09:05 AM

Originally posted by davisc View Post

I would like to know how Bowtie handles N's in the indices? I am wondering if it is possible to cut down the mapping time by building and mapping against a repeatmasked version of the genome?

When Bowtie indexes the reference, it elides non-A/C/G/T characters. So if you index a reference with stretches of Ns, Bowtie will never report an alignment spanning any of the stretches.
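As a rough illustration of what that means for alignments (this is not Bowtie's actual indexing code, only a sketch of the consequence), the reference can be viewed as a set of maximal A/C/G/T runs, and a read can only be placed where it fits entirely inside one run. Folding lowercase to uppercase is an assumption of this sketch.

```python
import re


def acgt_segments(reference):
    """Return half-open (start, end) intervals of maximal A/C/G/T runs."""
    # Assumption for this sketch: case is ignored, so soft-masked bases count as A/C/G/T.
    return [(m.start(), m.end()) for m in re.finditer(r"[ACGT]+", reference.upper())]


def can_hold_alignment(reference, start, length):
    """True if [start, start + length) lies within a single A/C/G/T run."""
    end = start + length
    return any(s <= start and end <= e for s, e in acgt_segments(reference))


ref = "ACGTNNNNACGTAC"
print(acgt_segments(ref))             # [(0, 4), (8, 14)]
print(can_hold_alignment(ref, 0, 4))  # True
print(can_hold_alignment(ref, 2, 4))  # False: would span the N stretch
```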

And yes, mapping against the repeatmasked version of the genome (and omitting -m 1) ought to be noticeably faster.

Ben

**ewilbanks** · 09-08-2009, 02:28 PM

Indexing human genome?

Hi!

I'm working on building an index of the human genome locally and I was wondering how long this usually takes. It's been running for about 3 hrs; just wondering what to expect. I'm on a dual-core Mac with 4 GB RAM.

Thanks!
Lizzy

**Ben Langmead** · 09-09-2009, 03:41 AM

Hi Lizzy,

I'd expect, oh, about 7-8 hours or so. Did it finish?

Thanks,
Ben

**Layla** · 09-09-2009, 07:44 AM

I'm a newbie to Bowtie... tired of counting down the hours using MAQ.

Currently building an index using Bowtie. What is the difference between
h_sapiens_asm.ebwt.zip and
h_sapiens.ebwt.zip

Thanks

L

**Ben Langmead** · 09-09-2009, 07:48 AM

Hi Layla,

h_sapiens indexes the NCBI human reference contigs and h_sapiens_asm indexes the NCBI human reference assembly. Take a look at the scripts/make_h_sapiens.sh and scripts/make_h_sapiens_asm.sh files distributed with Bowtie to see exactly what fasta files were indexed and how.

People often prefer the assembly because the coordinates output by bowtie are more immediately useful (e.g., they correspond to the hg18 coordinates in the Genome Browser).

Thanks,
Ben
