**chadn737** · 02-14-2013, 08:42 AM

Are you mapping your reads first to one reference and then the other, or to both at the same time? Ideally it shouldn't make a difference. The way you described it, where you mapped to human with tophat and then got fewer reads mapping to the bacterial genomes with bowtie2, makes me wonder whether some of the bacterial reads are being mapped to the human genome. It's similar to the problem of mapping reads to only part of a genome rather than the whole genome. Tophat, bowtie2, or any other aligner will try to place each read on whatever reference you give it. A read may genuinely come from one genome, but if that genome is absent from the reference, the aligner will settle for the best match it can find. Maybe combine your two references, map to both simultaneously, and see what results.
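A minimal sketch of building such a combined reference, assuming both references are plain FASTA. The function name `combine_references` and the `human`/`bact` labels are made up for illustration; prefixing each sequence name with a label is just one way to tell afterwards which genome a read landed on.

```python
import io


def combine_references(sources, out_handle):
    """Copy every FASTA record from `sources` into `out_handle`.

    `sources` is a list of (label, handle) pairs. Each sequence name is
    prefixed with its label so alignments can be traced back to a genome.
    Returns the number of records written.
    """
    n_records = 0
    for label, handle in sources:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines between records
            if line.startswith(">"):
                out_handle.write(f">{label}|{line[1:]}\n")
                n_records += 1
            else:
                out_handle.write(line + "\n")
    return n_records


# Tiny in-memory example; real use would pass open file handles instead.
human = io.StringIO(">chr1\nACGTACGT\n")
bacteria = io.StringIO(">NC_000913.3 E. coli\nGGCCTTAA\n>plasmid1\nTTTT\n")
combined = io.StringIO()
count = combine_references([("human", human), ("bact", bacteria)], combined)
print(count)
print(combined.getvalue())
```

The combined file would then be indexed once and used as the single reference for the mapping run.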

**bob-loblaw** · 02-14-2013, 09:28 AM

First to one, then to the other. I had thought about this before, but when the bacterial database was built we hit the maximum size of reference database or index that bowtie2 can build (or so I've been told; it was built just before I started on this project). This is definitely something to look into though, thanks!

If I were going to map human and bacterial reads simultaneously, we'd have to use tophat in order to map the human reads efficiently (human reads make up a large share of the reads in our samples). Do you (or anyone else who sees this post) know how using tophat to map bacterial reads would work out, given that tophat was designed to look for spliced reads?

**chadn737** · 02-14-2013, 02:11 PM

The size limit on the index is a problem. You could go ahead and combine them and see whether what they told you is true. If it is, you will only get an error message.
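Before committing to a long index build, a rough size check on the combined reference might look like the sketch below. The ceiling `ASSUMED_MAX_BASES` is an assumption (about 2**32 bases, what a 32-bit index could address), not a figure taken from this thread, so check the documentation of the bowtie2 version you run. The function names are hypothetical.

```python
import io

# ASSUMPTION: roughly 2**32 bases, the ceiling of a 32-bit index. Check the
# documentation of the bowtie2 version you run before relying on this number.
ASSUMED_MAX_BASES = 2**32


def total_bases(handle):
    """Count sequence characters in a FASTA handle, ignoring header lines."""
    total = 0
    for line in handle:
        if not line.startswith(">"):
            total += len(line.strip())
    return total


def fits_in_index(handles, max_bases=ASSUMED_MAX_BASES):
    """Return (combined_length, True/False) for a list of FASTA handles."""
    combined_length = sum(total_bases(h) for h in handles)
    return combined_length, combined_length < max_bases


# In-memory stand-ins for the human and bacterial FASTA files.
human = io.StringIO(">chr1\nACGT\nAC\n")
bacteria = io.StringIO(">b1\nGGG\n")
result = fits_in_index([human, bacteria])
print(result)
```

If the combined length is over the limit, trimming the bacterial database to the genomes you actually expect is one option to consider.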

As for using Tophat on bacterial reads: Tophat first tries to align reads to the genome before looking for splicing. Ideally, all the bacterial reads will align to the bacterial genome in this first round and not be spliced. I won't say that won't happen, though, because inevitably some reads will have some sort of mismatch and show up as spliced.

Have you tried aligning reads to the bacterial genome first and then to the human? Or has it only been human then bacterial?
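One way to see how often this happens is to count, per reference sequence, how many alignments contain a splice. The sketch below assumes the alignments are available as SAM text (a BAM file would first need converting, for example with `samtools view -h`) and treats an `N` operation in the CIGAR string as a spliced alignment. The function name and the example read and reference names are made up.

```python
from collections import Counter


def count_spliced(sam_lines):
    """Count mapped and spliced alignments per reference sequence.

    Returns two Counters keyed by reference name: (mapped, spliced).
    An alignment counts as spliced if its CIGAR contains an 'N' operation.
    """
    mapped, spliced = Counter(), Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, rname, cigar = int(fields[1]), fields[2], fields[5]
        if (flag & 0x4) or rname == "*":
            continue  # read is unmapped
        mapped[rname] += 1
        if "N" in cigar:
            spliced[rname] += 1
    return mapped, spliced


example_sam = [
    "@SQ\tSN:human|chr1\tLN:1000",
    "r1\t0\thuman|chr1\t10\t50\t20M500N30M\t*\t0\t0\t*\t*",
    "r2\t0\tbact|NC_1\t5\t50\t50M\t*\t0\t0\t*\t*",
    "r3\t16\tbact|NC_1\t90\t3\t25M60N25M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
mapped, spliced = count_spliced(example_sam)
for name in mapped:
    print(name, mapped[name], spliced[name])
```

A noticeable fraction of spliced alignments on the bacterial sequences would suggest those hits deserve a closer look.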

**bob-loblaw** · 02-15-2013, 02:14 AM

I haven't tried aligning reads to the bacterial genome and then to human. Originally, though, we were using bowtie2 to map the human reads, and it mapped only a few thousand reads per file compared to the tens of millions that tophat mapped for the same file. When we then used bowtie2 to map the bacterial reads, we got about 5 to 10 times as many bacterial reads mapped as we did when tophat had been used for the human alignment. Since so few human reads were aligned by bowtie2, that gives me an indication of what running bowtie2 for bacterial reads before tophat for human would produce. Basically, I think we'll have the same problem no matter which alignment we do first: if bacterial goes first we'll get a lot of false positives, and vice versa if human goes first. Thanks for all your help here!
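To put a number on that overlap, one rough check is to map the full read set against each reference separately and compare which read names map in each run. This is only a sketch: it assumes both runs saw the same reads, that the alignments are available as SAM text, and the function names are made up.

```python
def mapped_read_names(sam_lines):
    """Return the set of read names that have at least one mapped alignment."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if not (int(fields[1]) & 0x4):  # 0x4 = read unmapped
            names.add(fields[0])
    return names


def summarize_overlap(human_sam, bacterial_sam):
    """Count reads mapped only to human, only to bacteria, or to both."""
    human = mapped_read_names(human_sam)
    bact = mapped_read_names(bacterial_sam)
    return {
        "human_only": len(human - bact),
        "bacterial_only": len(bact - human),
        "both": len(human & bact),
    }


human_run = [
    "r1\t0\tchr1\t10\t50\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr2\t99\t50\t50M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
bacterial_run = [
    "r1\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r2\t0\tNC_1\t5\t20\t50M\t*\t0\t0\t*\t*",
    "r3\t0\tNC_1\t70\t42\t50M\t*\t0\t0\t*\t*",
]
summary = summarize_overlap(human_run, bacterial_run)
print(summary)
```

Reads in the `both` group are the ones whose assignment depends on which reference is mapped first.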

I'll definitely be trying a tophat run with a database of both human and bacterial as soon as I can!

Finally, if I could ask you one more question: what about the no-discordant options that I mentioned in the OP? Do you think I should use that parameter when running tophat, or should I just go with the default settings?

**bob-loblaw** · 02-15-2013, 03:22 AM

Another problem that just popped into my head: if tophat first tries to align all reads without looking for splicing, won't I have the same problem as before, with a lot of human reads falsely identified as bacterial? Or do you know whether Tophat tries to align everything without splicing first, then with splicing, and only returns the best hit?

Optimizing tophat mapping for mixed RNA-Seq data
