
**songyj** · 10-19-2011, 05:56 PM

can anybody help..?

T_T

**songyj** · 10-21-2011, 06:18 AM

please ... ...

**stanwish** · 10-21-2011, 02:08 PM

Hi, I am also a beginner struggling with TopHat & Cufflinks. I tried to answer your questions, but I am not quite sure...
For question 1, I agree with your ideas on the segment length. But I used a segment length of 17 in my last mapping, so I am not sure whether 25 is a minimum. The anchor length refers to the number of bp a read must have on each side of a splice junction. If the anchor length is 8 and a 25 bp read has 7 bp on one exon and the other 18 bp on the other exon, it might be discarded...
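
To make the anchor-length idea concrete, here is a minimal sketch of the check as I understand it. The function name and the simple "both sides must reach the anchor" rule are my own simplification for illustration, not TopHat's actual junction code, which does considerably more.

```python
def meets_anchor(left_bp, right_bp, min_anchor=8):
    """Hypothetical helper: True if a spliced read has at least
    min_anchor bases on each side of the junction."""
    return min(left_bp, right_bp) >= min_anchor

# A 25 bp read split 7 + 18 fails with an anchor of 8; 8 + 17 passes.
print(meets_anchor(7, 18))  # False
print(meets_anchor(8, 17))  # True
```

Under this simplified rule, lowering the anchor length would let the 7 + 18 read count toward the junction, at the cost of more spurious junctions.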

For question 2, I think it would be better to use other software, such as Velvet, to build a genome-independent reconstruction from your sequencing results.

Q3, have you run FastQC or something like that to quickly check your read quality? Sequencing usually introduces some mismatches or low-quality data, and if you set the mismatch tolerance low, the reads with mismatches will be discarded.

I am not clear about Q4...

**songyj** · 10-26-2011, 12:34 AM

[QUOTE=stanwish;54637]
For question 2, I think it would be better to use other software to construct the genome-independent reconstruction based on your seq-result such as Velvet.

Q3, have you run a fastqc or something like that to quickly check your reads quality? Usually it will introduce some mismatch or low quality data from the sequencing and if you set the tolerance low it will discard the mismatch reads.
[/QUOTE]

Q2: No matter how many times I run it, TopHat with annotation gives more results than without, when the other parameters are the same.

Q3: The number of reads filtered out seems to have nothing to do with the parameters chosen, so I think it is because the reads' quality is too low.

**dvanic** · 02-21-2012, 10:31 PM

Question 3:
In file “tophat/logs/prep_reads” it reads “6975 out of 28036024 reads have been filtered out”. What is the reason to filter the reads? Is it because the read’s quality is too low or the read can’t mapped genome?

Also have this issue - why is tophat filtering reads? Based on what criteria?
And I'm assuming this is happening before the mapping?

What I've done:
I've used SolexaQA to trim my reads to decent quality, although some are now shorter than others: median read lengths are 75-100 bp in all of my datasets, with the first mate at a median of 98-100 bp and its pair at 75-85 bp. Mean read lengths are 83-84 and 68-71 bp. However, when I look at the log file I find that 1-2% of my reads end up being discarded by tophat...

While it could be that some of my reads are just very short and hence discarded by tophat, I'd like to understand a bit better what exactly is going on here...

Thanks in advance!

**kreitinger** · 08-29-2012, 10:56 AM

Tophat kept versus discarded reads

I am also curious about how Tophat decides (during the run) to keep or discard reads. For example, I am using Tophat to analyze ~80 million reads, and during the job, I see that Tophat has kept 80 million reads, while discarding about 10-500K reads. Why is this?

**carmeyeii** · 08-29-2012, 11:39 AM

Could it be your multi-mapping parameter?

**kreitinger** · 09-05-2012, 06:09 AM

I have started a new thread that recapitulates this kept/discarded question.

About the multi-mapping parameter, I don't think that is it: the kept/discarded counts are calculated very early in the run, and that step finishes much faster (a few minutes) than the time it takes to map all the reads. I am thinking it is more likely a fast QC-related filter?
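
One way to test the QC-filter idea would be to count how many reads in the input FASTQ are very short or rich in N bases and compare that with the discarded count in the log. Below is a rough sketch. The function name and both thresholds (`min_len`, `max_n_frac`) are assumptions chosen for illustration, not TopHat's documented criteria, and it assumes plain 4-line FASTQ records.

```python
import io

def count_suspect_reads(fastq_handle, min_len=20, max_n_frac=0.1):
    """Count reads that are shorter than min_len or too rich in N bases.

    Both thresholds are illustrative assumptions, not TopHat's own values.
    """
    total = short = n_rich = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines
        seq = fastq_handle.readline().strip()
        fastq_handle.readline()  # '+' separator line
        fastq_handle.readline()  # quality line
        total += 1
        if len(seq) < min_len:
            short += 1
        elif seq and seq.upper().count("N") / len(seq) > max_n_frac:
            n_rich += 1
    return {"total": total, "short": short, "n_rich": n_rich}

# Tiny in-memory example: one good read, one short read, one N-rich read.
example = io.StringIO(
    "@r1\nACGTACGTACGTACGTACGTACGT\n+\n" + "I" * 24 + "\n"
    "@r2\nACGTNN\n+\nIIIIII\n"
    "@r3\nNNNNNNNNNNACGTACGTACGTACGT\n+\n" + "I" * 26 + "\n"
)
print(count_suspect_reads(example))
```

For a real file, pass `open("reads.fastq")` (a placeholder file name) instead of the in-memory example. If the short plus N-rich count is close to the number reported as filtered, that would support the idea of an early QC filter.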

Some questions about running tophat & cufflinks
