#1
Junior Member
Location: Michigan, US Join Date: Jun 2010
Posts: 2
Dear all,

I have been experimenting with TopHat, but I am not certain which combination of command-line options to use, and I hope people with more experience can help. My main goal is expression analysis, and my reads are single-end, 80 bp.

1. Is it better to provide an exon junction database? If one is provided, will TopHat map reads only against that database, or will it also try regular mapping when it fails to find a match there? If the latter, does that mean it is always better to supply a junction database?
2. If it is better to provide such a database, which file should I download, and from where?
3. For expression analysis, should I use -g 1 to allow only unique mappings, or some other value? The default is 40, and I have seen many reads reported multiple times in the SAM output file. In that case a read will be counted multiple times for different genes, which doesn't seem right.
4. The mapping quality scores that TopHat reports in the SAM file include many low values (e.g. more than 20% of reads have a score below 20). Should I filter out reads with poor mapping quality? What thresholds are sensible?
5. What other parameters should we be paying attention to? So far I have only used --microexon_search.

Many thanks!
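For questions 3 and 4, one way to see how much multi-mapping and low mapping quality there is in a SAM file is to count alignments per read name. The following is only a minimal Python sketch on toy data: the function name `summarize_alignments`, the toy records, and the cutoff of 20 are illustrative assumptions, not recommendations from the TopHat manual. It also assumes single-end reads with one SAM line per reported alignment.

```python
from collections import Counter


def summarize_alignments(sam_lines, min_mapq=20):
    """Count alignments per read; list unique reads that pass min_mapq.

    min_mapq=20 is an arbitrary illustrative cutoff, not a recommended value.
    """
    hits_per_read = Counter()
    best_mapq = {}
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        # SAM columns: 1 = read name, 2 = bitwise flag, 5 = mapping quality
        name, flag, mapq = fields[0], int(fields[1]), int(fields[4])
        if flag & 0x4:  # flag bit 0x4 means the read is unmapped
            continue
        hits_per_read[name] += 1
        best_mapq[name] = max(mapq, best_mapq.get(name, 0))
    unique = sorted(r for r, n in hits_per_read.items() if n == 1)
    multi = sorted(r for r, n in hits_per_read.items() if n > 1)
    kept = [r for r in unique if best_mapq[r] >= min_mapq]
    return {"unique": unique, "multi": multi, "kept": kept}


# Toy SAM records (hypothetical reads, not real TopHat output).
toy_sam = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t256\tchr2\t300\t3\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tchr3\t400\t10\t4M\t*\t0\t0\tACGT\tIIII",
]

summary = summarize_alignments(toy_sam)
print(summary)
```

On a real file, the same function could be fed the lines of the SAM output, which would show what fraction of reads is reported more than once before deciding on a -g value or a quality cutoff.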
#2
Member
Location: Pasadena, CA Join Date: May 2009
Posts: 45
Quote:
Quote:
Quote:
#3
Member
Location: Institute, WV Join Date: May 2010
Posts: 24
Hi,

I wanted to start a new thread, but I can't find a "New Thread" button anywhere.

My question is: can TopHat handle csfasta files, which are in color-space format? As far as I can tell, there is no command-line option for this. To work with SOLiD csfasta files, do I first have to convert them to sequence-space format?

Could anyone also tell me how to post a new thread?

Thanks,
#4
Member
Location: Pasadena, CA Join Date: May 2009
Posts: 45
http://tophat.cbcb.umd.edu/manual.html
Quote:
#5
--Site Admin--
Location: SF Bay Area, CA, USA Join Date: Oct 2007
Posts: 1,358
Quote:
#6
Junior Member
Location: Michigan, US Join Date: Jun 2010
Posts: 2
Quote:
Quote:
This is very interesting. Actually, I am puzzled by the -g option. Suppose a read maps to 50 places on the genome and I specify -g 20. Will TopHat (1) report only the first 20 alignments of this read, or (2) not report the read at all, since 50 is greater than 20? The manual does not make this clear to me. If (1) is correct, then specifying a small number is better, since multiple hits will over-count expression. If (2) is correct, however, many reads will be missed because of sequence similarity among genes in a gene family.
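To make the two readings of -g concrete, here is a small Python sketch on toy data. The function names and the toy hits are hypothetical, and the sketch makes no claim about which behavior TopHat actually implements; it only shows how differently the two readings treat a read with 50 candidate alignments.

```python
def cap_alignments(hits, max_hits):
    """Reading (1): keep at most max_hits alignments for every read."""
    return {read: locs[:max_hits] for read, locs in hits.items()}


def suppress_multihit_reads(hits, max_hits):
    """Reading (2): drop any read that has more than max_hits alignments."""
    return {read: locs for read, locs in hits.items() if len(locs) <= max_hits}


# Toy data: readA aligns to 50 positions, readB to a single position.
hits = {
    "readA": [("chr1", pos) for pos in range(50)],
    "readB": [("chr2", 7)],
}

capped = cap_alignments(hits, 20)
suppressed = suppress_multihit_reads(hits, 20)

# Under reading (1) readA is still reported, with 20 alignments.
print(len(capped["readA"]))
# Under reading (2) readA is not reported at all.
print("readA" in suppressed)
```

Under reading (1) the read still contributes up to 20 counts; under reading (2) it contributes none, which is the trade-off described above.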
#7
Member
Location: Pasadena, CA Join Date: May 2009
Posts: 45
Quote:
Quote: