SEQanswers

Old 03-05-2014, 06:06 PM   #1
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266
Clarification on TopHat/Cufflinks with Ensembl

I'm currently analysing some mouse samples and wanted to try the Ensembl annotation; I have only used UCSC before.

This is my TopHat command:

tophat2 -o path/to \
  --transcriptome-index=/Mus_musculus_Ensembl_NCBIM37/Mus_musculus/Ensembl/NCBIM37/Annotation/Genes/genes \
  /Mus_musculus_Ensembl_NCBIM37/Mus_musculus/Ensembl/NCBIM37/Sequence/Bowtie2Index/genome \
  001.fastq.gz

This is my Cufflinks command:

cufflinks --output-dir path/to \
  --GTF-guide /Mus_musculus_Ensembl_NCBIM37/Mus_musculus/Ensembl/NCBIM37/Annotation/Archives/archive-2013-03-06-18-55-12/Genes/genes.gtf \
  --frag-bias-correct /Mus_musculus_Ensembl_NCBIM37/Mus_musculus/Ensembl/NCBIM37/Sequence/Bowtie2Index/genome.fa \
  --multi-read-correct --upper-quartile-norm --verbose \
  accepted_hits.bam

What is worrying me?

First:

20: MT
10: 18
11: 19
4: 12
5: 13
2: 10
3: 11
8: 16
9: 17
6: 14
7: 15
22: Y
21: X
19: 9
18: 8
17: 7
16: 6
15: 5
14: 4
13: 3
12: 2
1: 1

Does this look correct for the chromosome names?

Second, I get maybe 1000 warnings like this:

GFF warning: merging adjacent/overlapping segments of ENSMUST00000101224 on 6 (84939386-84939418, 84939421-84939523)

What does that mean?
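
The two segments named in that warning can be compared directly. The following is a minimal sketch, not Cufflinks code: it parses the warning text quoted above and reports how many bases lie between the two segments. The function name `segment_gap` is hypothetical, and the idea that the parser merges the segments because the gap is so small is an assumption that nothing in this thread confirms.

```python
import re

# Warning text copied verbatim from the post above.
WARNING = ("GFF warning: merging adjacent/overlapping segments of "
           "ENSMUST00000101224 on 6 (84939386-84939418, 84939421-84939523)")


def segment_gap(warning: str) -> int:
    """Return the number of bases between the two segments in the warning.

    Coordinates are treated as 1-based and inclusive, as in GTF, and the
    first segment is assumed to start before the second. A result of 0
    means the segments touch; a negative result means they overlap.
    """
    match = re.search(r"\((\d+)-(\d+), (\d+)-(\d+)\)", warning)
    if match is None:
        raise ValueError("no segment pair found in warning text")
    start1, end1, start2, end2 = map(int, match.groups())
    return start2 - end1 - 1


gap = segment_gap(WARNING)
print(f"bases between segments: {gap}")
```

For this transcript the two annotated segments are separated by only a couple of bases, which fits the "adjacent" wording of the warning.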

Third, what is the meaning of:

Warning: intron not within scaffold ([102700571-102700733], 0)


Thank you very much!
Old 03-05-2014, 06:54 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

It would be very helpful if you could first explain what kind of input data you have, and what you are trying to accomplish. Also, "path/to" is supposed to be replaced with the path to something - you shouldn't actually specify the literal "path/to".
Old 03-05-2014, 07:10 PM   #3
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266

Quote:
Originally posted by Brian Bushnell:
It would be very helpful if you could first explain what kind of input data you have, and what you are trying to accomplish. Also, "path/to" is supposed to be replaced with the path to something - you shouldn't actually specify the literal "path/to".

The input data is Illumina HiSeq2000 fastq of mouse tissue.

First, I'm trying to align the reads to the Ensembl mouse genome, but I want to be sure it's going well.

Cufflinks will be run on all my samples, followed by Cuffmerge. I want to look at expression levels of all genes, and novel transcripts. But also here I need to be sure everything goes as planned.
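
The Cufflinks-then-Cuffmerge step described above can be sketched as follows. This is only an illustration under stated assumptions: the sample directory names, `assemblies.txt`, `genes.gtf` and `genome.fa` are placeholders rather than paths from this thread, and the helper names are hypothetical. It relies on Cufflinks writing a `transcripts.gtf` into each output directory and on Cuffmerge taking a text file that lists those assemblies, with `-g` for the reference GTF, `-s` for the reference sequence and `-o` for the output directory. The command is only built, not executed.

```python
from pathlib import PurePosixPath


def assembly_list_text(cufflinks_dirs):
    """Return the manifest text Cuffmerge expects: one transcripts.gtf per line."""
    paths = [str(PurePosixPath(d) / "transcripts.gtf") for d in cufflinks_dirs]
    return "\n".join(paths) + "\n"


def cuffmerge_command(manifest_path, ref_gtf, ref_fasta, out_dir):
    """Build (but do not run) a cuffmerge command line as a list of arguments."""
    return ["cuffmerge", "-o", out_dir, "-g", ref_gtf, "-s", ref_fasta,
            manifest_path]


# Hypothetical per-sample Cufflinks output directories.
sample_dirs = ["sample1_cufflinks", "sample2_cufflinks"]

manifest = assembly_list_text(sample_dirs)
cmd = cuffmerge_command("assemblies.txt", "genes.gtf", "genome.fa", "merged_asm")

print(manifest, end="")
print(" ".join(cmd))
```

Saving the printed manifest as `assemblies.txt` and then running the printed command would be the actual merge step.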

The "path/to" is simply there so as not to reveal my real name etc., which is in the path.

Thank you for the quick reply!
Old 03-05-2014, 07:55 PM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

sindrle,

I am not involved with Bowtie/TopHat/Cufflinks, so I cannot offer you direct advice. Hopefully the authors will get back to you!

But if you want to use BBMap, which is faster and more accurate than TopHat, I will be more than happy to help. It produces TopHat-compatible output, with all of the special tags required by Cufflinks.

I'd like to mention, though, that I've recently been dismayed to learn that Cufflinks appears to be incapable of distinguishing between a human male and female using X/Y-specific RNA expression... so I'm looking into alternatives.

Last edited by Brian Bushnell; 03-05-2014 at 08:00 PM.
Old 03-17-2014, 02:59 AM   #5
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266

This is with UCSC:

20: MT
10: 18
11: 19
4: 12
5: 13
2: 10
3: 11
8: 16
9: 17
6: 14
7: 15
22: Y
21: X
19: 9
18: 8
17: 7
16: 6
15: 5
14: 4
13: 3
12: 2
1: 1


It's the same, so I guess it's correct. Confusing name change, but whatever.
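
One way to double-check chromosome names independently of that listing is to compare the sequence names in column 1 of the GTF with the header names in the genome FASTA. The sketch below does this on toy data; the function names and the toy records are illustrative assumptions, not output from the files in this thread.

```python
def gtf_seqnames(gtf_lines):
    """Collect sequence names from column 1 of GTF lines, skipping comments."""
    names = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def fasta_seqnames(fasta_lines):
    """Collect sequence names from FASTA headers (text up to first whitespace)."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


# Toy inputs: the GTF uses "MT" while the FASTA uses "chrM".
TOY_GTF = [
    '1\tprotein_coding\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'MT\tprotein_coding\texon\t10\t90\t.\t+\t.\tgene_id "g2"; transcript_id "t2";',
]
TOY_FASTA = [">1 toy sequence", "ACGT", ">chrM toy sequence", "ACGT"]

gtf_only = gtf_seqnames(TOY_GTF) - fasta_seqnames(TOY_FASTA)
fasta_only = fasta_seqnames(TOY_FASTA) - gtf_seqnames(TOY_GTF)

print("only in GTF:", sorted(gtf_only))
print("only in FASTA:", sorted(fasta_only))
```

With real data, open file handles can be passed in place of the toy lists, for example `gtf_seqnames(open("genes.gtf"))`. Two empty sets would mean the annotation and the reference agree on naming.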
All times are GMT -8.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2018, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO