SEQanswers

Old 03-05-2015, 04:33 AM   #1
super0925
Senior Member
 
Location: UK

Join Date: Feb 2014
Posts: 206
Default Cufflinks/Cuffmerge/Cuffdiff error 'subsequence cannot be larger than 16338'

Hi all,
I'm doing a DE analysis on cow samples.
I'm using the Ensembl UMD3.1 reference genome, which I thought was the latest version.
However, when I run the Tophat2-Cuffdiff2 pipeline (default parameter settings), I get these warnings:

Warning: couldn't find fasta record for 'GJ058256.1'!
This contig will not be bias corrected.
Warning: couldn't find fasta record for 'GJ058424.1'!
......

(1) What do these warnings mean? Are they a serious problem for my downstream analysis?

And when I run Tophat2-Cufflinks/Cuffmerge-Cuffdiff2 (default parameter settings), I get this error:

Warning: couldn't find fasta record for 'GJ060129.1'!
This contig will not be bias corrected.
......

Error (GFaSeqGet): subsequence cannot be larger than 16338
Error getting subseq for TCONS_00062149 (1..16448)!

(2) Why do I get this error? How can I continue with Tophat2-Cufflinks/Cuffmerge-Cuffdiff2?

Thank you!

Last edited by super0925; 03-05-2015 at 07:39 AM.
Old 03-05-2015, 06:10 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

I don't have the cow iGenomes set, but my guess is that "GJ058424.1" is in the GTF file but not in the genome sequence file. It appears to be the SH3YL1 gene now.

Someone else will need to comment on the other error.
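
One way to test that guess is to compare the sequence names in column 1 of the GTF against the record IDs in the FASTA headers. The following is only a minimal sketch: the function names are made up for illustration, and it assumes that the FASTA record ID is the first whitespace-delimited word after ">".

```python
def gtf_seqnames(gtf_path):
    """Return the set of sequence names found in column 1 of a GTF file."""
    names = set()
    with open(gtf_path) as handle:
        for line in handle:
            # Skip comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.split("\t", 1)[0])
    return names


def fasta_seqnames(fasta_path):
    """Return the set of record IDs (first word after '>') in a FASTA file."""
    names = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def missing_from_fasta(gtf_path, fasta_path):
    """Return the sorted GTF sequence names that have no FASTA record."""
    return sorted(gtf_seqnames(gtf_path) - fasta_seqnames(fasta_path))
```

Calling `missing_from_fasta("genes.gtf", "genome.fa")` (file names assumed here) should list exactly the names that would trigger the "couldn't find fasta record" warning if the guess is right. An empty list would point to a different cause.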
Old 03-05-2015, 07:40 AM   #4
super0925
Senior Member
 
Location: UK

Join Date: Feb 2014
Posts: 206
Default

Quote:
Originally Posted by GenoMax View Post
I don't have the cow iGenomes set, but my guess is that "GJ058424.1" is in the GTF file but not in the genome sequence file. It appears to be the SH3YL1 gene now.

Someone else will need to comment on the other error.
Why do I get the warning in (1) and the error in (2)?
Is the warning in (1) critical for downstream analysis?
How can I solve it?
Cheers
Old 05-12-2015, 11:12 AM   #5
super0925
Senior Member
 
Location: UK

Join Date: Feb 2014
Posts: 206
Default

Could anyone help?

The GTF file is genes.gtf from UMD3.1.

The first column of genes.gtf (I think it is the chromosome) contains these values:
1
10
11
12
13
14
15
16
17
18
19
2
20
21
22
23
24
25
26
27
28
29
3
4
5
6
7
8
9
GJ058256.1
GJ058424.1
GJ058425.1
GJ058430.1
GJ058433.1
GJ058437.1
GJ058729.1
GJ059463.1
GJ059486.1
GJ059509.1
GJ059556.1
GJ059670.1
GJ060027.1
GJ060032.1
GJ060118.1
GJ060120.1
GJ060129.1
MT
X

Last edited by super0925; 05-12-2015 at 11:17 AM.
Old 05-12-2015, 11:43 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Did you get these files from iGenomes or is this something you put together by getting files (seq, annotation etc) from individual sources?
Old 05-12-2015, 01:50 PM   #8
super0925
Senior Member
 
Location: UK

Join Date: Feb 2014
Posts: 206
Default

Quote:
Originally Posted by GenoMax View Post
Did you get these files from iGenomes or is this something you put together by getting files (seq, annotation etc) from individual sources?
I downloaded them from iGenomes...
Do you mean my files are abnormal?
Old 05-12-2015, 02:02 PM   #9
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Quote:
Originally Posted by super0925 View Post
I downloaded them from iGenomes...
Do you mean my files are abnormal?
No. One of the reasons to get this data from iGenomes is that it has (supposedly) been checked for consistency, so the kind of thing you have run into should not happen. It is possible that you downloaded a flawed version that has since been fixed (you could download a new copy and compare).

I hesitate to recommend that you get the sequences of the missing FASTA records from NCBI and append them to your genome.fa file (you will likely need to re-index it afterwards). But this may get you past one of the errors.
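
If someone does go the append route, a rough sketch of the merge step could look like the code below. It is not a tested recipe: the function name and the file names in the usage note are hypothetical, and it assumes the first word of each downloaded FASTA header is exactly the name used in the GTF (for example `GJ058424.1`).

```python
def append_missing_records(genome_path, extra_path, out_path):
    """Copy genome_path to out_path, then append records from extra_path
    whose ID is not already present. Returns the list of appended IDs."""
    present = set()
    appended = []
    with open(out_path, "w") as out:
        # Pass 1: copy the existing genome and remember its record IDs.
        last = "\n"
        with open(genome_path) as genome:
            for line in genome:
                if line.startswith(">"):
                    fields = line[1:].split()
                    if fields:
                        present.add(fields[0])
                out.write(line)
                last = line
        # Make sure the next header starts on its own line.
        if not last.endswith("\n"):
            out.write("\n")
        # Pass 2: append only records that are not in the genome yet.
        keep = False
        with open(extra_path) as extra:
            for line in extra:
                if line.startswith(">"):
                    fields = line[1:].split()
                    rec_id = fields[0] if fields else ""
                    keep = bool(rec_id) and rec_id not in present
                    if keep:
                        present.add(rec_id)
                        appended.append(rec_id)
                if keep:
                    out.write(line if line.endswith("\n") else line + "\n")
    return appended
```

For example, `append_missing_records("genome.fa", "missing_contigs.fa", "genome_plus.fa")` writes a new file and leaves the original untouched. Any index built from the old genome.fa (the FASTA index and the Bowtie2 index that TopHat2 uses) would have to be rebuilt from the new file.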

I am not sure how much work you have put into this already but if the new download from iGenomes does have these sequences then you could use that genome.fa file.

As for your second error, this thread seems to have some options: https://www.biostars.org/p/57249/
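
The numbers in that error (a requested range of 1..16448 against a limit of 16338) read as if a transcript's coordinates run past the end of the sequence it sits on. That is only one plausible reading of the message. A limit of 16338 would fit a short sequence such as MT, but that is an untested guess. A minimal sketch to check for such features, with hypothetical function names and assuming the GTF coordinates are 1-based genomic positions:

```python
def fasta_lengths(fasta_path):
    """Return {record_id: sequence_length} for every record in a FASTA file."""
    lengths = {}
    current = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                fields = line[1:].split()
                current = fields[0] if fields else None
                if current is not None:
                    lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


def features_past_end(gtf_path, lengths):
    """Return (seqname, feature, start, end, seq_length) tuples for GTF
    features whose end coordinate exceeds the length of their sequence."""
    problems = []
    with open(gtf_path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) < 5:
                continue
            try:
                start, end = int(cols[3]), int(cols[4])
            except ValueError:
                continue
            seqname, feature = cols[0], cols[2]
            # Sequences absent from the FASTA are a separate problem; skip them.
            if seqname in lengths and end > lengths[seqname]:
                problems.append((seqname, feature, start, end, lengths[seqname]))
    return problems
```

Running `features_past_end("merged.gtf", fasta_lengths("genome.fa"))` (file names assumed) on the Cuffmerge output would show whether any assembled transcript extends beyond its reference sequence, and on which sequence.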
Old 06-26-2015, 01:35 AM   #10
karimhasanpur@yahoo.com
Junior Member
 
Location: Iran

Join Date: Nov 2013
Posts: 4
Default cufflinks warnings: could not find fasta records

Dear friends,

I am getting the same warnings. I downloaded the Galgal4 reference files from iGenomes. When running Cufflinks, I get "warning: couldn't find fasta record for LGE64 ...". I think these are contigs that are present in genes.gtf but not in the genome FASTA file. My question is: could these warnings affect my downstream analyses? If so, what should I do to resolve the problem?

Any comment would be appreciated.
Karim