SEQanswers

Old 01-19-2012, 01:54 AM   #1
lukas1848
Member
 
Location: Germany

Join Date: Jun 2011
Posts: 54
reads mapping across scaffold boundaries

Hi,

I am struggling to assemble a transcriptome from single-end (SE) 454 reads. I used bwa-sw to map the reads and want to use Cufflinks for the assembly. Eventually I get errors like these:
Quote:
>Error (GFaSeqGet): end coordinate (994) cannot be larger than sequence length 883
>Error (GFaSeqGet): subsequence cannot be larger than 472
>Error getting subseq for CUFF.18.1 (13..673)!
Here's what I think is happening:
I have about 12000 scaffolds, most of which are tiny. I have reads that map beyond the end of a scaffold, meaning that a part of the read is not aligned to the scaffold but "hangs over".

read:                      -----------------
scaffold: |----------------------|

If I then try to extract the transcript with cufflinks, I get the error, because Cufflinks uses the reference genome to extract the exons instead of the assembled reads.
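One way to check this hypothesis is to compare the coordinates in the assembled GTF against the lengths of the reference sequences. The following is only a minimal sketch: the function names and the tiny in-memory FASTA/GTF are made up for illustration, and it assumes the GTF uses the same sequence names as the first token of each FASTA header.

```python
import io

def fasta_lengths(handle):
    """Return {sequence_name: length} for a FASTA file handle."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # Assumption: the sequence name is the first token of the header
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths

def overhanging_features(gtf_handle, lengths):
    """Yield (seqname, start, end, seq_length) for GTF rows that end past the sequence."""
    for line in gtf_handle:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # GTF columns 4 and 5 are the 1-based, inclusive start and end
        seqname, start, end = cols[0], int(cols[3]), int(cols[4])
        if seqname in lengths and end > lengths[seqname]:
            yield seqname, start, end, lengths[seqname]

# Hypothetical example data, not taken from the thread
fasta = io.StringIO(">scaffold_1 test\nACGTACGTAC\n>scaffold_2\nACGT\n")
gtf = io.StringIO(
    'scaffold_1\tCufflinks\texon\t2\t8\t.\t+\t.\tgene_id "CUFF.1";\n'
    'scaffold_2\tCufflinks\texon\t1\t9\t.\t+\t.\tgene_id "CUFF.2";\n'
)
hits = list(overhanging_features(gtf, fasta_lengths(fasta)))
print(hits)
```

Any row reported here ends beyond its scaffold, which would be consistent with the "end coordinate cannot be larger than sequence length" message.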

This now raised three questions for me:
1. Why do I have quite a number of reads mapping to tiny scaffolds? I would assume that these scaffolds are repetitive DNA, junk, or whatever else kept them from being assembled into a bigger scaffold.
2. Is there a way to extract the transcripts with a different software?
3. Or should I try to exclude either all small scaffolds or all reads mapping beyond scaffold ends from my transcriptome assembly?
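For option 3, a rough sketch of such a filter on SAM text could look like the following. It assumes the overhang shows up as an alignment whose end (POS plus the reference-consuming CIGAR operations) exceeds the LN value in the @SQ header; `filter_sam` and `min_ref_len` are hypothetical names, and the SAM lines are invented test data.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference

def reference_span(cigar):
    """Number of reference bases covered by an alignment with this CIGAR."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)

def filter_sam(lines, min_ref_len=0):
    """Yield SAM lines, dropping alignments that run past the end of their
    reference or that sit on references shorter than min_ref_len."""
    ref_len = {}
    for line in lines:
        if line.startswith("@"):
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
                ref_len[tags["SN"]] = int(tags["LN"])
            yield line  # header lines are passed through unchanged
            continue
        cols = line.rstrip("\n").split("\t")
        rname, pos, cigar = cols[2], int(cols[3]), cols[5]
        if rname == "*" or cigar == "*" or rname not in ref_len:
            yield line  # unmapped or unknown reference: leave untouched
            continue
        end = pos + reference_span(cigar) - 1  # 1-based inclusive end
        if ref_len[rname] < min_ref_len or end > ref_len[rname]:
            continue
        yield line

# Hypothetical example: r2 overhangs s1, r4 sits on a scaffold shorter than 5 bp
sam = [
    "@SQ\tSN:s1\tLN:10\n",
    "@SQ\tSN:s2\tLN:4\n",
    "r1\t0\ts1\t2\t30\t5M\t*\t0\t0\tACGTA\tIIIII\n",
    "r2\t0\ts1\t8\t30\t5M\t*\t0\t0\tACGTA\tIIIII\n",
    "r3\t0\ts1\t8\t30\t3M2S\t*\t0\t0\tACGTA\tIIIII\n",
    "r4\t0\ts2\t1\t30\t3M\t*\t0\t0\tACG\tIII\n",
]
kept = [l.split("\t")[0] for l in filter_sam(sam, min_ref_len=5) if not l.startswith("@")]
print(kept)
```

Note that a soft-clipped overhang (r3) is kept, because soft clips do not consume reference bases. The @SQ lines of excluded scaffolds are left in the header here; whether downstream tools mind that is not something this sketch addresses.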
Old 03-21-2012, 05:03 PM   #2
k-gun12
Member
 
Location: NJ

Join Date: Feb 2010
Posts: 54

I've got the same issue here... anyone found a solution yet?
Old 03-28-2012, 08:26 AM   #3
peromhc
Senior Member
 
Location: Durham, NH

Join Date: Sep 2009
Posts: 108

Of note, you can eliminate this error by NOT using the -b option in Cufflinks. Also, this bug is exposed with bowtie2 (https://sourceforge.net/tracker/?fun...7&atid=1101606), but not with bowtie1; at least I've never had it happen with B1.
Old 01-30-2013, 10:15 AM   #4
rpauly
Member
 
Location: Atlanta

Join Date: Apr 2011
Posts: 32

Error (GFaSeqGet): subsequence cannot be larger than 16571
Error getting subseq for CUFF.42374.1 (2..16614)!

I am getting the same error. Has anybody found a solution? It fails for me both with and without -b.
Old 01-30-2013, 12:00 PM   #5
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

You could pad all your references with Ns so that nothing hangs off.
Old 01-30-2013, 01:15 PM   #6
rpauly
Member
 
Location: Atlanta

Join Date: Apr 2011
Posts: 32

Forgive my ignorance, but how do I pad those sequences?
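A minimal sketch of the padding idea, assuming a plain FASTA reference: append a run of Ns to the 3' end of every sequence, so existing coordinates stay valid and overhanging alignments fall inside the padded sequence. The function name `pad_fasta`, the pad length, and the line width are arbitrary choices for illustration; the thread does not confirm that this resolves the error.

```python
import io

def pad_fasta(in_handle, out_handle, pad=1000, width=60):
    """Copy a FASTA file, appending `pad` N characters to each sequence."""
    def flush(header, chunks):
        if header is None:
            return
        seq = "".join(chunks) + "N" * pad
        out_handle.write(header + "\n")
        # Re-wrap the padded sequence at a fixed line width
        for i in range(0, len(seq), width):
            out_handle.write(seq[i:i + width] + "\n")

    header, chunks = None, []
    for line in in_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush(header, chunks)  # write the previous record
            header, chunks = line, []
        else:
            chunks.append(line)
    flush(header, chunks)  # write the last record

# Hypothetical example with a tiny pad and line width
src = io.StringIO(">s1\nACGT\n>s2\nAC\n")
dst = io.StringIO()
pad_fasta(src, dst, pad=3, width=5)
padded = dst.getvalue()
print(padded)
```

With real files you would open the input and output paths instead of using `io.StringIO`. The padded FASTA would presumably have to be the one passed wherever the reference sequence is given, and the @SQ lengths in the BAM header would no longer match it unless the reads are re-mapped or the header is updated.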
Old 05-30-2016, 11:13 PM   #7
super0925
Senior Member
 
Location: UK

Join Date: Feb 2014
Posts: 206

Hi, I have the same problem. It seems that lots of people get this error.

I use the Ensembl genome (*dna.toplevel.fa) and GTF file.

If I run Cufflinks (even without -b genome.fa) but then run Cuffmerge with -s genome.fa, I get the error!

If I omit -b genome.fa in both Cufflinks and Cuffdiff and also omit -s genome.fa in Cuffmerge, I don't get the error! However, I don't know whether the results are accurate.
All times are GMT -8.