SEQanswers

Old 08-14-2013, 05:52 AM   #1
alittleboy
Member
 
Location: USA

Join Date: Apr 2011
Posts: 60
Default a question in using Flux Simulator

I am using Flux Simulator to simulate an RNA-Seq experiment, and after running

flux-simulator -p /HS_simu1.par

I got the following error message:

[ERROR] Error while preparing sequences: Problems reading sequence GL000191.1: pos -1, len 171, check whether chromosomal sequence exists / has the correct size

Which file should I check for the chromosomal sequence? Thanks!
Old 08-14-2013, 06:08 AM   #2
alexdobin
Senior Member
 
Location: NY

Join Date: Feb 2009
Posts: 161
Default

I just had a similar problem with Flux Simulator. I think it is caused by a transcript in your (ENSEMBL?) GTF that belongs to the GL000191.1 "non-chromosomal" contig, while you only provided Flux with the "chromosomal" FASTA files. The easiest fix is to filter all the "non-chromosomal" annotations out of the GTF; alternatively, you can add the "non-chromosomal" FASTA files.
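
To find out which sequences are affected before deciding, something like the sketch below might help. It lists every sequence name in the GTF that has no FASTA file in the genome directory. The function name `missing_sequences` is made up for this example, and it assumes the genome directory holds one file per sequence named `<seqname>.fa`; adjust that if your files are named differently.

```shell
# List sequence names used in a GTF that have no matching FASTA file.
# Assumption: the genome directory holds one file per sequence, named <seqname>.fa.
missing_sequences() {
    gtf=$1
    genome_dir=$2
    # Column 1 of a GTF line is the sequence name; skip '#' header lines.
    awk -F '\t' '!/^#/ && NF > 1 { print $1 }' "$gtf" | sort -u |
    while IFS= read -r seq; do
        [ -f "$genome_dir/$seq.fa" ] || printf '%s\n' "$seq"
    done
}

# Example (both paths are placeholders):
#   missing_sequences ENSEMBL.gtf /path/to/GEN_DIR
```

Any name it prints (GL000191.1 in this thread) is one the simulator would presumably fail on.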
Old 08-14-2013, 06:13 AM   #3
alittleboy
Member
 
Location: USA

Join Date: Apr 2011
Posts: 60
Default

Thank you, alexdobin, for the suggestions! How did you solve this problem (i.e. what modifications did you make to the GTF file)?
Old 08-14-2013, 01:38 PM   #4
alexdobin
Senior Member
 
Location: NY

Join Date: Feb 2009
Posts: 161
Default

Since I did not need annotations on non-chromosomal scaffolds, I simply removed all the non-chromosomal entries from the GTF file. For the ENSEMBL GTF, the chromosomes are 1-22, X, Y and MT, so you could simply do (note the quotes, which stop the shell from expanding the brackets):
grep '^[0-9XYM]' ENSEMBL.gtf > ENSEMBL.chrOnly.gtf
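
The grep above only looks at the first character of each line. If you would rather match the whole first column, a stricter variant could look like the sketch below. The function name `filter_chromosomal_gtf` is just for illustration, and unlike the grep it also keeps any `#` header lines; drop that part of the pattern if you do not want them.

```shell
# Keep GTF lines whose first column is exactly 1-22, X, Y or MT.
# '#' header lines are passed through unchanged.
filter_chromosomal_gtf() {
    awk -F '\t' '/^#/ || $1 ~ /^([1-9]|1[0-9]|2[0-2]|X|Y|MT)$/' "$1"
}

# Example:
#   filter_chromosomal_gtf ENSEMBL.gtf > ENSEMBL.chrOnly.gtf
```

For a human ENSEMBL GTF both versions should give the same annotation lines; the stricter one is only safer if some scaffold name happens to start with a digit, X, Y or M.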

On the other hand, if you want to simulate RNA-seq from non-chromosomal scaffolds, you will need to download the "nonchromosomal" FASTA file from ENSEMBL (e.g. ftp://ftp.ensembl.org/pub/release-72...omosomal.fa.gz), split it into a separate FASTA file for each scaffold, and add these files to the GEN_DIR directory.
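
The splitting step could be done roughly as in the sketch below. The function name `split_fasta` is made up, and the output naming (first word of the `>` header plus `.fa`) is an assumption on my part; check it against how your existing chromosome files in GEN_DIR are named.

```shell
# Split a multi-sequence FASTA into one file per sequence inside out_dir.
# Assumption: each output file is named after the first word of its '>' header,
# with ".fa" appended (e.g. GL000191.1.fa).
split_fasta() {
    fasta=$1
    out_dir=$2
    mkdir -p "$out_dir"
    awk -v dir="$out_dir" '
        /^>/ {
            if (out != "") close(out)   # avoid running out of open file handles
            name = substr($1, 2)        # header word without the leading ">"
            out = dir "/" name ".fa"
        }
        out != "" { print > out }       # header and sequence lines go to the current file
    ' "$fasta"
}

# Example (file names are placeholders; decompress the download first):
#   gunzip -c nonchromosomal.fa.gz > nonchromosomal.fa
#   split_fasta nonchromosomal.fa /path/to/GEN_DIR
```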
Old 08-15-2013, 02:04 AM   #5
CompBio
Member
 
Location: Bristol, UK

Join Date: Aug 2009
Posts: 26
Default

I've run into a similar problem with whole chromosomes. At the time I assumed the simulator was trying to grab sequence past the end of a chromosome when a gene lies near that end. My simple workaround was to pad the end of each chromosome sequence with enough N's to accommodate my desired read size.

It's possible the same solution would work with contigs/scaffolds if you don't want to eliminate them from your simulations.
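
For what it's worth, the padding can be scripted along the lines of the sketch below. The function name `pad_fasta` is made up for this example. It writes the N's as an extra line at the end of each sequence, so the last line will not match the usual line width; I have not checked whether the simulator cares about that, so re-wrap the output if it does.

```shell
# Append pad_len "N" characters after every sequence in a FASTA file.
# The padding is written as one extra line per sequence (line widths become uneven).
pad_fasta() {
    fasta=$1
    pad_len=$2
    awk -v n="$pad_len" '
        BEGIN { for (i = 0; i < n; i++) pad = pad "N" }
        /^>/ && seen && pad != "" { print pad }   # pad the previous sequence
        /^>/ { seen = 1 }
        { print }
        END { if (seen && pad != "") print pad }  # pad the last sequence
    ' "$fasta"
}

# Example (file names and pad length are placeholders):
#   pad_fasta 1.fa 100 > padded/1.fa
```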
All times are GMT -8.

