SEQanswers

Old 07-26-2015, 05:23 PM   #1
Him26
Member
 
Location: California US

Join Date: Aug 2011
Posts: 18
Is this segemehl error due to memory?

Hi, I am fairly new to RNA-seq.
I am trying to analyze my data using segemehl but am running into the following error. (I've pasted the last part of the output.)
[SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637977 reads in thread 0.
[SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637824 reads in thread 1.
[SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637824 reads in thread 2.
[SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637824 reads in thread 3.
segemehl.x: libs/biofiles.c:1160: bl_fastxAddMate: Assertion `bl_fastaCheckMateID(f, n, descr, descrlen)' failed.

My job command is:
segemehl.x --silent -i hg19.idx -d human_hg19.fa -q READ1 -p READ2 -O -o sege.sam -u unmap.sam -D 1 -t 4

One of my questions: if I submit the job by chromosome to reduce the memory load, how can segemehl map reads that align to different chromosomes?

I read in some postings that I should use the full reference file, but this will significantly increase mapping time and memory requirements.
How do I find the right balance?

Thank you in advance
Old 07-26-2015, 10:56 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

There is no "right balance". You need to map to the full reference if you want correct results.

I can't advise you on that error message, but your command certainly looks strange. Is that the actual command, or are you substituting "READ1" and "READ2" for the filenames?

Last edited by Brian Bushnell; 07-27-2015 at 10:30 AM.
Old 07-27-2015, 01:41 AM   #3
ecSeq Bioinformatics
Senior Member
 
Location: Leipzig, Germany

Join Date: May 2012
Posts: 221

If this error occurs, segemehl cannot assign mate2 to mate1. Are the reads in both your files in the correct order? Do they have matching read IDs (at least the beginning of the ID)? Do you have the same number of reads in the mate1 file and the mate2 file?

If you did adapter clipping and/or quality trimming, make sure that you do it for both files together and not separately in two calls. You can use bbduk to trim paired-end reads without losing the mate1-mate2 connection.
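
The three questions above (same order, matching IDs, same read count) can be checked with a short script. This is only a rough sketch and not part of segemehl: it assumes plain 4-line FASTQ records (optionally gzipped), compares only the part of each ID before the first whitespace with any trailing /1 or /2 removed, and the function names are made up for illustration.

```python
import gzip
import itertools

def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")

def read_ids(path):
    """Yield one normalized ID per record, assuming 4 lines per record."""
    with open_fastq(path) as handle:
        for i, line in enumerate(handle):
            # Header lines are every 4th line and start with '@'.
            if i % 4 == 0 and line.startswith("@"):
                tokens = line[1:].split()
                name = tokens[0] if tokens else ""
                # Strip a trailing mate suffix so read/1 and read/2 compare equal.
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                yield name

def first_pairing_problem(mate1_path, mate2_path):
    """Return None if the files pair up, else (record_number, id1, id2)."""
    pairs = itertools.zip_longest(read_ids(mate1_path), read_ids(mate2_path))
    for n, (id1, id2) in enumerate(pairs, start=1):
        # A missing mate shows up as None when one file ends early.
        if id1 != id2:
            return n, id1, id2
    return None
```

Calling first_pairing_problem("READ1", "READ2") with the two FASTQ file names returns None when every record lines up, or the record number and the two IDs at the first position where the files disagree. An ID of None means that file ran out of reads first, which is what typically happens when the two files were trimmed in separate calls.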
__________________
ecSeq Bioinformatics is Europe’s leading provider of hands-on bioinformatics workshops and professional data analysis in the field of Next-Generation Sequencing (NGS).
Old 07-27-2015, 05:46 PM   #4
Him26
Member
 
Location: California US

Join Date: Aug 2011
Posts: 18

Thank you for the reply.

Brian Bushnell: Yes, READ1 and READ2 are placeholders for the actual FASTQ file names.
Is there anything else strange in my command? Please let me know.

ecSeq Bioinformatics: I was using AlienTrimmer, and I believe it does not do read ID matching. I am sure that is the problem.
Thank you.
Old 10-15-2015, 07:41 AM   #5
lxsj3
Junior Member
 
Location: China

Join Date: Oct 2015
Posts: 1

Hi Him26,
Have you solved the problem? I'm using segemehl and am running into the same problem. I did not do any trimming on my FASTQ files, and I have checked that the reads in both files are in the correct order. I would really appreciate any help.
Thank you.
Old 02-28-2016, 08:36 PM   #6
Him26
Member
 
Location: California US

Join Date: Aug 2011
Posts: 18
Nope

I got caught up with other issues and have not followed up on this matter. Sorry about that. Do let me know if you find out anything.
Old 03-03-2016, 04:14 AM   #7
ecSeq Bioinformatics
Senior Member
 
Location: Leipzig, Germany

Join Date: May 2012
Posts: 221

Segemehl tries to find the two mates that belong together by checking the FASTQ identifiers.

The identifiers of mate1 and mate2 have to be:
  1. completely identical,
  2. identical up to the first whitespace, or
  3. identical apart from a '/1' or a '/2' at their ends.
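
For testing a single pair of identifiers by hand, the three rules can be written as a small predicate. This is a literal reading of the list above and not segemehl's actual code; the function names are hypothetical, and it assumes rule 3 applies to the whole identifier with '/1' on mate1 and '/2' on mate2.

```python
def _head(read_id):
    """Return the part of an identifier before the first whitespace."""
    parts = read_id.split(None, 1)
    return parts[0] if parts else ""

def mate_ids_match(id1, id2):
    """Return True if two FASTQ identifiers satisfy any of the three rules."""
    # Rule 1: completely identical.
    if id1 == id2:
        return True
    # Rule 2: identical up to the first whitespace.
    if _head(id1) and _head(id1) == _head(id2):
        return True
    # Rule 3: identical once the trailing '/1' and '/2' are removed.
    if id1.endswith("/1") and id2.endswith("/2") and id1[:-2] == id2[:-2]:
        return True
    return False
```

For example, mate_ids_match("r1 1:N:0", "r1 2:N:0") and mate_ids_match("r1/1", "r1/2") both return True, while mate_ids_match("r1/1", "r2/2") returns False. If the headers in your files fail all three rules, renaming them so they match is probably the easiest fix.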