SEQanswers

Old 08-13-2014, 08:26 AM   #1
chris k
Junior Member
 
Location: Germany

Join Date: Aug 2014
Posts: 1
Help with Trinity differential expression

Dear seqanswers users,

I am working with RNA-seq data from an Illumina HiSeq and would like to perform a comparative analysis using the Perl scripts bundled with Trinity, together with edgeR. The RNA was prepped as a TruSeq library (the newest version, whatever that is right now).

To do that I need an estimate of the abundance of the HiSeq reads relative to the reference transcriptome (which I created ab initio using Trinity version 2014/07/17). For this I use bowtie2 (version 2.2.3, 64-bit) for the initial mapping step and samtools (version 0.2.0-rc12-1-gbbe85a9, 64-bit, self-compiled).

During a follow-up step, the script "rsem-run-em" from trinity/util is used, and it throws the following error:

rsem-run-em: QualDist.h:39: int QualDist::c2q(char): Assertion `c >= 33 && c <= 126' failed.

Of course, that error implies that some sequences in my 22 GB of reads have quality scores outside the range of ASCII decimal 33 (!) to 126 (~). A simple grep for characters outside that range did not turn up any lines in the FASTQ read files containing such "illegal" characters. From that I infer that the problem lies within the BAM files I got from bowtie2.
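A minimal sketch of the FASTQ check described above, written in Python rather than grep. It assumes plain four-line FASTQ records (no wrapped sequence lines); the function name and the file name in the usage comment are hypothetical. The carriage return is deliberately left on the quality line so that stray Windows line endings are reported too.

```python
import io

def bad_fastq_quality_lines(handle, lo=33, hi=126):
    """Yield (record_number, read_name, offending_chars) for every record
    whose quality line holds a character outside ASCII lo..hi.

    Assumes four lines per record: name, sequence, '+', quality.
    """
    name = ""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            name = line.rstrip("\r\n")
        elif i % 4 == 3:
            # Strip only '\n' so that a stray '\r' is still caught.
            qual = line.rstrip("\n")
            bad = sorted({c for c in qual if not lo <= ord(c) <= hi})
            if bad:
                yield i // 4 + 1, name, bad

# Usage (hypothetical file name); newline="" stops Python from silently
# translating '\r\n' into '\n' while reading:
#   with open("reads.fq", newline="") as fh:
#       for hit in bad_fastq_quality_lines(fh):
#           print(hit)
```

If this reports nothing for the FASTQ files, the quality strings going into bowtie2 are at least within the printable range.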

So I used samtools to get rid of all lines that do not have acceptable quality scores in the phred33 quality column of the BAM file, using the following command:

samtools view -bq 2 bowtie2.bam > bowtie2.filtered.bam

But still the same error persists.

So, to my question: does anybody have a clue how one could identify the single "wrong" entry in a 2.3 GB BAM file, in order to get rid of low-quality sequence entries with illegal characters?
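One way to look for such an entry is to scan the text output of samtools view directly. Note that the -q option of samtools view filters on the mapping quality (MAPQ, column 5), not on the per-base quality string (QUAL, column 11), so it would not be expected to remove records with odd QUAL characters. The sketch below is an assumption-laden illustration, not a known fix for this error: the function name and script name are hypothetical, and it only checks the QUAL column and the column count.

```python
import sys

def bad_sam_records(lines, lo=33, hi=126):
    """Yield (line_number, read_name, problem) for SAM text lines whose
    QUAL column (field 11) has characters outside ASCII lo..hi, or that
    have fewer than the 11 mandatory columns."""
    for n, line in enumerate(lines, start=1):
        if line.startswith("@"):
            continue  # header line, present only with 'samtools view -h'
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield n, fields[0], "fewer than 11 columns"
            continue
        qual = fields[10]
        bad = sorted({c for c in qual if not lo <= ord(c) <= hi})
        if bad:
            yield n, fields[0], f"out-of-range QUAL characters: {bad!r}"

# Usage (hypothetical script name check_sam_qual.py):
#   samtools view bowtie2.bam | python3 check_sam_qual.py
# with the script ending in:
#   for hit in bad_sam_records(sys.stdin):
#       print(*hit, sep="\t")
```

Because the input is streamed line by line, the whole BAM file never has to fit in memory.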

Any help is highly appreciated.

Chris

PS: For further information, I run a 64-bit Intel i5 machine with Ubuntu 14.04 LTS.
Old 08-13-2014, 09:36 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,478

Perhaps there's a space or extraneous line ending somewhere. If you're familiar with running things in a debugger, you might be able to diagnose the cause that way. If not, you can always subset the file to determine exactly where the problem line/character is (this can become very time consuming).
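The subsetting idea can be sketched as a bisection. This is only an illustration under stated assumptions: the failure is caused by a single record, and fails is a hypothetical callback that returns True when the given subset still triggers the error (in practice it would write the subset to a file and rerun the failing step).

```python
def find_bad_record(records, fails):
    """Return the index of the one record that makes fails() return True.

    Assumes fails(subset) is True exactly when the subset contains the
    offending record, and that there is only one such record.
    """
    if not fails(records):
        raise ValueError("the full set of records does not fail")
    lo, hi = 0, len(records)  # invariant: the culprit is in records[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(records[lo:mid]):
            hi = mid  # culprit is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return lo
```

Each round halves the candidate range, so the number of reruns grows with the logarithm of the record count; with a multi-gigabyte file each rerun is still slow, which matches the warning that this can become very time consuming.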

Tags
bam, bowtie2, sam, samtools, trinity

All times are GMT -8.