SEQanswers

Old 03-25-2015, 05:25 AM   #1
cbaudo
Member
 
Location: Missouri

Join Date: Jan 2013
Posts: 21
Can I use preprocessed BAM files for Cufflinks?

This question has been asked here previously: http://seqanswers.com/forums/showthr...s+preprocessed

Similarly, I have trimmed my reads, aligned and sorted them using STAR, added read groups, and marked duplicates in my BAM file. I'm curious whether this processed BAM file is suitable as input for Cufflinks or whether I should use the initial aligned BAM file.

Secondly, if it is suitable, would it make sense to perform local realignment around indels using GATK prior to running Cufflinks?

Thank you for your time,
cb
Old 03-25-2015, 05:41 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

As long as you had STAR add the XS tags for Cufflinks, the rest should be fine. I actually don't know whether Cufflinks pays attention to the duplicate flag. I assume not, since marking duplicates in RNAseq data is typically not useful.
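
A quick way to check the XS point is to look at the spliced alignments directly. The sketch below is only an illustration, not part of anyone's pipeline in this thread: the function name `spliced_reads_missing_xs` and the demo records are made up, and it assumes the input is plain SAM text (for example, the output of `samtools view`). It reports spliced reads (CIGAR containing `N`) that carry no `XS:A:` strand tag. With STAR, the tag is normally requested at alignment time with `--outSAMstrandField intronMotif`.

```python
def spliced_reads_missing_xs(sam_lines):
    """Return names of mapped, spliced alignments that lack an XS:A strand tag."""
    missing = []
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        name, flag, cigar = fields[0], int(fields[1]), fields[5]
        if flag & 0x4:                    # read is unmapped
            continue
        if "N" not in cigar:              # only spliced reads need the strand tag
            continue
        # Optional tags start at column 12 and look like TAG:TYPE:VALUE.
        if not any(tag.startswith("XS:A:") for tag in fields[11:]):
            missing.append(name)
    return missing


# Hypothetical records for illustration (SEQ and QUAL left as "*").
demo = [
    "@HD\tVN:1.4\tSO:coordinate",
    "read1\t0\tchr1\t100\t255\t50M200N50M\t*\t0\t0\t*\t*\tNH:i:1\tXS:A:+",
    "read2\t0\tchr1\t500\t255\t40M1000N60M\t*\t0\t0\t*\t*\tNH:i:1",
    "read3\t0\tchr1\t900\t255\t100M\t*\t0\t0\t*\t*\tNH:i:1",
]
print(spliced_reads_missing_xs(demo))
```

If the list is non-empty for a real file, the alignments were most likely produced without the strand field and would need to be regenerated before running Cufflinks.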

There's no reason to realign RNAseq reads before using cufflinks. You're not calling variants (and even there, local realignment is becoming questionable with more recent versions of GATK).
Old 03-25-2015, 05:45 AM   #3
cbaudo
Member
 
Location: Missouri

Join Date: Jan 2013
Posts: 21

I've read conflicting information about marking duplicates. Do you suggest that I skip this step? Note that I haven't removed the duplicate reads, only marked them.

Also, when you say local realignment is becoming questionable, do you mean that it doesn't produce a higher quality alignment or that it can introduce errors?

Thanks for your advice.
Old 03-25-2015, 05:54 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

You absolutely should not mark duplicates. Any highly expressed gene/transcript will necessarily have many apparent duplicates that aren't actual PCR duplicates simply due to the gene/transcript being highly expressed.
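
To put rough numbers on that argument, here is a small back-of-the-envelope sketch. It is a simplification under stated assumptions, not how any duplicate-marking tool actually works: reads are single-end, start positions are drawn uniformly along one strand of a transcript, and there are no PCR duplicates at all. The function name and the example values are hypothetical. Under those assumptions the expected number of distinct start positions among `n_reads` reads on `n_positions` possible starts is `n_positions * (1 - (1 - 1/n_positions) ** n_reads)`, and every read beyond that count would look like a duplicate.

```python
def expected_apparent_duplicates(n_reads, n_positions):
    """Expected number of reads that share a start position with another read,
    assuming uniform sampling of start positions and no PCR duplication."""
    if n_positions <= 0 or n_reads < 0:
        raise ValueError("n_positions must be positive and n_reads non-negative")
    # Expected count of distinct start positions hit at least once.
    expected_distinct = n_positions * (1.0 - (1.0 - 1.0 / n_positions) ** n_reads)
    return n_reads - expected_distinct


# Illustrative values only: a transcript with 1,000 possible start positions
# at low and high expression.
for n_reads in (100, 10_000):
    dups = expected_apparent_duplicates(n_reads, 1_000)
    print(n_reads, round(dups / n_reads, 3))
```

The fraction of apparent duplicates climbs steeply with read count even though none of the simulated reads is a PCR duplicate, which is the situation described above for highly expressed transcripts.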

I mean that it doesn't increase quality. If you go over to Brad Chapman's blog you'll find a very large number of comparisons of GATK/samtools/freeBayes/etc. with a variety of settings. It's increasingly the case that local realignment and quality recalibration don't increase the quality of the output (they don't generally hurt it, but these steps can take a while).

Tags
cufflinks, gatk, preprocessing, rna-seq data analysis, star
