SEQanswers

Old 03-27-2014, 02:32 AM   #1
RocheKermit
Member
 
Location: Luxembourg

Join Date: Nov 2011
Posts: 15
Counts on single and paired ends reads merged bam file

Dear all,
We ran both paired-end AND single-end sequencing on the same sample. The 2 corresponding BAM files were then merged, which causes problems for htseq-count ('pair_alignments' needs a sequence of paired-end alignments).
A fix is to run htseq-count on the 2 UNmerged BAM files separately, then merge the result files (simply summing the paired-end and single-end counts).
Is that approach sound and recommended?
Best regards.
PS: HTSeq rocks!
Old 03-27-2014, 02:50 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,476

You might also consider just keeping the single-end and paired-end data separate and then using library type as a blocking factor in your experimental design. Having said that, if the library-type effect is minimal (as indicated by PCA, clustering, etc.), then you might as well go ahead and sum things...but I'd check that the results are similar enough first.
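
A rough way to check that "the results are similar enough" before summing is to correlate the log-transformed counts from the two runs. This is a minimal sketch, not the PCA or clustering suggested above; it swaps in a plain Pearson correlation on log2(count + 1) as a cruder stand-in. The gene names, counts, and the function name `log_correlation` are all hypothetical.

```python
import math

def log_correlation(counts_a, counts_b):
    """Pearson correlation of log2(count + 1) over genes present in both tables.

    Raises ZeroDivisionError if either table has no variation across genes.
    """
    genes = sorted(set(counts_a) & set(counts_b))
    xs = [math.log2(counts_a[g] + 1) for g in genes]
    ys = [math.log2(counts_b[g] + 1) for g in genes]
    n = len(genes)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-gene counts from the paired-end and single-end runs
paired = {"geneA": 900, "geneB": 90, "geneC": 9, "geneD": 0}
single = {"geneA": 100, "geneB": 12, "geneC": 1, "geneD": 0}

r = log_correlation(paired, single)
print(f"log-count correlation: {r:.3f}")
```

A high correlation only says the two tables rank genes similarly. It does not replace looking at how the samples cluster across the whole experiment.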
Old 03-27-2014, 04:26 AM   #3
RocheKermit
Member
 
Location: Luxembourg

Join Date: Nov 2011
Posts: 15

Right.
I need to update my question: the experiment was actually conducted with paired-end sequencing only, BUT the downstream read quality filtering and trimming steps removed some mates (approx. 10% of pairs became single/orphan reads).
So the mixture of single and paired reads doesn't come from the biochemistry, but rather from QC filtering.
So it's sound to merge the paired and single reads back together after HTSeq-count, isn't it? Qualitatively speaking, at least.
As for the counting: on the one hand, one mapped single read yields one count; on the other hand, a read pair is also counted only once. Don't we overweight the single-end reads by simply summing the single-end and paired-end counts?
Old 03-27-2014, 05:10 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,476

A single read and a pair both describe the position of the fragment that was sequenced. In both cases, you can consider that it's actually the fragment that's getting counted, so then nothing is being given undue weight. The only real objection to that is that single-end reads don't give you the full bounds, so there are cases where they'll lead to slightly inflated counts (e.g., when the other end of the fragment actually overlaps a different feature, but you have no way of knowing this), but the effect of that is likely quite small (again, you could judge this by clustering things).
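
To make the "slightly inflated counts" case concrete, here is a toy model of the situation described above. It assumes a simplified version of htseq-count's default union mode (collect every feature overlapped by any part of the read or pair; exactly one feature means a count, more than one means ambiguous), on a single chromosome and ignoring strand. The coordinates, gene names, and function names are hypothetical.

```python
def overlapping_features(intervals, features):
    """Names of features overlapped by any interval (half-open [start, end))."""
    hits = set()
    for start, end in intervals:
        for name, (f_start, f_end) in features.items():
            if start < f_end and f_start < end:
                hits.add(name)
    return hits

def assign(intervals, features):
    """Simplified union-mode call for one fragment (a single read or a pair)."""
    hits = overlapping_features(intervals, features)
    if not hits:
        return "__no_feature"
    if len(hits) > 1:
        return "__ambiguous"
    return next(iter(hits))

# Hypothetical annotation and one fragment whose two ends hit different genes
features = {"geneA": (100, 500), "geneB": (550, 900)}
mate1 = (400, 475)
mate2 = (560, 635)

pair_call = assign([mate1, mate2], features)  # both mates survived QC
orphan_call = assign([mate1], features)       # mate2 was removed by QC

print(pair_call)    # __ambiguous
print(orphan_call)  # geneA
```

The same fragment is discarded as ambiguous when both mates are present but counted for geneA when only one mate survives, which is the small inflation mentioned above.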
Old 03-27-2014, 07:35 AM   #5
RocheKermit
Member
 
Location: Luxembourg

Join Date: Nov 2011
Posts: 15

Great answer!
I agree with that and I'll go ahead with the proposed strategy, which is the following:
1. QC of FASTQ files, trimming ..
2. Alignment of single reads, alignment of paired reads
3. HTSeq-count of single reads, HTSeq-count of paired reads
4. Sum of counts for each gene of single reads + paired reads
5. Happy edgeR or DESeq or whatever...
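
Step 4 can be sketched as follows. This assumes each htseq-count output is a two-column, tab-separated table (feature ID, count) and that the summary rows at the end are prefixed with `__` (e.g. `__no_feature`); check this against the output of your HTSeq version. The inline example data and the function names are hypothetical.

```python
import csv
import io

def read_counts(handle):
    """Parse a two-column, tab-separated count table into {feature: count}."""
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        counts[row[0]] = int(row[1])
    return counts

def sum_counts(*tables):
    """Add counts feature by feature; a feature missing from a table counts as 0."""
    total = {}
    for table in tables:
        for feature, n in table.items():
            total[feature] = total.get(feature, 0) + n
    return total

# Hypothetical outputs; in practice, open the two result files instead
paired_txt = "geneA\t10\ngeneB\t0\n__no_feature\t3\n"
single_txt = "geneA\t2\ngeneB\t1\n__no_feature\t1\n"

paired = read_counts(io.StringIO(paired_txt))
single = read_counts(io.StringIO(single_txt))
merged = sum_counts(paired, single)

# Drop the "__" summary rows before handing the table to a DE tool
genes_only = {f: n for f, n in merged.items() if not f.startswith("__")}
print(genes_only)  # {'geneA': 12, 'geneB': 1}
```

Both runs should use the same annotation file and the same counting options, so that the feature IDs line up and the counts are comparable.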

Thank you so much for your help!!

Tags
htseq, htseq-count, merged bam, pair-end single-end, sam
