Thread | Thread Starter | Forum | Replies | Last Post |
Convert merged BAM back to per lane BAM or FASTQ file | danielsbrewer | Bioinformatics | 6 | 10-03-2013 08:29 AM |
Trim BAM file reads from 5' ends | FiReaNG3L | Bioinformatics | 3 | 05-25-2012 05:01 AM |
Picard's SortSam error with a merged bam file | tomato2 | Bioinformatics | 1 | 11-01-2011 10:18 PM |
velveth assembly with single and paired ends | Apexy | RNA Sequencing | 0 | 08-05-2011 09:41 AM |
BOth single and paired end reads in a file!! | adgen | Illumina/Solexa | 0 | 06-30-2010 11:28 AM |
#1 |
Member
Location: Luxembourg Join Date: Nov 2011
Posts: 15
Dear all,
We have run both paired-end AND single-end sequencing on the same sample. The 2 corresponding BAM files were then merged, which causes problems for htseq-count ('pair_alignments' needs a sequence of paired-end alignments). A workaround is to run htseq-count on the 2 UNmerged BAM files separately and then merge the result files (simply summing the counts, paired-end + single-end). Is that approach sound and recommended? Best regards. PS: HTSeq rocks!
#2 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
You might also consider just keeping the single-end and paired-end separate and then using that as a blocking factor in your experimental design. Having said that, if the library-type effect is minimal (as indicated by PCA, clustering, etc.), then you might as well go ahead and sum things...but I'd check that the results are similar enough first.
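As a rough first look before summing, one could compare the two count tables directly. The sketch below is a much cruder stand-in for the PCA or clustering mentioned above: it computes a Pearson correlation on log2(count + 1) values for the features present in both tables. The function name and the dict-based input are assumptions made for illustration, and rows whose ID starts with `__` (the summary rows written by recent htseq-count versions) are skipped.

```python
import math


def log_count_correlation(counts_a, counts_b):
    """Pearson correlation of log2(count + 1) over features shared by two tables.

    counts_a and counts_b are dicts mapping feature ID -> integer count
    (a hypothetical in-memory form of two htseq-count outputs).
    """
    # Keep only real features; skip summary rows such as "__no_feature".
    genes = sorted(
        g for g in set(counts_a) & set(counts_b) if not g.startswith("__")
    )
    if len(genes) < 2:
        raise ValueError("need at least two shared features")

    x = [math.log2(counts_a[g] + 1) for g in genes]
    y = [math.log2(counts_b[g] + 1) for g in genes]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("counts are constant in at least one table")
    return cov / math.sqrt(var_x * var_y)
```

A value close to 1 only suggests the two tables rank genes similarly; it does not replace checking for a library-type effect across all samples.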
#3 |
Member
Location: Luxembourg Join Date: Nov 2011
Posts: 15
Right.
I need to update my question: the experiment was actually conducted with paired-end sequencing only, BUT the downstream read quality filtering and trimming steps removed some mates (approx. 10% of pairs become single/orphan reads). So the mixture of single and paired reads doesn't come from the biochemistry, but rather from QC filtering. We can say that it's sound to merge the paired and single reads back together in htseq-count, can't we? Qualitatively speaking at least. About the counting: on one hand, 1 mapped single read leads to one count; on the other hand, a paired read will also be counted only once. Don't we overweight the single-end reads by simply summing single + paired-end reads?
#4 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
A single read and a pair both describe the position of the fragment that was sequenced. In both cases, you can consider that it's actually the fragment that's getting counted, so then nothing is being given undue weight. The only real objection to that is that single-end reads don't give you the full bounds, so there are cases where they'll lead to slightly inflated counts (e.g., when the other end of the fragment actually overlaps a different feature, but you have no way of knowing this), but the effect of that is likely quite small (again, you could judge this by clustering things).
#5 |
Member
Location: Luxembourg Join Date: Nov 2011
Posts: 15
Great answer!
I agree with that and I'll go on with the proposed strategy, which is the following:
1. QC of FASTQ files, trimming, ...
2. Alignment of single reads, alignment of paired reads
3. htseq-count of single reads, htseq-count of paired reads
4. Sum of counts for each gene: single reads + paired reads
5. Happy edgeR or DESeq or whatever...
Thank you so much for your help!!
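The summing step in that strategy could be sketched as below. This is a minimal example, assuming each htseq-count run wrote the usual two-column, tab-separated output (feature ID, count); the function names and any file names are hypothetical. Summary rows such as `__no_feature`, written with a `__` prefix by recent htseq-count versions, are summed like any other row here, so they may need to be dropped before loading the table into edgeR or DESeq.

```python
import csv


def read_htseq_counts(path):
    """Read a two-column, tab-separated count file into {feature_id: count}."""
    counts = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            counts[row[0]] = int(row[1])
    return counts


def sum_counts(*tables):
    """Add counts feature by feature; a feature missing from a table counts as 0."""
    total = {}
    for table in tables:
        for feature, count in table.items():
            total[feature] = total.get(feature, 0) + count
    return total


def write_counts(counts, path):
    """Write the merged table back out in the same two-column layout."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for feature in sorted(counts):
            writer.writerow([feature, counts[feature]])
```

For one sample this would be called as `sum_counts(read_htseq_counts(single_path), read_htseq_counts(paired_path))`, where the two paths point at the single-read and paired-read outputs for that sample.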
Tags |
htseq, htseq-count, merged bam, pair-end single-end, sam |