Hi All:
I have a general question about testing differential exon usage using DEXSeq. Suppose I have a sample called SLR (from a cell line), and it is sequenced on 5 lanes, so I have BAM files like L1_SLR, L2_SLR, L3_SLR, L5_SLR and L7_SLR.bam. Here, the letter "L" denotes "lanes". In a typical RNA-Seq experiment, we should expect that reads from those 5 lanes have little variations.
Now, similarly, I have three technical replicates of the cell line SLR, let's call them TEC1, TEC2 and TEC3. For each technical replicate (say TEC1), it is also sequenced using 5 lanes. For example, I have L1_TEC1, L3_TEC1, L4_TEC1, L5_TEC1 and L7_TEC1. Similarly for TEC2 and TEC3.
Since TEC1, TEC2 and TEC3 are technical replicates of cell line SLR, ideally there should not be any differential exon usage detected. Our goal is to focus on these technical replicates and see if we'll obtain larger than expected amounts of differential exon usage. The tool we consider is DEXSeq. I looked at this manual here: http://bioconductor.org/packages/dev...te_objects.pdf. In Table 1, I think the author "concatenated" all the lanes (BAM files) into a single one for each replicate (treat1fb, etc), so we see "total" number of reads and exon counts. In my situation, can I know how to obtain
(1) the combined BAM file for SLR? Shall I cat the 5 lanes of BAM files? (after combination, I can get SAM file using samtools, and then use dexseq_count.py GFF_file \ SLR_sorted.sam SLR.txt)
(2) the GFF file is converted from GTF file using dexseq_prepare_annotation.py; can I know if the GTF file is just downloaded from Ensembl website (choose Homo Sapiens if my cell line is from human)?
(3) since my ultimate objective is to compare (maybe pairwise?) those technical replicates: SLR, TEC1, TEC2, TEC3, instead of the traditional situation of comparing biological sample 1 (with several tech reps) with biological sample 2 (with several tech reps). Would there be any problem? I can see that maybe comparing SLR vs. TEC1 is impossible (also for other pairwise comparisons) as there is no "replicates" of the replicate.
Thank you so much!
I have a general question about testing differential exon usage using DEXSeq. Suppose I have a sample called SLR (from a cell line), and it is sequenced on 5 lanes, so I have BAM files like L1_SLR, L2_SLR, L3_SLR, L5_SLR and L7_SLR.bam. Here, the letter "L" denotes "lanes". In a typical RNA-Seq experiment, we should expect that reads from those 5 lanes have little variations.
Now, similarly, I have three technical replicates of the cell line SLR, let's call them TEC1, TEC2 and TEC3. For each technical replicate (say TEC1), it is also sequenced using 5 lanes. For example, I have L1_TEC1, L3_TEC1, L4_TEC1, L5_TEC1 and L7_TEC1. Similarly for TEC2 and TEC3.
Since TEC1, TEC2 and TEC3 are technical replicates of cell line SLR, ideally there should not be any differential exon usage detected. Our goal is to focus on these technical replicates and see if we'll obtain larger than expected amounts of differential exon usage. The tool we consider is DEXSeq. I looked at this manual here: http://bioconductor.org/packages/dev...te_objects.pdf. In Table 1, I think the author "concatenated" all the lanes (BAM files) into a single one for each replicate (treat1fb, etc), so we see "total" number of reads and exon counts. In my situation, can I know how to obtain
(1) the combined BAM file for SLR? Shall I cat the 5 lanes of BAM files? (after combination, I can get SAM file using samtools, and then use dexseq_count.py GFF_file \ SLR_sorted.sam SLR.txt)
(2) the GFF file is converted from GTF file using dexseq_prepare_annotation.py; can I know if the GTF file is just downloaded from Ensembl website (choose Homo Sapiens if my cell line is from human)?
(3) since my ultimate objective is to compare (maybe pairwise?) those technical replicates: SLR, TEC1, TEC2, TEC3, instead of the traditional situation of comparing biological sample 1 (with several tech reps) with biological sample 2 (with several tech reps). Would there be any problem? I can see that maybe comparing SLR vs. TEC1 is impossible (also for other pairwise comparisons) as there is no "replicates" of the replicate.
Thank you so much!
Comment