
**dpryan** · 06-18-2013, 12:29 AM

Originally posted by alittleboy:

(1) the combined BAM file for SLR? Shall I cat the 5 lanes of BAM files? (after combination, I can get SAM file using samtools, and then use dexseq_count.py GFF_file \ SLR_sorted.sam SLR.txt)

"samtools merge", or the picard tools equivalent.

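For reference, a minimal sketch of the merge step driven from Python. The file names and the helper `build_merge_command` are hypothetical; the only thing taken from samtools itself is the usual `samtools merge <out.bam> <in1.bam> <in2.bam> ...` form, and I'm assuming the per-lane BAM files are already sorted the same way.

```python
import subprocess

def build_merge_command(output_bam, lane_bams):
    """Return the argv list for `samtools merge OUT IN1 IN2 ...`."""
    if len(lane_bams) < 2:
        raise ValueError("need at least two BAM files to merge")
    return ["samtools", "merge", output_bam, *lane_bams]

# Hypothetical file names for the five SLR lanes.
lanes = [f"SLR_lane{i}.bam" for i in range(1, 6)]
cmd = build_merge_command("SLR_merged.bam", lanes)
print(" ".join(cmd))

# Uncomment to actually run the merge (requires samtools on the PATH):
# subprocess.run(cmd, check=True)
```

Merging rather than plain `cat` keeps a single valid BAM header, which is why the merge tool is the safer choice here.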
Originally posted by alittleboy:

(2) the GFF file is converted from GTF file using dexseq_prepare_annotation.py; can I know if the GTF file is just downloaded from Ensembl website (choose Homo Sapiens if my cell line is from human)?

At least the version of that script that I have won't download things for you. Go ahead and get the human GTF annotation from Ensembl (don't get the one from UCSC, you'll thank me later).

Originally posted by alittleboy:

(3) since my ultimate objective is to compare (maybe pairwise?) those technical replicates: SLR, TEC1, TEC2, TEC3, instead of the traditional situation of comparing biological sample 1 (with several tech reps) with biological sample 2 (with several tech reps). Would there be any problem? I can see that maybe comparing SLR vs. TEC1 is impossible (also for other pairwise comparisons) as there is no "replicates" of the replicate.

The normal experiment is to compare group 1, with multiple biological replicates, to group 2, also with multiple biological replicates. That's why you see people concatenating their datasets when they have technical replicates: it gives them higher depth. Can you describe the biological question that you're trying to answer with this? That might give people better insight into how best to help you.

**alittleboy** · 06-19-2013, 11:42 AM

Originally posted by dpryan:

"samtools merge", or the picard tools equivalent.

Thanks for the suggestion!

At least the version of that script that I have won't download things for you. Go ahead and get the human GTF annotation from Ensembl (don't get the one from UCSC, you'll thank me later).

Yes, I downloaded the Ensembl GTF file (thanks for the reminder! I also heard that Ensembl is better to use), but here is another question I have: please see this post.

The normal experiment is to compare group 1, with multiple biological replicates, to group 2, also with multiple biological replicates. That's why you see people concatenating their datasets when they have technical replicates: it gives them higher depth. Can you describe the biological question that you're trying to answer with this? That might give people better insight into how best to help you.

The situation in my case is a bit different: we want to compare the technical replicates themselves, to see whether different methods give consistent results (ideally there shouldn't be any differential exon usage, since they're technical replicates). We have four technical replicates, each with 5 lanes, and we sum the counts over the 5 lanes for each replicate, so that we have a table like this:

| Exon_ID | CL_1 | TR_1 | TR_2 | TR_3 |
|---------|------|------|------|------|
| E001    | XX   | XX   | XX   | XX   |
| ...     | ...  | ...  | ...  | ...  |
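To make the lane-summing step concrete, here is a rough sketch of how the per-lane count files could be added up for one replicate. It assumes each count file has tab-separated `exon_id<TAB>count` lines and that lines starting with `_` are summary counters to be skipped; both are assumptions about the file layout, and the function names and exon IDs are made up for illustration.

```python
import io

def read_counts(handle):
    """Parse 'exon_id<TAB>count' lines into a dict of integer counts."""
    counts = {}
    for line in handle:
        line = line.strip()
        # Assumption: lines starting with '_' are summary counters, not exons.
        if not line or line.startswith("_"):
            continue
        exon_id, value = line.split("\t")
        counts[exon_id] = counts.get(exon_id, 0) + int(value)
    return counts

def sum_lanes(handles):
    """Add up the counts from several lanes of the same replicate."""
    total = {}
    for handle in handles:
        for exon_id, value in read_counts(handle).items():
            total[exon_id] = total.get(exon_id, 0) + value
    return total

# Two tiny in-memory "lanes" standing in for real count files.
lane1 = io.StringIO("ENSG01:001\t5\nENSG01:002\t0\n_ambiguous\t3\n")
lane2 = io.StringIO("ENSG01:001\t7\nENSG01:002\t2\n_ambiguous\t1\n")
total = sum_lanes([lane1, lane2])
print(total)
```

With real data, the `io.StringIO` objects would be replaced by opened files, one per lane.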

In DEXSeq, I don't think we can compare cell line 1 (CL_1) with technical replicate 1 (TR_1), as there is no "replicate". How about, say, treating CL_1 and TR_1 as one group and TR_2 and TR_3 as the other, and comparing the two groups? In that case, each group has two "replicates", which makes the DEXSeq estimation possible.
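One thing to keep in mind with this grouping: four samples can be split two-versus-two in three distinct ways, so CL_1 + TR_1 versus TR_2 + TR_3 is only one of three equivalent choices. A small sketch that enumerates them (the function name is hypothetical; the sample labels follow the table above):

```python
def two_vs_two_splits(samples):
    """List every way to split four samples into two groups of two."""
    samples = list(samples)
    if len(samples) != 4:
        raise ValueError("expected exactly four samples")
    first = samples[0]
    splits = []
    # Fixing the first sample in group A avoids counting mirrored splits twice.
    for partner in samples[1:]:
        group_a = (first, partner)
        group_b = tuple(s for s in samples if s not in group_a)
        splits.append((group_a, group_b))
    return splits

splits = two_vs_two_splits(["CL_1", "TR_1", "TR_2", "TR_3"])
for group_a, group_b in splits:
    print(group_a, "vs", group_b)
```

If the replicates really are interchangeable, none of the three splits should show much differential exon usage.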

Thanks!

**dpryan** · 06-19-2013, 03:11 PM

Regarding your question in the other thread/on biostars: while you don't have to use the Ensembl GTF file, it really is the path of least resistance. I've previously used an annotation with Entrez IDs and just wrote a couple of scripts to pacify dexseq_prepare_annotation.py. In effect, this made the annotation resemble the one from Ensembl. Out of curiosity, I've poked around the DEXSeq code a bit. read.HTSeqCounts will work without a GFF file, though I assume that at least the plotting functions won't work then. Unless you have a great desire to go through the DEXSeq code to see what other uses it makes of the annotation file, you're probably best off just contacting one of its authors (maybe by PM if neither of them happens to see the threads you started).

If you think of your technical replicates as a single group, do they not show approximately Poisson variance? You're correct that DEXSeq isn't intended to compare individual samples (I've seen Simon Anders reply to that idea on this forum more times than I can count... the guy has the patience of a saint!). You could do as you suggested and just divide the 4 replicates into 2 groups. Of course, that's not really that informative for those of us doing normal experiments, since the variance between our biological replicates will be higher. At the end of the day, I wonder if you're just testing how well the various programs estimate Poisson noise with a small number of replicates (which is more of an argument that we should use more replicates than anything else).
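A quick way to eyeball the Poisson question is the variance-to-mean ratio per exon, which should sit around 1 for Poisson counts. The sketch below is only that, a sketch: the counts are invented, the function name is hypothetical, and it assumes the four replicates were sequenced to roughly the same depth (otherwise the counts would need normalizing first).

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by mean; about 1 for Poisson-like counts."""
    m = mean(counts)
    if m == 0:
        return None  # ratio is undefined when every count is zero
    return variance(counts) / m

# Invented counts per exon across CL_1, TR_1, TR_2, TR_3.
table = {
    "E001": [10, 12, 8, 10],
    "E002": [100, 140, 60, 100],
    "E003": [0, 0, 0, 0],
}

for exon_id, counts in table.items():
    print(exon_id, dispersion_index(counts))
```

With only four values per exon the individual ratios are very noisy, so it is the overall distribution across many exons that is worth looking at, not any single exon.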


Using DEXSeq to compare differential exon usage from different technical replicates
