**RockChalkJayhawk** · 08-19-2010, 06:52 AM

Originally posted by internet_nobody

Hi everyone,
The EMBL track consists of automated annotations, and I didn't use them to guide tophat because we "know" a lot of them are wrong, but 01890 and 01880 are definitely separate genes, based on experimental results.
Thanks.

Starting with the most basic questions, 1) are you looking at the same genome assembly to which it was aligned?

2) What parameters were used to generate these results?

3) Is it possible that this is an operon, where run-on transcription produces a single transcript that encodes multiple proteins?

**internet_nobody** · 08-19-2010, 08:34 AM

Thanks for replying.

1) Yes it's the same assembly.

2) I now realise these weren't the best. I interpreted "ends" as "adapters" for the -r parameter, and from reading other topics I see that it meant "read length": tophat -r 140 --mate-std-dev 50 -i 50 s_3_1_sequences.txt s_3_2_sequences.txt
Any parameters that Cufflinks shares I set to the same values; the rest I left at their defaults.

3) That's something I hadn't thought of, but it doesn't fit with microarray results showing different expression profiles for the mRNAs.

**RockChalkJayhawk** · 08-19-2010, 08:43 AM

How did you come up with -r 140?
I would have come up with 240 (size selected) - 150 (2*75bp reads) - ~100 (primer length) = -10
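
The arithmetic above can be written as a small sketch. The function name and arguments are made up for illustration, and it assumes the size-selected fragment includes roughly 100 bp of adapter/primer sequence in total, as in the numbers above:

```python
def mate_inner_distance(selected_size, read_length, adapter_total):
    """Estimate the value for tophat's -r option (hypothetical helper).

    selected_size: size-selected fragment length in bp, adapters included
    read_length:   length of each read in the pair, in bp
    adapter_total: combined adapter/primer length in bp (assumed value)
    """
    # Remove the adapters to get the actual insert, then remove both reads
    # to get the gap between them. A negative result means the mates overlap.
    insert_size = selected_size - adapter_total
    return insert_size - 2 * read_length


# Numbers from this thread: 240 bp band, 2 x 75 bp reads, ~100 bp of adapters
print(mate_inner_distance(240, 75, 100))  # -10
```

Leaving out the two read lengths (240 - 100) is what gives 140.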

**internet_nobody** · 08-19-2010, 09:03 AM

Yes, I know that now. When I asked someone else how they interpreted it, they calculated 240 - 100 (2 x 50bp adapters) = 140. (I had assumed 0, as I couldn't find anything about using a negative number, but their argument that it couldn't be 0 beat my conviction that it should be.) It was only when I read the forum that I realised the read length should be included. I'm waiting on a re-run using -30: I looked at a few of the read pairs by eye and they had ~30bp overlap, so perhaps the person who prepared the library cut a higher weight band than expected. The run takes around 12 hours, so I'll know soon enough. I wasn't sure whether that mistake would have been big enough to have had that much of an effect on the results.

I'm hoping that also explains regions where many reads have aligned, but tophat/cufflinks don't pick anything up there...

Tophat/Cufflinks newbie - question about transcript assembly