SEQanswers

09-18-2012, 02:24 AM   #1
minoru_harvest
Junior Member
 
Location: China

Join Date: Aug 2012
Posts: 5
Assemble RNAseq data with Oases but so many transcripts in locus_1

Hello everyone

I'm doing transcriptome assembly work with Oases.
The Illumina reads are 100 nt long and their quality is pretty good. I filtered out bad reads, trimmed low-quality bases, and trimmed adapters. Then I ran the Oases pipeline with this command:
Code:
nohup oases_pipeline.py -m 17 -M 71 -s 2 -g 27 -o Hg0912 -d " -short Hg-trim.fasta " &
The final result in transcripts.fa does not look good. The first locus, "Locus_1", contains more than 600,000 transcripts (656,445 according to the FASTA headers).
Quote:
>Locus_1_Transcript_1/656445_Confidence_0.000_Length_257
TTATTTTCTTCCTGTTGTTTTCAGTACGAGCCAGTTGAGATGCGCGTGAGTTTATAAACA
AAACCTGTGTCCCCGATTGGCCAGTAAGTAGCCGGCAACCGACACGGACGTTGTACTTGT
ATTGAGCAAAGTTTATTCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACA
TCCCGACGAATTGAGAACCTCTCCTTTTTGGGATAAAAAAAAAAAAAAAAAAAAAGTTGT
ATTTTGTGTTTCAAAGT
>Locus_1_Transcript_2/656445_Confidence_0.000_Length_505
CATTCCTTGTATTCAAAAAAAAAAAAAAAAAAAAAAAAATGAAACACGTCAACAAAAAAA
AAAAAAGGAACCCTTATTCTTAGAGAATTAAGACTTTTGCAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAACATCCCGACGAATTGAGAACCTCTCCTTTTTGGGATAAAAAAAAAAA
AAAAAAAAAAGCATGAAAAAAAAAAAATGCAGAAATCCCACTTTACTTTTAGATAAATAT
TGCAAATTTGCGATACATAACATACATTAATTACATATAGGTAACTGTTTATTTTAAGGC
AAATTCTTAGAAAAAACTAAGAAGTCCTGGATCAACTAAAAAATACAGCTCTCGAACGTC
GCTCTTACAATTTTAAAACCAAGTTCCTTGAGTTAAAAATTGGAAAAAGTCGCGCTCGCT
CCGCTCGCGATTTAGAAGCGATGTGCTTGTTTTTGCATTCGCCGGCCAACCAACAAAAAA
TTATGGACGTTTGAGCTACACTTAT
>Locus_1_Transcript_3/656445_Confidence_0.000_Length_407
TCAAAAAAAAAAAAAAAAAAAAAAAAATGAAACACGTCAACAAAAAAAAAAAAAGGAACC
CTTATTCTTAGAGAATTAAGACTTTTGCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC
ATCCCGACGAATTGAGAACCTCTCCTTTTTGGGATAAAAAAAAAAAAAAAAAAAAAGCAC
GTCAACAAAAAAAAAAAAAGGAACCACATTGAATTTTGTGTTACGTCACTACTTTTAGGC
......
I picked out a long sequence and ran blastx on it, but found nothing similar. With blastn, however, the sequence matches genomic sequences from my species (my species does not have a completely sequenced genome; the matched subjects are things like BAC library sequences).

Is this normal?
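
To check how transcripts are distributed across loci, the headers in transcripts.fa can be tallied. The following is a minimal sketch, assuming every header follows the layout seen in the quoted output (`Locus_<n>_Transcript_<i>/<total>_Confidence_<c>_Length_<len>`). The function names `parse_header` and `transcripts_per_locus` are made up for this illustration and are not part of Oases.

```python
import re
from collections import Counter

# Header layout assumed from the quoted output above:
# >Locus_<n>_Transcript_<i>/<total>_Confidence_<c>_Length_<len>
HEADER_RE = re.compile(
    r"^>Locus_(\d+)_Transcript_(\d+)/(\d+)_Confidence_([\d.]+)_Length_(\d+)"
)


def parse_header(line):
    """Split one FASTA header into its fields, or return None if it does not match."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    locus, index, total, confidence, length = match.groups()
    return {
        "locus": int(locus),
        "index": int(index),
        "total": int(total),  # transcript count the assembler reports for this locus
        "confidence": float(confidence),
        "length": int(length),
    }


def transcripts_per_locus(lines):
    """Count how many transcript records appear under each locus number."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            record = parse_header(line)
            if record is not None:
                counts[record["locus"]] += 1
    return counts


# Headers copied from the quoted output; sequence lines are skipped automatically.
example = [
    ">Locus_1_Transcript_1/656445_Confidence_0.000_Length_257",
    "TTATTTTCTTCCTGTTGTTTTCAGTACGAGCCAGTTGAGATGCGCGTGAGTTTATAAACA",
    ">Locus_1_Transcript_2/656445_Confidence_0.000_Length_505",
    ">Locus_1_Transcript_3/656445_Confidence_0.000_Length_407",
]
counts = transcripts_per_locus(example)
for locus, n in counts.most_common(5):
    print(f"Locus_{locus}: {n} transcripts")
```

For a real file, passing an open file handle (for example `open("transcripts.fa")`) in place of `example` should work, since the function only iterates over lines.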
04-19-2013, 01:03 PM   #2
mariruilo
Junior Member
 
Location: Oregon

Join Date: Dec 2012
Posts: 7

Hi Minoru,

I'm having the same problem. Did you find an answer for it? Thanks.
04-19-2013, 02:46 PM   #3
Cofactor Genomics
Registered Vendor
 
Location: St. Louis

Join Date: Jan 2010
Posts: 52

Hi,
Although our team has not seen this many transcripts under a single locus, what we have tended to do is treat the loci with a blastx hit as tier-one transcripts (i.e., give them a higher ranking/priority than the other transcripts).
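
A tiering step like this could be sketched as follows. This is only an illustration under stated assumptions: the blastx results are in BLAST's 12-column tabular format (`-outfmt 6`), the `max_evalue` cutoff of 1e-5 is an arbitrary placeholder, and the function names are hypothetical. It is not the workflow actually used by the poster's team, and it tiers individual transcript IDs rather than whole loci.

```python
import csv


def queries_with_hits(blast_rows, max_evalue=1e-5):
    """Return the set of query IDs that have at least one hit at or below max_evalue.

    Each row is expected to hold the 12 standard tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for row in blast_rows:
        if len(row) < 12:
            continue  # skip blank or malformed lines
        if float(row[10]) <= max_evalue:  # column 11 is the e-value
            hits.add(row[0])  # column 1 is the query ID
    return hits


def assign_tiers(transcript_ids, hit_ids):
    """Tier 1 for transcripts with a blastx hit, tier 2 for everything else."""
    return {tid: (1 if tid in hit_ids else 2) for tid in transcript_ids}


def read_blast_table(path):
    """Read a tab-separated BLAST table into a list of rows."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))


# Tiny in-memory example; the subject ID "example_protein" is invented for the demo.
rows = [
    ["Locus_2_Transcript_1/3_Confidence_0.500_Length_900", "example_protein",
     "85.0", "200", "30", "0", "1", "600", "1", "200", "1e-50", "250"],
    ["Locus_1_Transcript_2/656445_Confidence_0.000_Length_505", "example_protein",
     "30.0", "40", "28", "0", "1", "120", "5", "44", "0.5", "25"],
]
all_ids = [row[0] for row in rows] + [
    "Locus_1_Transcript_1/656445_Confidence_0.000_Length_257"
]
tiers = assign_tiers(all_ids, queries_with_hits(rows))
for tid, tier in tiers.items():
    print(tier, tid)
```

Sorting the transcripts by tier afterwards keeps the unannotated ones available while letting downstream analysis start from the sequences with protein-level support.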

On another note, not necessarily related to the scenario of more than 600,000 transcripts under a single locus, but focused on the results coming out of a transcriptome assembly: we found that the molecular biology protocols and depletions have a large impact on the resulting assembly. In the early days, before we had the protocols nailed down, a very large percentage of our loci did not have blastx hits. As our molecular biology (i.e., not our bioinformatics) became more refined, we saw a much higher percentage of loci matching genes.

I hope this is helpful.

Jarret Glasscock
Cofactor Genomics
http://www.cofactorgenomics.com
04-19-2013, 03:45 PM   #4
mariruilo
Junior Member
 
Location: Oregon

Join Date: Dec 2012
Posts: 7

Thank you very much for the answer, it is very helpful.

Mari

Tags
locus, oases, problem
