SEQanswers

Old 09-22-2014, 06:07 PM   #1
Bang_Didi
Junior Member
 
Location: Townsville, Australia

Join Date: Sep 2014
Posts: 5
Trinity Assembly

Dear all,
As a newbie in transcriptome analysis, I would like to ask a question about whole-transcriptome assembly with Trinity.

Is it possible to get two different sets of transcripts depending on whether the reads are concatenated into one file or listed separately (comma-separated, as the Trinity manual describes)? I am not sure about my results: the two ways of preparing my reads for Trinity seem to give different transcripts (I can tell from their sizes, which differ). Any thoughts on what went wrong?
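
Before comparing the two assemblies, it may be worth confirming that both preparations actually contain the same reads. A minimal sketch, assuming uncompressed or gzipped FASTQ input with exactly four lines per read; the function names and the file names in the usage note are hypothetical and not part of Trinity:

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming 4 lines per read."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def same_read_total(concatenated, separate_files):
    """True if the concatenated file holds as many reads as the separate files combined."""
    separate_total = sum(count_fastq_reads(p) for p in separate_files)
    return count_fastq_reads(concatenated) == separate_total
```

For example, `same_read_total("all_left.fq", ["s1_left.fq", "s2_left.fq"])` should return `True` if the concatenation lost or duplicated nothing. Equal totals do not show that left and right mates were concatenated in the same sample order, so that is worth checking separately.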

Cheers
Didi
Old 09-23-2014, 06:42 AM   #2
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Trinity is non-deterministic, so some variation between runs is expected. Not a lot, but some.
Old 09-23-2014, 04:18 PM   #3
Bang_Didi
Junior Member
 
Location: Townsville, Australia

Join Date: Sep 2014
Posts: 5

Thanks for that, westerman. Should I worry that this variation will also show up significantly when I compute the metrics for evaluating the transcripts?
Old 09-23-2014, 05:00 PM   #4
Bang_Didi
Junior Member
 
Location: Townsville, Australia

Join Date: Sep 2014
Posts: 5

FYI:

The Trinity stats that I got for the assembly built from the concatenated data:
################################
## Counts of transcripts, etc.
################################
Total trinity 'genes': 236322
Total trinity transcripts: 518647
Percent GC: 45.98

########################################
Stats based on ALL transcript contigs:
########################################

Contig N10: 8296
Contig N20: 6856
Contig N30: 5744
Contig N40: 4826
Contig N50: 4031

Median contig length: 1217
Average contig: 2100.35
Total assembled bases: 1,089,337,664


#####################################################
## Stats based on ONLY LONGEST ISOFORM per 'GENE':
#####################################################

Contig N10: 6119
Contig N20: 4351
Contig N30: 3248
Contig N40: 2367
Contig N50: 1635

Median contig length: 367
Average contig: 799.05
Total assembled bases: 188,834,004

The Trinity stats that I got for the assembly built by listing all of the read files, comma-separated:

################################
## Counts of transcripts, etc.
################################
Total trinity 'genes': 244,160
Total trinity transcripts: 301,140
Percent GC: 44.75

########################################
Stats based on ALL transcript contigs:
########################################

Contig N10: 6864
Contig N20: 5185
Contig N30: 4130
Contig N40: 3303
Contig N50: 2581

Median contig length: 448
Average contig: 1115.03
Total assembled bases: 335,781,132


#####################################################
## Stats based on ONLY LONGEST ISOFORM per 'GENE':
#####################################################

Contig N10: 5852
Contig N20: 4184
Contig N30: 3122
Contig N40: 2305
Contig N50: 1623

Median contig length: 374
Average contig: 806.19
Total assembled bases: 196,840,230
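
For readers unfamiliar with the N10 to N50 figures above: under the usual definition, contig Nx is the length of the shortest contig in the set of longest contigs that together cover at least x% of the assembled bases. A minimal sketch of that definition; the function name is hypothetical, and TrinityStats.pl may handle ties or rounding differently:

```python
def contig_nx(lengths, fraction=0.5):
    """Return the Nx contig length (N50 when fraction=0.5)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until the requested
    # share of the total assembled bases has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= total * fraction:
            return length
    raise ValueError("no contig lengths given")
```

With contig lengths 2 through 10 (54 bases in total), the three longest contigs (10, 9, 8) already cover 27 bases, so `contig_nx` returns 8 for the N50. A few very long contigs, or many redundant long isoforms, can therefore raise N50 without the gene-level picture changing much.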
Old 09-25-2014, 05:44 AM   #5
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Those variations are more than I would expect, and I can see why you are concerned. I'll see if I can rerun a recent Trinity assembly (I almost always use comma-separated files) with combined reads and see what differences I get.
Old 11-01-2014, 02:37 PM   #6
ltutar
Junior Member
 
Location: turkey

Join Date: Dec 2013
Posts: 1

Dear Bang_Didi,

Did you decide which way is best: comma separation or combining?



Old 01-11-2015, 07:45 PM   #7
Nanu
Member
 
Location: New Delhi

Join Date: Sep 2014
Posts: 30

Greetings to all!

I would like to know about the reads/k-mers per transcript. TrinityStats.pl reports the total assembled bases, the contig lengths, and the number of transcripts, with separate figures for the longest isoforms. I would also like to know the difference between Trinity.fasta and single.fasta.
When we run TrinityStats.pl, we get:
1. Stats based on ONLY LONGEST ISOFORM per 'GENE'
2. Stats based on ALL transcript contigs

Does Trinity.fasta contain only the transcripts, or does it contain the genes as well?
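
On the last point: as far as I know, Trinity.fasta holds every assembled transcript (isoform), and the 'gene' is only a grouping encoded in each sequence ID, which is how the two stats sections can be derived from one file. A minimal sketch of that grouping, assuming IDs shaped like `comp0_c0_seq1` (older releases) or `TRINITY_DN0_c0_g1_i1` (newer releases); the function names are hypothetical and this is not TrinityStats.pl itself:

```python
import re

# Assumption: the isoform number is the trailing "_seqN" or "_iN" part of the ID.
ISOFORM_SUFFIX = re.compile(r"_(?:seq|i)\d+$")


def gene_id(transcript_id):
    """Strip the isoform suffix to get the 'gene' identifier."""
    return ISOFORM_SUFFIX.sub("", transcript_id)


def transcript_and_gene_lengths(fasta_lines):
    """Return ({transcript: length}, {gene: longest isoform length})."""
    transcripts = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token of the header.
            current = line[1:].split()[0]
            transcripts[current] = 0
        elif current is not None:
            transcripts[current] += len(line)
    genes = {}
    for tid, length in transcripts.items():
        gid = gene_id(tid)
        genes[gid] = max(genes.get(gid, 0), length)
    return transcripts, genes
```

Passing an open Trinity.fasta handle gives `len(transcripts)` as the transcript count and `len(genes)` as the 'gene' count; the values of `genes` are the lengths behind the longest-isoform section.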