#1 |
Member
Location: València, Spain Join Date: Apr 2009
Posts: 48
Hi all!
Three years ago we did a 454 run on transcriptome data and, using Newbler release 1.1.03.24, we computed some statistics such as the mean number of assembled reads per contig, the number of contigs with only 2 reads, and so on, since we assumed that a read could be assigned to only a single contig.

Today we used release 2.6 of the software, and the number of reads we get from 454Allcontigs.fna (the "numreads=" field) is larger than the total number of assembled reads. That's because a read can be assigned to multiple contigs, isn't it (as the contigs are "exons")? If so, what kind of statistics do you advise in order to compare both sets of data?

From newblerMetrics I got that 83.48% of reads were assembled, but I want to get such a value from the assembled contigs file, as I've seen something like:

>contig00030 length=1 numreads=48 gene=isogroup00001 status=ig_thresh
t
>contig00031 length=6 numreads=4495 gene=isogroup00001 status=ig_thresh
CACTTC
>contig00032 length=3 numreads=61 gene=isogroup00001 status=ig_thresh
GgA
>contig00033 length=3 numreads=345 gene=isogroup00001 status=ig_thresh
gtA
>contig00034 length=2 numreads=2030 gene=isogroup00001 status=ig_thresh
TA
>contig00035 length=1 numreads=1914 gene=isogroup00001 status=ig_thresh
A

I hope I was clear enough! Thanks in advance.
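For reference, here is a minimal Python sketch of how the "numreads=" values could be pulled out of the contig headers and summed. It assumes the header layout is exactly the one shown in the post (`>name length=N numreads=N ...`); the function name `contig_read_counts` is made up for illustration. Note that the sum counts a read once per contig it contributes to, so it is not expected to match the number of distinct assembled reads.

```python
import re
from statistics import mean

# Assumed header layout, taken from the records quoted in the post.
HEADER_RE = re.compile(r"^>(\S+)\s+length=(\d+)\s+numreads=(\d+)")

def contig_read_counts(lines):
    """Return {contig_name: numreads} for every FASTA header line."""
    counts = {}
    for line in lines:
        match = HEADER_RE.match(line)
        if match:  # sequence lines do not match and are skipped
            counts[match.group(1)] = int(match.group(3))
    return counts

# The six records quoted in the post.
example = """\
>contig00030 length=1 numreads=48 gene=isogroup00001 status=ig_thresh
t
>contig00031 length=6 numreads=4495 gene=isogroup00001 status=ig_thresh
CACTTC
>contig00032 length=3 numreads=61 gene=isogroup00001 status=ig_thresh
GgA
>contig00033 length=3 numreads=345 gene=isogroup00001 status=ig_thresh
gtA
>contig00034 length=2 numreads=2030 gene=isogroup00001 status=ig_thresh
TA
>contig00035 length=1 numreads=1914 gene=isogroup00001 status=ig_thresh
A
""".splitlines()

counts = contig_read_counts(example)
total_numreads = sum(counts.values())        # 8893 for these six contigs
mean_numreads = mean(counts.values())        # mean numreads per contig
two_read_contigs = [name for name, n in counts.items() if n == 2]
```

For a real file, the same function can be fed an open file handle, e.g. `contig_read_counts(open("454Allcontigs.fna"))`.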
#2 |
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
Parsing the 454ReadStatus.txt file may be the best solution.
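A rough sketch of that parsing in Python follows. It assumes 454ReadStatus.txt is tab-delimited with one header row, the read accession in the first column and the read status in the second; check this against your own file before relying on it. The sample rows and the function names are invented for illustration and are not real data.

```python
import csv
import io
from collections import Counter

def read_status_counts(handle):
    """Count reads per status (assumed: tab-delimited, header row, status in column 2)."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header row
    return Counter(row[1] for row in reader if len(row) >= 2)

def percent_with_status(status_counts, labels=("Assembled",)):
    """Percentage of all reads whose status is one of `labels`."""
    total = sum(status_counts.values())
    if total == 0:
        return 0.0
    selected = sum(status_counts[label] for label in labels)
    return 100.0 * selected / total

# Invented sample standing in for 454ReadStatus.txt (not real data).
sample = io.StringIO(
    "Accno\tRead Status\t5' Contig\t5' Position\t5' Strand\t3' Contig\t3' Position\t3' Strand\n"
    "read1\tAssembled\tcontig00031\t1\t+\tcontig00031\t6\t+\n"
    "read2\tAssembled\tcontig00031\t2\t+\tcontig00034\t2\t+\n"
    "read3\tPartiallyAssembled\tcontig00033\t1\t+\tcontig00033\t3\t+\n"
    "read4\tRepeat\n"
    "read5\tSingleton\n"
)

status_counts = read_status_counts(sample)
pct_assembled = percent_with_status(status_counts)
```

Each read appears once in this file, so the counts refer to distinct reads, which makes them comparable with the older release. Passing a wider `labels` tuple (for example adding the partial and repeat statuses) changes which reads count towards the percentage.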
#3 |
Member
Location: València, Spain Join Date: Apr 2009
Posts: 48
Thank you westerman. You are right, that seems to be the right place to find the solution, but which kinds of reads count as assembled? I mean: in the 454NewblerMetrics file you have assembled reads, partial reads, singletons, repeat reads, outliers and tooshort (the last of these appears only in the latest software release, I think). I've read the flxlex blog (http://contig.wordpress.com/2010/03/...rics-txt-file/) and it seems that Assembled plus Partial and Repeat should give the number of aligned reads...
However, what does "numreads=" mean in the 454Allcontigs.fna file?
#4 |
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
I am going to guess here because I am deep into looking at other problems at the moment (although it should be easy to find out with a bit of digging) that the 'numreads=' is the count of all reads that contribute to the contig, no matter if that read maps uniquely to that contig or not.
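One way to test this guess, sketched in Python: tally, per read, the set of contigs it touches and compare the summed per-contig contributions with the number of distinct reads. This assumes that 454ReadStatus.txt reports a 5' contig and a 3' contig for each assembled read; a read spanning more than two contigs would be undercounted here. The rows below are invented for illustration, and whether Newbler computes "numreads=" this way is not confirmed.

```python
def contigs_per_read(rows):
    """Map each read to the set of contigs it touches.

    `rows` is an iterable of (accno, status, contig_5prime, contig_3prime);
    empty strings mean the read was not placed in a contig.
    """
    touched = {}
    for accno, _status, contig5, contig3 in rows:
        touched[accno] = {c for c in (contig5, contig3) if c}
    return touched

# Invented rows (not real data).
rows = [
    ("read1", "Assembled", "contig00031", "contig00031"),
    ("read2", "Assembled", "contig00031", "contig00034"),
    ("read3", "Assembled", "contig00034", "contig00035"),
    ("read4", "Singleton", "", ""),
]

touched = contigs_per_read(rows)
# Counting a read once per contig it touches (what a numreads sum would do).
summed_contributions = sum(len(contigs) for contigs in touched.values())
# Counting each placed read once.
distinct_placed_reads = sum(1 for contigs in touched.values() if contigs)
multi_contig_reads = sorted(r for r, contigs in touched.items() if len(contigs) > 1)
```

If the summed contributions exceed the distinct placed reads in a real data set, that is consistent with reads being counted in more than one contig.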
#5 | |
Moderator
Location: Oslo, Norway Join Date: Nov 2008
Posts: 415
|
![]() Quote:
|
|