#1 |
Senior Member
Location: UK Join Date: Jul 2013
Posts: 131
|
Could I get a script that can generate some basic statistics from the Velvet output file (contigs.fa)?
|
#2 |
Senior Member
Location: Stuttgart, Germany Join Date: Apr 2010
Posts: 192
|
Please be more specific about what you need and what you mean by "statistics from the Velvet output". Do you mean N50, number of contigs, longest, shortest, ...?
|
#3 |
Senior Member
Location: UK Join Date: Jul 2013
Posts: 131
|
Thanks. To be more specific, I would like to see the following statistics for the Velvet output FASTA file:
- Statistics for contig lengths:
  Min contig length
  Max contig length
  Mean contig length
  Standard deviation of contig length
  Median contig length
  N50 contig length
- Statistics for no. of contigs:
  No. of contigs
  No. of contigs >=1kb
  No. of contigs in N50
- Statistics for bases in the contigs:
  No. of bases in all contigs
  No. of bases in contigs >=1kb
  GC content of contigs
- Simple dinucleotide repeats:
  No. of contigs with over 70% dinucleotide repeats
  AT
  CG
  AC
  TG
  AG
  TC
- Simple mononucleotide repeats:
  No. of contigs with over 50% dinucleotide repeats
  AA
  TT
  CC
  GG
|
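For the length, count, and base statistics in the list above, a minimal Python sketch could look like the following. It is not part of Velvet or any tool mentioned in this thread. The function names (`read_fasta`, `n50`, `contig_stats`) are made up for illustration, the standard deviation is the population form (swap in `statistics.stdev` for the sample form), and N50 is taken as the length of the shortest contig among the longest contigs that together cover at least half of all bases.

```python
import statistics


def read_fasta(lines):
    """Yield one upper-cased sequence string per FASTA record."""
    chunks = []
    started = False
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if started:
                yield "".join(chunks).upper()
            chunks = []
            started = True
        elif line and started:
            chunks.append(line)
    if started:
        yield "".join(chunks).upper()


def n50(lengths):
    """Return (N50 length, number of contigs needed to reach half the bases)."""
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 2 >= total:
            return length, count
    return 0, 0


def contig_stats(seqs, min_len=1000):
    """Summarise a non-empty list of contig sequences."""
    lengths = [len(s) for s in seqs]
    long_lengths = [n for n in lengths if n >= min_len]
    n50_len, n50_count = n50(lengths)
    total = sum(lengths)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": statistics.mean(lengths),
        "stdev": statistics.pstdev(lengths),  # population standard deviation
        "median": statistics.median(lengths),
        "n50": n50_len,
        "contigs": len(lengths),
        "contigs_ge_min_len": len(long_lengths),
        "contigs_in_n50": n50_count,
        "bases": total,
        "bases_ge_min_len": sum(long_lengths),
        "gc_percent": 100.0 * gc / total,
    }


# Example use on a file (path is whatever your assembly produced):
#     with open("contigs.fa") as handle:
#         print(contig_stats(list(read_fasta(handle))))
```

An empty input raises a `ValueError` from `min`, which seemed preferable to silently reporting zeros for a failed assembly.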
#4 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
|
#5 |
Senior Member
Location: Stuttgart, Germany Join Date: Apr 2010
Posts: 192
|
I don't know which of your interests will be covered, but have a look here: velvet std summary. I find the way they do it quite neat. However, I think you will have to script something yourself to get all the desired information.
|
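The repeat counts requested earlier in the thread are the part most likely to need a hand-written script. Here is a rough Python sketch under one possible reading of "over 70% dinucleotide repeats": the fraction of a contig covered by tandem runs of a given unit (for example ATATAT...), where a run needs at least two consecutive copies. That definition, the `min_copies` parameter, and the function names are assumptions for illustration, not something the original poster or any tool here specifies.

```python
import re


def repeat_fraction(seq, unit, min_copies=2):
    """Fraction of seq covered by tandem runs of unit (>= min_copies copies)."""
    if not seq:
        return 0.0
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    covered = sum(len(m.group(0)) for m in pattern.finditer(seq))
    return covered / len(seq)


def count_repeat_rich(seqs, units, threshold):
    """Map each unit to the number of sequences whose fraction exceeds threshold."""
    return {
        unit: sum(1 for s in seqs if repeat_fraction(s.upper(), unit) > threshold)
        for unit in units
    }


# Example: dinucleotide units at 70%, homopolymer pairs at 50%.
#     count_repeat_rich(contigs, ["AT", "CG", "AC", "TG", "AG", "TC"], 0.70)
#     count_repeat_rich(contigs, ["AA", "TT", "CC", "GG"], 0.50)
```

Because units such as AA are matched in whole copies, a homopolymer run of odd length is undercounted by one base; for a threshold-based count that is usually close enough, but matching the single base instead would be more exact.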
#6 |
Senior Member
Location: Denmark Join Date: Apr 2009
Posts: 153
|
Using Biopieces (www.biopieces.org):
Assembled contigs can be analyzed to get some stats using analyze_assembly. We include a filtering step to discard contigs shorter than 200 bases:
Code:
read_fasta -i contigs.fna | grab -e "SEQ_LEN>=200" | analyze_assembly -x
Code:
N50: 9082
MAX: 52038
MIN: 200
MEAN: 4170
TOTAL: 3057214
COUNT: 733
---
https://code.google.com/p/biopieces/...aning_NGS_data
|