SEQanswers

01-12-2015, 02:16 AM   #1
standonn
Member
 
Location: UK

Join Date: Nov 2014
Posts: 14
Weird stats output of SOAP de Novo

Dear all,

I have run SOAP de Novo to assemble a nematode genome.
SOAP de Novo produced a stats file, .scafStatistics, which I do not understand. I am especially confused about the N50 values. Why are there two of them?

Here is the output:

<-- Information for assembly Scaffold 'SoapOutput-SB372.scafSeq'.(cut_off_length < 100bp) -->

Size_includeN 76698637
Size_withoutN 65796073
Scaffold_Num 15259
Mean_Size 5026
Median_Size 160
Longest_Seq 1079942
Shortest_Seq 100
Singleton_Num 11712
Average_length_of_break(N)_in_scaffold 714

Known_genome_size NaN
Total_scaffold_length_as_percentage_of_known_genome_size NaN

scaffolds>100 15047 98.61%
scaffolds>500 4059 26.60%
scaffolds>1K 3004 19.69%
scaffolds>10K 688 4.51%
scaffolds>100K 216 1.42%
scaffolds>1M 1 0.01%

Nucleotide_A 18790176 24.50%
Nucleotide_C 14159122 18.46%
Nucleotide_G 14204857 18.52%
Nucleotide_T 18641918 24.31%
GapContent_N 10902564 14.21%
Non_ACGTN 0 0.00%
GC_Content 43.11% (G+C)/(A+C+G+T)

N10 488263 12
N20 319382 32
N30 235181 60
N40 182496 96
N50 138908 144
N60 102282 210
N70 73846 298
N80 43901 429
N90 5795 899

NG50 NaN NaN
N50_scaffold-NG50_scaffold_length_difference NaN

<-- Information for assembly Contig 'SoapOutput-SB372.contig'.(cut_off_length < 100bp) -->

Size_includeN 66764916
Size_withoutN 66764916
Contig_Num 69780
Mean_Size 956
Median_Size 458
Longest_Seq 33978
Shortest_Seq 100

Contig>100 69392 99.44%
Contig>500 33098 47.43%
Contig>1K 20004 28.67%
Contig>10K 138 0.20%
Contig>100K 0 0.00%
Contig>1M 0 0.00%

Nucleotide_A 19146203 28.68%
Nucleotide_C 14420728 21.60%
Nucleotide_G 14387230 21.55%
Nucleotide_T 18810755 28.17%
GapContent_N 0 0.00%
Non_ACGTN 0 0.00%
GC_Content 43.15% (G+C)/(A+C+G+T)

N10 6141 779
N20 4338 2094
N30 3326 3858
N40 2586 6144
N50 2011 9076
N60 1536 12880
N70 1122 17959
N80 755 25179
N90 410 37034

NG50 NaN NaN
N50_contig-NG50_contig_length_difference NaN

Number_of_contigs_in_scaffolds 58068
Number_of_contigs_not_in_scaffolds(Singleton) 11712
Average_number_of_contigs_per_scaffold 16.4

I have looked all over for the answer but didn't manage to find it.

All the best,
Sophie
06-10-2018, 07:13 PM   #2
al-ash
Junior Member
 
Location: Prague, Czech Republic

Join Date: Dec 2016
Posts: 1

The first one is the scaffold N50, the second is the contig N50. Look for the header line
<-- Information for assembly Scaffold 'SoapOutput-SB372.scafSeq'.(cut_off_length < 100bp) -->
or
<-- Information for assembly Contig 'SoapOutput-SB372.contig'.(cut_off_length < 100bp) -->
above each block to see which type of sequence the statistics refer to.
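In case it helps to see why the same assembly gives two different N50 values, here is a minimal Python sketch of the usual N50 definition, applied once to scaffold lengths and once to contig lengths. This is not SOAPdenovo's own code. The function name `nx_stats` and the toy length lists are made up for illustration, and reading the second column of the N10..N90 rows as "number of sequences needed to reach that fraction" is my interpretation of the output above, not something confirmed from the SOAPdenovo source. Whether the tool counts gap Ns or uses >= rather than > at the threshold is also an assumption here.

```python
def nx_stats(lengths, percent=50):
    """Return (Nx length, number of sequences needed) for a list of lengths.

    Sequences are sorted from longest to shortest and summed until the
    running total reaches `percent` % of the total length. The length of
    the sequence that crosses the threshold is the Nx value.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length, count


# Hypothetical toy data: the same 300 bp split into few long scaffolds
# or many short contigs.
scaffolds = [100, 80, 50, 30, 20, 10, 10]
contigs = [50, 50, 40, 40, 30, 30, 20, 20, 10, 10]

print("scaffold N50:", nx_stats(scaffolds))  # (80, 2)
print("contig N50:", nx_stats(contigs))      # (40, 4)
```

The scaffold list reaches half the total with fewer, longer sequences, so its N50 is larger and its count is smaller. That is the same pattern as in the two blocks of the .scafStatistics file.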
Tags
n50, soap denovo, statistics, stats
