07-02-2013, 01:16 PM

SSPACE libraries
Hello,
I am currently trying to scaffold an assembly with SSPACE, and can already see some progress. Code:
SUMMARY:
Inserted contig file;
    Total number of contigs = 5308
    Sum (bp) = 42486218
    Total number of N's = 158184
    Sum (bp) no N's = 42328034
    Max contig size = 134487
    Min contig size = 1000
    Average contig size = 8004
    N50 = 17056

After scaffolding lib1:
    Total number of scaffolds = 3392
    Sum (bp) = 42503701
    Total number of N's = 178406
    Sum (bp) no N's = 42325295
    Max scaffold size = 190774
    Min scaffold size = 1000
    Average scaffold size = 12530
    N50 = 27904

After scaffolding lib2:
    Total number of scaffolds = 1820
    Sum (bp) = 42986473
    Total number of N's = 661945
    Sum (bp) no N's = 42324528
    Max scaffold size = 365239
    Min scaffold size = 1000
    Average scaffold size = 23618
    N50 = 50457

This is my library file: Code:
lib1 GDR16_65bp_R1.fastq GDR16_65bp_R2.fastq 280 0.8 FR
lib2 MPNC_65bp_R1.fastq MPNC_65bp_R2.fastq 2411 0.5 FR

I know mate-pair libraries carry a lot of paired-end contamination, but the first library is paired-end, and almost 50% of its read pairs still did not satisfy the distance constraint. I am pasting the library stats below.

Lib1, paired end: Code:
LIBRARY lib1 STATS:
################################################################################
MAPPING READS TO CONTIGS:
    Number of single reads found on contigs = 9149900
    Number of pairs used for pairing contigs / total pairs = 3517598 / 3646662

READ PAIRS STATS:
    Assembled pairs: 3517598 (7035196 sequences)
        Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 280 +/-224): 1668422
        Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 5097
        Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 7824

        Satisfied in distance/logic within a given contig pair (pre-scaffold): 240094
        Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 1596161

    Total satisfied: 1908516 unsatisfied: 1609082

    Estimated insert size statistics (based on 1673519 pairs):
        Mean insert size = 240
        Median insert size = 230

REPEATS:
    Number of repeated edges = 1665
################################################################################

Lib2, mate pair: Code:
LIBRARY lib2 STATS:
################################################################################
MAPPING READS TO CONTIGS:
    Number of single reads found on contigs = 5924956
    Number of pairs used for pairing contigs / total pairs = 1560927 / 1708281

READ PAIRS STATS:
    Assembled pairs: 1560927 (3121854 sequences)
        Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 2411 +/-1205.5): 129649
        Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 40359
        Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 427578

        Satisfied in distance/logic within a given contig pair (pre-scaffold): 267259
        Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 696082

    Total satisfied: 396908 unsatisfied: 1164019

    Estimated insert size statistics (based on 170008 pairs):
        Mean insert size = 1951
        Median insert size = 2237

REPEATS:
    Number of repeated edges = 1569
################################################################################

Any thoughts on how I could improve my scaffolding?
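
For anyone who wants to check the arithmetic behind these numbers, here is a minimal Python sketch. It assumes the accepted distance window is the insert size plus or minus (insert size × error fraction), which matches the "280 +/-224" and "2411 +/-1205.5" values printed in the stats. The class and field names are made up for illustration and are not part of SSPACE; the counts are copied from the "Total satisfied / unsatisfied" lines.

```python
from dataclasses import dataclass


@dataclass
class LibraryStats:
    """Hypothetical container for one library's settings and pair counts."""
    name: str
    insert_size: float      # expected insert size from the library file
    error_fraction: float   # allowed deviation from the library file (e.g. 0.8)
    satisfied: int          # "Total satisfied" from the stats
    unsatisfied: int        # "unsatisfied" from the same line

    def distance_window(self):
        # Assumption: tolerance = insert size * error fraction
        tolerance = self.insert_size * self.error_fraction
        return (self.insert_size - tolerance, self.insert_size + tolerance)

    def unsatisfied_fraction(self):
        return self.unsatisfied / (self.satisfied + self.unsatisfied)


# Values taken from the library file and the stats pasted in this post
LIBS = [
    LibraryStats("lib1", 280, 0.8, 1908516, 1609082),
    LibraryStats("lib2", 2411, 0.5, 396908, 1164019),
]

for lib in LIBS:
    low, high = lib.distance_window()
    print(f"{lib.name}: accepted distance {low:g} to {high:g} bp, "
          f"{lib.unsatisfied_fraction():.1%} of assembled pairs unsatisfied")
```

Under that assumption, lib1 accepts distances of roughly 56 to 504 bp and lib2 roughly 1205.5 to 3616.5 bp. The unsatisfied share is about 46% for lib1 and about 75% for lib2.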
