I'm assembling some bacterial genomes for which no close reference sequences are available. Reads are 70-90 bp after cleaning and trimming with fastq-mcf, and coverage is about 100x.
I'm using ABySS 1.3.7 and Ray 2.3.1 (others are planned), and I use abyss-fac to get some metrics.
How do I tell from these metrics which assembly is the 'best'? For example, with ABySS, k=31 and k=38 or k=39 all seem to be OK. (The number at the start of each FASTA file path is the k-mer value I used.)
The Ray stats suggest that the assemblies at k=31 and k=37, but also k=85, could be fine. Ray also seems to scaffold better.
I'm a bit lost on how to assess these metrics and would appreciate any suggestions.
Fenny
ABySS run:
n n:500 n:N50 min N80 N50 N20 E-size max sum name
13926 1609 228 500 3470 9163 22002 15036 72141 8202130 k20/unknown-contigs.fa
7381 899 120 505 7393 18774 43716 26836 107339 8500853 k21/unknown-contigs.fa
4213 505 66 504 14988 39642 71511 49270 215678 8644984 k22/unknown-contigs.fa
3341 429 52 505 18561 49387 92879 64247 292483 8710849 k23/unknown-contigs.fa
2999 382 47 513 20123 54351 103578 68436 241664 8720215 k24/unknown-contigs.fa
2373 298 37 578 29373 75091 133436 90121 309081 8765701 k25/unknown-contigs.fa
2110 275 31 534 33500 82613 175193 104953 361349 8782006 k26/unknown-contigs.fa
2043 275 35 517 38713 76396 143749 97326 309040 8774293 k27/unknown-contigs.fa
1670 250 25 517 39917 93324 199540 130157 406193 8814319 k28/unknown-contigs.fa
1514 248 27 519 42798 95525 190595 120848 362303 8824549 k29/unknown-contigs.fa
1411 242 26 521 43444 99602 199159 126189 363974 8848575 k30/unknown-contigs.fa
1283 231 21 523 43214 112193 302571 155113 414744 8869109 k31/unknown-contigs.fa
1209 242 25 525 43617 95477 237887 137683 414746 8871337 k32/unknown-contigs.fa
1183 240 24 527 42803 99289 237887 139646 414385 8866006 k33/unknown-contigs.fa
1036 221 23 610 47707 112215 237887 144304 414279 8882417 k34/unknown-contigs.fa
978 220 24 634 43232 99289 237887 141346 414279 8916246 k35/unknown-contigs.fa
937 217 23 636 51036 112215 220762 155820 570462 8900832 k36/unknown-contigs.fa
808 218 23 535 52818 112412 220814 149460 495193 8913342 k37/unknown-contigs.fa
765 210 20 537 53472 118915 245992 173014 535873 8915544 k38/unknown-contigs.fa
714 209 21 581 54738 118916 237887 160046 495233 8909314 k39/unknown-contigs.fa
660 213 22 583 53763 117753 237887 154655 495223 8936400 k40/unknown-contigs.fa
Ray run:
n n:500 n:N50 min N80 N50 N20 E-size max sum name
178 121 15 770 62111 164112 363566 228332 760374 8782282 21/Scaffolds.fasta
175 112 14 770 67656 176067 362164 233702 757332 8739663 23/Scaffolds.fasta
177 113 15 683 72292 164206 389778 234641 757371 8696466 25/Scaffolds.fasta
135 98 14 2178 91732 188176 389831 246831 757371 8782374 27/Scaffolds.fasta
151 114 17 501 77399 153247 351688 214424 757371 8708387 29/Scaffolds.fasta
145 110 13 509 91732 168523 833846 344303 1121085 8807879 31/Scaffolds.fasta
149 114 13 509 92390 164110 716013 323769 1121085 8824925 33/Scaffolds.fasta
143 111 13 509 91732 164135 833855 343321 1121085 8813200 35/Scaffolds.fasta
144 109 13 509 91732 168523 833855 344524 1121085 8808399 37/Scaffolds.fasta
147 112 13 509 91732 168523 833846 342379 1121085 8814542 39/Scaffolds.fasta
151 115 14 509 91732 154479 716030 322057 1121085 8821523 41/Scaffolds.fasta
146 111 13 509 92390 168523 716030 325379 1121085 8808165 43/Scaffolds.fasta
148 112 13 509 91732 168523 833855 342560 1121085 8805456 45/Scaffolds.fasta
148 112 13 509 91732 168523 833855 342554 1121085 8805822 47/Scaffolds.fasta
149 113 13 509 92390 164110 716030 323939 1121085 8815467 49/Scaffolds.fasta
149 113 13 509 91732 164110 833855 342692 1121085 8815976 51/Scaffolds.fasta
142 109 12 509 92390 172838 833846 345337 1121085 8815343 53/Scaffolds.fasta
144 109 13 509 91732 164110 833838 343961 1121085 8808473 55/Scaffolds.fasta
146 111 13 509 91732 168523 833846 342407 1121085 8813776 57/Scaffolds.fasta
145 110 13 509 92390 164110 833846 343310 1121085 8815638 59/Scaffolds.fasta
145 110 13 509 91732 168523 833855 342783 1121085 8806084 61/Scaffolds.fasta
143 108 13 509 91732 168523 833838 344559 1121085 8808222 63/Scaffolds.fasta
148 112 13 509 92390 168523 716030 325211 1121085 8809623 65/Scaffolds.fasta
148 113 13 509 91732 168523 833855 342435 1121085 8813376 67/Scaffolds.fasta
140 107 12 509 95774 172838 833855 345732 1121085 8807698 69/Scaffolds.fasta
148 113 13 509 92390 164110 716030 323923 1121085 8816039 71/Scaffolds.fasta
142 109 12 509 95774 172838 833855 345409 1121085 8807219 73/Scaffolds.fasta
144 109 13 509 91732 168523 833855 344530 1121085 8808208 75/Scaffolds.fasta
144 111 12 509 91732 172838 833855 344861 1121085 8809073 77/Scaffolds.fasta
146 111 13 509 92390 168523 833846 344770 1121085 8808362 79/Scaffolds.fasta
145 110 13 509 92390 164110 833855 343305 1121085 8816026 81/Scaffolds.fasta
147 112 13 509 91732 164110 833838 342556 1121085 8814178 83/Scaffolds.fasta
143 108 13 509 91732 168523 833855 344556 1121085 8808270 85/Scaffolds.fasta
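To compare the runs side by side rather than by eye, the tables can be parsed and sorted on one column. The following is only a minimal sketch, assuming the abyss-fac output is available as plain whitespace-separated text with the eleven columns shown above. The names `parse_fac_table` and `rank_assemblies` are hypothetical helpers made up for this illustration, not part of ABySS or Ray. Sorting like this ranks contiguity only; it does not by itself show which assembly is most correct.

```python
# Column order as printed in the abyss-fac header above.
COLUMNS = ["n", "n:500", "n:N50", "min", "N80", "N50", "N20",
           "E-size", "max", "sum", "name"]


def parse_fac_table(text):
    """Parse whitespace-separated abyss-fac rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header row and any line without exactly 11 fields.
        if len(fields) != len(COLUMNS) or fields[0] == "n":
            continue
        # All columns except the last (file name) are integers.
        row = {col: int(val) for col, val in zip(COLUMNS[:-1], fields[:-1])}
        row["name"] = fields[-1]
        rows.append(row)
    return rows


def rank_assemblies(rows, key="E-size"):
    """Return rows sorted by the chosen metric, largest value first."""
    return sorted(rows, key=lambda r: r[key], reverse=True)


# Three rows copied from the ABySS table above.
example = """\
n n:500 n:N50 min N80 N50 N20 E-size max sum name
1283 231 21 523 43214 112193 302571 155113 414744 8869109 k31/unknown-contigs.fa
765 210 20 537 53472 118915 245992 173014 535873 8915544 k38/unknown-contigs.fa
714 209 21 581 54738 118916 237887 160046 495233 8909314 k39/unknown-contigs.fa
"""

for row in rank_assemblies(parse_fac_table(example)):
    print(row["name"], row["N50"], row["E-size"])
```

With these three rows, sorting by E-size puts k38 first, while passing `key="N50"` puts k39 first by a single base pair, which illustrates how close the candidate assemblies are on these metrics.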