**nickloman** · 01-05-2011, 07:28 AM

The file 454Scaffolds.txt generated by Newbler has the information you need.

See http://contig.wordpress.com/2010/03/...-file/#more-56 for more information.

**avtsanger** · 01-05-2011, 08:07 AM

I should clarify: the assembly was imported into gap4 and worked on by joining contigs to the scaffold consensus and closing gaps computationally or experimentally. I can save out the updated consensus files, but these will still contain n's from the scaffold sequence that I joined in. I need a way of calculating the number of contigs, i.e. the number of sequences separated by n's in this file.

**westerman** · 01-05-2011, 10:40 AM

Assuming you are on a Unix-type system, one answer is to use the 'tr' command along with 'sed' and 'wc'. First replace each fasta header with a single 'n', so that consecutive scaffolds stay separated. Then get rid of the newlines. Then reduce each run of 'n's to a single character. Finally delete all non-n's and count the remaining n's. That number is the number of gaps plus one for each scaffold header, and thus the number of contigs.

sed -e 's/>.*/n/' scaffold.fasta | tr -d '\n' | tr -s 'n' | tr -d 'acgt' | wc -c

The above assumes only acgtn in lower case. I suspect there are as many other answers as there are people on this bulletin board.
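If the file mixes upper and lower case or contains IUPAC ambiguity codes, the same idea can be sketched as below. This is only a variant of the pipeline above under stated assumptions: the function name `count_contigs` is made up for illustration, and it assumes that no sequence ends in a run of n's and that there are no empty records, since either would inflate the count.

```shell
#!/bin/sh
# count_contigs FILE (hypothetical helper name, not part of any tool)
# Counts runs of non-N characters across all records in a fasta file.
count_contigs() {
    # 1. Replace each header line with a single N so records stay separated.
    # 2. Join all lines (also dropping carriage returns from DOS-style files).
    # 3. Fold lower-case n to N so both cases count as gap characters.
    # 4. Squeeze each run of N down to one character.
    # 5. Delete everything that is not N, then count what is left.
    sed -e 's/^>.*/N/' "$1" |
        tr -d '\r\n' |
        tr 'n' 'N' |
        tr -s 'N' |
        tr -cd 'N' |
        wc -c |
        tr -d ' '    # some wc implementations pad the number with spaces
}

# Run on a file only when one is given on the command line.
if [ "$#" -ge 1 ]; then
    count_contigs "$1"
fi
```

Saved as a script, it would be run with the file name as its argument, for example scaffold.fasta. Deleting everything except N in the last step, rather than listing acgt, means any other base codes in the file do not need to be known in advance.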

**flxlex** · 01-05-2011, 09:57 PM

Originally posted by westerman:

> I suspect there are as many other answers as there are people on this bulletin board.

Perhaps, but yours is hard to beat for shortness...

**avtsanger** · 01-06-2011, 01:43 AM

Thanks. That's great and gives me the answer I was looking for.

Calculating the number of contigs in a scaffold file
