SEQanswers

Illusive Man 02-23-2017 08:14 PM

Fastest way to add Ns to variable length sequences to get uniform length
 
I have a FASTA file with a thousand sequences, with lengths between 100 and 150 bp. I would like to pad every sequence shorter than 150 bp with Ns so that all of them end up 150 bp long. I know it is possible, but so far I have not found anything that does this easily. Please help. Thanks!

Brian Bushnell 02-23-2017 09:44 PM

Cross-posted:

https://www.biostars.org/p/238635/

Please link other sites when you cross-post, so people don't waste their time answering a question that has already been answered.

jkbonfield 02-24-2017 08:15 AM

Not fast, but something like:

Code:

perl -e 'while(<>) {if (/^>/) {print;next}; chomp; print $_,"N" x (150-length($_)),"\n"}' in.fasta

gringer 02-24-2017 09:05 AM

That looks like it should be plenty fast; it's more likely to be limited by the read speed of the hard drive than by the speed of the code.


All times are GMT -8.