Old 01-11-2012, 02:35 PM   #1
kga1978
Plotting length distribution of contigs?

Hi all,

I have a FASTA file with contigs and I would like to plot the distribution of the sequence lengths. The only tool I have found that does this is prinseq-graphs, but it seems to require installing a number of Perl modules (I believe so, at least; I can't find any installation information). Is there an easier way of doing this? Preferably a simple script or Java program. Here's an example of some of my data:

Code:
>NODE_4_length_492_cov_13.477642
ACGGAGTATGACTTTGTATTGGTGGGTCCTTGCACTGAACCAGCCCCTCTGGTTGTGCAT
AGGGGAGGCTTGTGGGAATGTGGAAAGAAATTGGCGTCCTTTACACCTGTTATACAAGAC
CAGGATCTTGAAGTATTTGTGAGAGAGGTTGGGGACACTTCGTCTGACCTGCTGATTGGG
GCATTGAGTGATATGATGATAGACAGGCTGGGGTTAAGGGTGCAGTGGTCAGGGGTGGAC
ATTGTCTCCACACTTAGGGCTGCAGCGCCGAACTGCGAGGGGATCTTGAGTGCGGTTCTT
GAGGCAGTGGACAACTGGGTGGAGTTCAAAGGTTATGCTCTCTGTTATAGTAAGTCAAAG
GGGAAGGTGATGGTGCAGTCAAGTGGTGGTAAATTGAGACTGAAGGGCAGAACATGTGAG
GAGTTGACTAGGAAGGATGAATGCATCGAAGACATTGAGTAGTCTCCTGGCGATGGTTGG
CTCCCCCGGGGGGGCCCCCGGCGGGGGGTCCCCC
>NODE_7_length_554_cov_17.906137
ATTTATTTTGAGTCTTATGTGAAACCACGTGAAGGACCCCAATGTTCTTGTAGTCGCAAC
AAATGGTCTCACATAAGACTCAAAATAAATCTGCCTCATGAAATTGTCAACAGCATCACT
AGTGCTCACCACTCTTTCCTCCACTATGGGTTCATGTGTCCTACTGTGAGACAGCCTCAA
TTCAGATGATAACACAATGTAATGTTCCTCTCTTTTCCATTTCACAATATGTGAGACAAG
AGATAAGGCTTCACAGTTAACATCCAACGCAACACAGAGATCTAGGAATTTTATTCTAGG
TGACCACTTCATTTTGGTTGACGCTAGATCACTCATGAATGGCAATATGTGCTTCTCAAA
CACCGATGGGTACAGCCTTCTCAAAGAATGAATGATGTGATTCAAACCAACCCTATCCTC
TAATAGTTTTGATGCAGTTGGCTTTAAAGGAAAATAGTCACAAGGGTTATGCTTGAAAAA
ATCCAATACCTTAACTGTCTTAGGTTCCCCTAAGACCCATGCACCCAACTCTATTGCAGT
TGATAAGGAGATGCACATATAATCCCATAACAAGGG
>NODE_8_length_274_cov_16.138685
CCAAAATAAGTTGTCTTCCACTTTCACTCGAGGTGCGCAGAAATTGCTATCTGAAGCTAT
CAACAAGTCTGCATTCCAGAGCTCCATTGCATCTGGCTTTGTGGGGTTATGCAGAACATT
GGGTAGCAAATGTGTTCGGGGACCAAATAAGGAGAATCTGTATATTAAGTCCATTCAGTC
TCTGATTTCTGATGTCAAGGGAATCAAATTATTGACAAATTCTAATGGCATTCAGTATTG
GCGGGTTCCGCTAGAACTTAGAGATGGGAGTGGAAGTGAAAGTGTGGTCAGTTATT
I would like to plot the lengths similar to this:
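
For concreteness, here is a minimal sketch in Python (standard library only) of the kind of simple script being asked about. It assumes the file is plain multi-line FASTA like the sample above. The function names, the 100 bp bin size, and the text-based histogram are illustrative choices, not taken from prinseq-graphs or any other tool.

```python
from collections import Counter


def fasta_lengths(lines):
    """Return {header: sequence length} for an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:]  # header without the leading '>'
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # sequence may span several lines
    return lengths


def bin_lengths(lengths, bin_size=100):
    """Count lengths per bin [start, start + bin_size), sorted by start."""
    counts = Counter((n // bin_size) * bin_size for n in lengths)
    return dict(sorted(counts.items()))


def print_histogram(bins, bin_size=100):
    """Print one row of '#' marks per non-empty bin."""
    for start, count in bins.items():
        end = start + bin_size - 1
        print(f"{start:>7}-{end:<7} {'#' * count} ({count})")


def main(path, bin_size=100):
    """Read a FASTA file (path is supplied by the caller) and print the bins."""
    with open(path) as handle:
        lengths = fasta_lengths(handle)
    print_histogram(bin_lengths(lengths.values(), bin_size), bin_size)
```

Calling `main("contigs.fa")` (a hypothetical file name) prints one row per non-empty length bin. The values from `fasta_lengths` could instead be handed to a plotting library to draw a graphical histogram.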