Hi Everyone,
I am in the midst of renaming some FASTA headers to include the chromosomes they have been mapped to (taken from a tab-delimited text file). I don't have any problems setting up the hash of scaffold and chromosome locations. The problem is that some scaffold names are contained in longer ones (scaffold1, scaffold10 and scaffold100). Usually I would add "$" to the end of my search term (/scaffold1$/) to anchor the match at the end of the string, but I'm not sure how to make use of this when the search term is a variable ($hash{$scaff/$/}). Advice?
Below is my script so far...and TIA!
Code:
#!/usr/bin/perl
# mod-header2include-chrom.pl
# This script is intended to read in a fasta file and a tab-delimited file and use the
# information from the tab-delimited file to modify the header.
# In this case, we are appending to the header (e.g. "scaffold671") the chromosome to which
# it has been mapped and the number of genes on this scaffold.
use strict;
use warnings;

open (DATA, "<genome-assoc-chromosomes.txt") or die "Could not open genome chromosome mapping data: $!\n";
open (FASTA, "<scafSeq.FG.fill") or die "Could not open Fasta file: $!\n";

# Build the scaffold -> chromosome lookup from the tab-delimited file
my %hash;
while (<DATA>) {
    chomp;
    if ($_ =~ "scafold"){    #skip header - scafold is spelled incorrectly on purpose;
        next;
    }
    else {
        my ($key, $chrom, $len, $gene) = split /\t/;
        $hash{$key} = $chrom;
    }
}

# Print each fasta line, appending the chromosome when the line is a known scaffold
while (<FASTA>){
    my $line = $_;
    chomp ($line);
    if ( defined $hash{$line} ) {
        print "$line-$hash{$line}\n";    # newline restored after chomp
    }
    else {
        print "$line\n";
    }
}
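For the anchoring question itself, here is a minimal sketch of two possible approaches, not a definitive fix. It assumes headers look like ">scaffold1", optionally followed by whitespace and extra text. The %chrom_for contents, the chromosome names, and the sub names are all made up for illustration. The first sub avoids regex matching against the hash keys entirely: it pulls the ID out of the header and does an exact hash lookup, so scaffold1 can never be confused with scaffold10. The second shows how "$" can follow an interpolated variable, with \Q...\E quoting any regex metacharacters in it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical mapping for illustration only; the real one would be
# filled from the tab-delimited file.
my %chrom_for = (
    scaffold1   => 'chr3',
    scaffold10  => 'chr7',
    scaffold100 => 'chr1',
);

# Approach 1: exact hash lookup.
# Capture the ID (first non-whitespace word after ">") and anything after it,
# then use the ID as the hash key. Lines that are not headers pass through.
sub rename_by_lookup {
    my ($header) = @_;
    my ($id, $rest) = $header =~ /^>(\S+)(.*)/ or return $header;
    return exists $chrom_for{$id}
        ? ">$id-$chrom_for{$id}$rest"
        : $header;
}

# Approach 2: anchored regex built from a variable.
# \Q...\E escapes metacharacters in $scaff; ^ and $ anchor both ends.
sub matches_exactly {
    my ($header, $scaff) = @_;
    return $header =~ /^>\Q$scaff\E$/ ? 1 : 0;
}

# Small demonstration on made-up headers
for my $header ('>scaffold1', '>scaffold10', '>scaffold100 length=5000', 'ACGT') {
    print rename_by_lookup($header), "\n";
}
```

The lookup approach is probably the simpler of the two here, since a hash key only ever matches exactly. The anchored regex is mainly useful if you need to loop over the keys and test each one against a header.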