SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
Perl Script AdrianJ217 Bioinformatics 7 10-15-2012 05:58 AM
just perl script semna Bioinformatics 3 07-02-2011 09:42 AM
perl script bioenvisage Bioinformatics 5 02-01-2010 09:11 AM
perl script bioenvisage Bioinformatics 0 02-01-2010 08:23 AM
Perl script bioenvisage Bioinformatics 4 01-28-2010 01:25 PM

Reply
 
Thread Tools
Old 05-02-2014, 07:45 AM   #1
pony2001mx
Member
 
Location: china

Join Date: Aug 2013
Posts: 32
Default Perl script: Make Statistics Of Mirna Abundances For Many Samples

Dear All,

Actually I post the following question a few weeks ago on Biostars (https://www.biostars.org/p/97538/#97599). I got a very nice answer there, but it's perl one liner command. I am eager to sort out the problem with perl hash of hash. Could anyone here give me an answer?

I need to make statistics of mirna abundances for many samples. Below is an example.
Code:
SAMPLE    MIR    ABUNDANCE
sample1   mir1   30
sample1   mir3   100
sample1   mir4   120
sample2   mir1   40
sample2   mir2   200
sample3   mir1   190
......
I want to change the format to below.
Code:
          sample1    sample2    sample3
mir1      30           40         190
mir2      0            200         0
mir3      190          0           0
mir4      120          0           0
......
i tried to write perl hash of hash, but was stuck (see below). Could perl export teaches me with this? I greatly appreciate your help!!
Code:
open FH, '<', $ARGV[0] or die "open failed:$!";
my %h;
while (<>){
        my ($sample, $mir, $abun) = /(.+?)\t(.+)\t(.+)/;
        $h{$sample}{$mir} = $abun; 
}
foreach my $sample (keys %h){
        foreach my $mir (keys %{h{$sample}})
                print "   "      # i am stuck here. Need your help!
}
pony2001mx is offline   Reply With Quote
Old 05-02-2014, 01:19 PM   #2
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667
Default

see if this works:

Code:
        my ($sample, $mir, $abun) = /(.+?)\t(.+)\t(.+)/;
        $h{$mir}{$sample} = $abun; 
}
foreach my $mir (sort keys %h){
        print "$mir\t";
        foreach my $sample (sort keys %{h{$mir}}){
                print "$$h{$mir}{$sample}\t;"
        }
        print "\n";
}

Last edited by mastal; 05-02-2014 at 01:21 PM.
mastal is offline   Reply With Quote
Old 05-03-2014, 06:54 PM   #3
pony2001mx
Member
 
Location: china

Join Date: Aug 2013
Posts: 32
Default

Hi mastal, I still have problem, but thanks a lot anyway.
pony2001mx is offline   Reply With Quote
Old 05-03-2014, 07:49 PM   #4
SNPsaurus
Registered Vendor
 
Location: Eugene, OR

Join Date: May 2013
Posts: 521
Default

What problem are you still having? I think mastal re-organized it correctly. You want to print a line that has a mir, and then the value for each sample. So you would definitely want to have the outer loop be mir, and the inner loop be sample. That way it prints the mir, then on the same line prints each of the sample values.

mastal's code may have some typos in it (with Perl, it is difficult to tell the difference between a typo and brilliant code, so I am not sure), but I edited it and it works:

Code:
foreach my $mir (sort keys %h){
        print "$mir\t";
        foreach my $sample (sort keys %{$h{$mir}}){ # changed h{$mir} to $h{$mir}
                print "$h{$mir}{$sample}\t"; # changed $$h to $h and \t;" to \t";
        }
        print "\n";
}

when I made a little tester it outputs this:
m1 1.1 1.2 1.3
m2 2.1 2.2 2.3
which is correct.
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
SNPsaurus is offline   Reply With Quote
Old 05-04-2014, 06:28 AM   #5
pony2001mx
Member
 
Location: china

Join Date: Aug 2013
Posts: 32
Default

Hi SNPSaurus,
Thanks a lot for your comments! Actually it's not so easy. If the input data is as follows, then it's ok.
Code:
sample1	mir1	1.1
sample1	mir2	1.2
sample2	mir1	2.1
sample2	mir2	2.2
However, if the input data changes to below, it won't be what i expect. The problem is MISSING VALUE.
Code:
sample1	mir1	1.1
sample1	mir2	1.2
sample2	mir1	2.1
sample2	mir2	2.2
sample3	mir4	3.1
i am a beginner and am learning perl. I tried best to write a script as follows (i add some comments for easier understanding), but still have problem. Could you please check please? I appreciate your helps!
Code:
#!/usr/bin/perl
use strict;
use warnings;

open FH, '<', $ARGV[0] || die "open failed $!";
my %h;
my %h2;
while (<FH>){
        my ($sample, $mir, $abun) = /(\S+?)\t(\S+)\t(\S+)/;
        $h{$mir}{$sample} = $abun; 
		$h2{$sample} +=1; #increament to calculate total samples
}

foreach my $sample_h2 (sort keys %h2){ #print sample header 
	print "\t$sample_h2";
}
print "\n";

foreach my $mir (sort keys %h){
    print "$mir\t";  #print mir name
	foreach my $sample2(sort keys %h2){ #sort according to sample header
		foreach my $sample (sort keys %{$h{$mir}}){  #search sample name in %h2 from that in %h
			if ($sample eq $sample2) {  
				print "$h{$mir}{$sample}\t"; #when matched print 
				last;
			}
		}
	}
	print "\n";
}

Last edited by pony2001mx; 05-04-2014 at 06:36 AM.
pony2001mx is offline   Reply With Quote
Old 05-04-2014, 10:59 AM   #6
SNPsaurus
Registered Vendor
 
Location: Eugene, OR

Join Date: May 2013
Posts: 521
Default

I think I see what you are trying to do. Some mir don't have data for all samples. So you construct a list of samples separate from the hash of hashes. You go through the hash of samples, and then go through the list of samples in your hash of hashes, and if they match you print. This is probably better done with an "exist" check, and a printing of a blank if not present:

Code:
foreach my $mir (sort keys %h){
    print "$mir\t";  #print mir name
	foreach my $sample2(sort keys %h2){ #sort according to sample header
		if (exists $h{$mir}{$sample2}) {
				print "$h{$mir}{$sample2}\t"; #if exists print 
		} else {
			print "\t"; # print a blank if that sample doesn't exist for that mir
		}
	}
	print "\n";
}
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
SNPsaurus is offline   Reply With Quote
Old 05-04-2014, 06:12 PM   #7
pony2001mx
Member
 
Location: china

Join Date: Aug 2013
Posts: 32
Default

Hi SNPSaurus, Thank you very much! It's really good stuff for me to learn. Thanks.
pony2001mx is offline   Reply With Quote
Reply

Tags
perl

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 07:42 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2021, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO