**mastal** · 02-18-2014, 01:26 PM

What error messages are you getting, and which lines in the script are causing the problem?

I am not sure which of the two files your script is trying to parse, but in any case the lines in the SAM file begin with the read IDs ('HWI...'), not '@'

Code:

 next if($_ =~ /^@/); ## remove the headers in sam file

so that could be one problem.

**bambus** · 02-19-2014, 12:31 AM

Hi Mastal, thank you for the reply.

Yes, the lines begin with 'HWI', as you mentioned, because I modified the original SAM output file. In general there are headers in a SAM file, but that does not harm the script in any way.

For instance, this is an error message I am getting:

Global symbol "%seq" requires explicit package name at read_count.pl line 22.
Execution of read_count.pl aborted due to compilation errors.

**mastal** · 02-19-2014, 01:44 AM

The error may refer to the $seq{$_} in your print statement in the last line of the script.

There is no previous reference to a hash named 'seq' in your script, but you do have a local scalar variable named $seq (my $seq = $s[0]).
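To illustrate the point, here is a minimal sketch (not the original read_count.pl; the key `read1` and the values are made up for the example). In Perl a scalar and a hash can share a name but are separate variables, and under `use strict` each one needs its own `my` declaration:

```perl
use strict;
use warnings;

# A scalar and a hash may share a name, but they are separate variables.
my $seq = 'ACGT';             # scalar $seq
my %seq = (read1 => 'ACGT');  # hash %seq; without this declaration,
                              # $seq{...} does not compile under strict

# $seq{read1} looks up a key in %seq; it does not use the scalar $seq.
print "scalar: $seq\n";
print "hash element: $seq{read1}\n";
```

Removing the `my %seq` line should reproduce the 'Global symbol "%seq" requires explicit package name' error.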

**bambus** · 02-19-2014, 01:58 AM

Okay, how can I solve this?

Can you please help me?

**boetsie** · 02-19-2014, 01:59 AM

The error occurs because the corresponding line refers to a hash ($seq{$_}), but you did not define a hash named seq. Line 5 declares a scalar instead:
my $seq = ();

In addition, your 'seq' hash is not filled anywhere.

To solve this, I would recommend the following pseudocode:

- Read your alignment and store the results (as you already did).

- Read the contig file and go through each contig, as in the code below:

Code:

open(IN, '<', $file) or die "Cannot open $file: $!";
while(<IN>){
  chomp;
  $contigSeq.= $_ if(eof(IN));
  if (/\>(\S+)/ || eof(IN)){
     my $head=$1;
     if($contigSeq ne ''){
       #$contigSeq is the contig sequence, $prevhead is your contig
       my $len = length($contigSeq);
       #Now print the results
       print "$prevhead\t$len\t$hash{$prevhead}\t$contigSeq\n";
     }
     $prevhead = $head;
     $contigSeq='';
  }else{
     $contigSeq .= $_;
  }
}
close IN;
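The same loop can also be written as a subroutine that returns ID/sequence pairs, which makes it easy to try on a small in-memory example before running it on a real contig file. This is only a sketch under assumptions: `read_contigs` is a hypothetical helper name, the FASTA text is invented, and the read counts are left out.

```perl
use strict;
use warnings;

# Read FASTA records from an open filehandle and return a list of
# [id, sequence] pairs. read_contigs is a hypothetical helper name.
sub read_contigs {
    my ($fh) = @_;
    my @contigs;
    my ($id, $seq) = (undef, '');
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>(\S+)/) {
            # A new header closes the previous record, if there was one.
            push @contigs, [$id, $seq] if defined $id;
            ($id, $seq) = ($1, '');
        }
        else {
            $seq .= $line;
        }
    }
    # The last record has no following header, so store it after the loop.
    push @contigs, [$id, $seq] if defined $id;
    return @contigs;
}

# Small in-memory example instead of a real contig file.
my $fasta = ">contig1 some description\nACGT\nAC\n>contig2\nGG\n";
open(my $fh, '<', \$fasta) or die "Cannot open in-memory FASTA: $!";
for my $contig (read_contigs($fh)) {
    my ($id, $seq) = @$contig;
    print join("\t", $id, length($seq), $seq), "\n";
}
close $fh;
```

Storing the last record after the loop avoids the eof() checks inside the loop, at the cost of one extra line.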

**bambus** · 02-19-2014, 02:09 AM

I am sorry, I am not clear.

So how does this code work? Do I have to input two files (the mapped SAM file to count contig frequency, and the contig file to extract the sequence)?

**boetsie** · 02-19-2014, 02:16 AM

Yes. The first part (reading the mapped sam file) you should do yourself. But basically, this is what you already have. The second part (the code shown above) is for extracting the contig sequence.

**bambus** · 02-19-2014, 02:21 AM

Yes, I have an output file named 'read_count' with the columns "contig ID" and "Number of Unicals mapped".

Now I also want to print out the "list of Unicals (sequences) that were mapped", the "contig sequence" and the "contig length".

So I tried to execute your code with two input files:

1) read count
2) contig_file

as follows:

perl contig_seq.pl read count contig_file >seq.txt

but it gives me no output.

**boetsie** · 02-19-2014, 02:22 AM

PM me your code and the two input files, so I can help you with that.

**bambus** · 02-19-2014, 02:38 AM

I am sorry, I could attach only one file at a time.

Attached Files

read_count.pl (297 Bytes, 26 views)

**boetsie** · 02-19-2014, 02:47 AM

I did not test it, but I think it would be something like this. Run it as:

perl script <samfile> <contigfile>

Code:

#!/usr/bin/perl -w
use strict;
use warnings;

open(SAM, '<', $ARGV[0]) or die "Cannot open SAM file '$ARGV[0]': $!";
my %hash = ();
while(<SAM>){
  chomp;
  next if($_ =~ /^@/); ## remove the headers in sam file
  my @s =split;
  my $contig = $s[2];
  $hash{$contig}++;
}
close SAM;

open(CTG, '<', $ARGV[1]) or die "Cannot open contig file '$ARGV[1]': $!";
my ($contigSeq,$prevhead) = ("","");
while(<CTG>){
  chomp;
  $contigSeq.= $_ if(eof(CTG));
  if (/\>(\S+)/ || eof(CTG)){
     my $head=$1;
     if($contigSeq ne ''){
       #$contigSeq is the contig sequence, $prevhead is your contig
       my $len = length($contigSeq);
       #Now print the results
       print "$prevhead\t$len\t$hash{$prevhead}\t$contigSeq\n";
     }
     $prevhead = $head;
     $contigSeq='';
  }else{
     $contigSeq .= $_;
  }
}
close CTG;

**bambus** · 02-19-2014, 02:48 AM

Input file 1, with mapped reads:

Attached Files

Sample_aligned_reads.txt (19.0 KB, 23 views)

**bambus** · 02-19-2014, 02:49 AM

Input file 2, the list of contigs:

Attached Files

sample_contig.txt (2.1 KB, 28 views)

**bambus** · 02-19-2014, 02:53 AM

Yes, I did, and I got the following error many times, with different line numbers:

Use of uninitialized value in concatenation (.) or string at contig_table.pl line 27, <CTG> line 140568.
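A likely cause, assuming line 27 is the print statement of the script above: contigs to which no read mapped have no entry in %hash, so $hash{$prevhead} is undefined when it is interpolated into the output string. A mismatch between the contig IDs in the two files would have the same effect. A minimal sketch of one way to default the count to 0 follows; the helper name `read_count` and the sample counts are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical read counts, as the SAM loop would build them.
my %hash = (contig1 => 3);

# Return the read count for a contig, or 0 when no read mapped to it.
# The defined-or operator (//) needs Perl 5.10 or newer.
sub read_count {
    my ($counts, $contig) = @_;
    return $counts->{$contig} // 0;
}

for my $contig (qw(contig1 contig2)) {
    my $n = read_count(\%hash, $contig);
    print "$contig\t$n\n";
}
```

In the script, the same idea would mean computing `my $count = $hash{$prevhead} // 0;` just before the print and interpolating `$count` instead of `$hash{$prevhead}`.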
