Or, at least, that is what it looks like to me. If anyone could tell me what's going on here, I'd be grateful.
I have aligned the reads in the transcriptome to a reference genome using bwa and stored the results in a bam file. I am then wanting to go through and count how many reads of the reads are hitting each gene.
I attempt to access the mapped reads in the bam file like this:
and this works fine for most of the entries in the bam file. Rougly every 20th or so entry will throw up this:
Use of uninitialized value $start in subtraction (-) at /usr/local/lib64/perl5/Bio/DB/Sam.pm line 1452.
Use of uninitialized value $end in subtraction (-) at /usr/local/lib64/perl5/Bio/DB/Sam.pm line 1452.
The sequence ID that it then returns is the always the id of the last header in corresponding sam file. And the sequence returned is 'N'
I've had a look at the relevant part of Sam.pm
And the fact that it's returning N for the sequence makes sense to me. Not so sure about why it's returning that last header though. Or why the error is occuring in the first place. The iterator should only be getting sequence objects from the bam file. This makes it looks like it's picking up sequence objects from the bam file that aren't there and then throws an error when it can't fetch the sequence id or sequence. Can anyone enlightne me as to what's going on here? Is there anything else I can post here that would help understand the problem?
Cheers
Ben.
I have aligned the reads in the transcriptome to a reference genome using bwa and stored the results in a bam file. I am then wanting to go through and count how many reads of the reads are hitting each gene.
I attempt to access the mapped reads in the bam file like this:
Code:
my $sam = Bio::DB::Sam->new(-bam =>"SRR641211.bam"); my $iterator = $sam->features(-iterator => 1, -flags => {M_UNMAPPED=>0}); while (my $align = $iterator->next_seq) { print $align->seq->id."\n"; print $align->seq->seq."\n" }
Use of uninitialized value $start in subtraction (-) at /usr/local/lib64/perl5/Bio/DB/Sam.pm line 1452.
Use of uninitialized value $end in subtraction (-) at /usr/local/lib64/perl5/Bio/DB/Sam.pm line 1452.
The sequence ID that it then returns is the always the id of the last header in corresponding sam file. And the sequence returned is 'N'
I've had a look at the relevant part of Sam.pm
Code:
1446 sub can_do_seq { 1447 my $self = shift; 1448 my $obj = shift; 1449 return 1450 UNIVERSAL::can($obj,'seq') || 1451 UNIVERSAL::can($obj,'fetch_sequence'); 1452 } 1453 1454 1455 sub seq { 1456 my $self = shift; 1457 my ($seqid,$start,$end) = @_; 1458 my $fai = $self->fai or return 'N' x ($end-$start+1); 1459 return $fai->can('seq') ? $fai->seq($seqid,$start,$end) 1460 :$fai->can('fetch_sequence') ? $fai->fetch_sequence($seqid,$start,$end) 1461 :'N' x ($end-$start+1); 1462 }
Cheers
Ben.