SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
Cufflinks returning elevated FPKM values for 'new' transcripts gwilymh RNA Sequencing 0 12-31-2013 01:07 PM
BWA sam and Samtools sam->bam conversion problem maasha Bioinformatics 6 06-05-2013 07:39 AM
Issue with Sam-Bam conversion samtools - how to remove last line of Sam file? TabeaK Bioinformatics 3 11-19-2012 10:05 AM
GFF to GTF, and GTF to GRanges objects lewewoo Bioinformatics 2 04-03-2012 02:52 PM
Staden package question: is srf2fastq returning an error, or is this normal? mhayes Bioinformatics 4 12-12-2011 10:54 AM

Reply
 
Thread Tools
Old 02-12-2014, 03:22 PM   #1
tirohia
Member
 
Location: Auckland, NZ

Join Date: Nov 2011
Posts: 46
Default iterator in sam.pm returning nonexistant objects.

Or, at least, that is what it looks like to me. If anyone could tell me what's going on here, I'd be grateful.

I have aligned the reads in the transcriptome to a reference genome using bwa and stored the results in a bam file. I am then wanting to go through and count how many reads of the reads are hitting each gene.

I attempt to access the mapped reads in the bam file like this:

Code:
my $sam = Bio::DB::Sam->new(-bam  =>"SRR641211.bam");
my $iterator = $sam->features(-iterator => 1,  -flags    => {M_UNMAPPED=>0});

while (my $align = $iterator->next_seq) {
        print $align->seq->id."\n";
        print $align->seq->seq."\n"
}
and this works fine for most of the entries in the bam file. Rougly every 20th or so entry will throw up this:
Use of uninitialized value $start in subtraction (-) at /usr/local/lib64/perl5/Bio/DB/Sam.pm line 1452.
Use of uninitialized value $end in subtraction (-) at /usr/local/lib64/perl5/Bio/DB/Sam.pm line 1452.

The sequence ID that it then returns is the always the id of the last header in corresponding sam file. And the sequence returned is 'N'

I've had a look at the relevant part of Sam.pm

Code:
1446 sub can_do_seq {
1447     my $self = shift;
1448     my $obj  = shift;
1449     return
1450         UNIVERSAL::can($obj,'seq') ||
1451         UNIVERSAL::can($obj,'fetch_sequence');
1452 }
1453
1454
1455 sub seq {
1456     my $self = shift;
1457     my ($seqid,$start,$end) = @_;
1458     my $fai = $self->fai or return 'N' x ($end-$start+1);
1459     return $fai->can('seq')            ? $fai->seq($seqid,$start,$end)
1460           :$fai->can('fetch_sequence') ? $fai->fetch_sequence($seqid,$start,$end)
1461           :'N' x ($end-$start+1);
1462 }
And the fact that it's returning N for the sequence makes sense to me. Not so sure about why it's returning that last header though. Or why the error is occuring in the first place. The iterator should only be getting sequence objects from the bam file. This makes it looks like it's picking up sequence objects from the bam file that aren't there and then throws an error when it can't fetch the sequence id or sequence. Can anyone enlightne me as to what's going on here? Is there anything else I can post here that would help understand the problem?

Cheers
Ben.
tirohia is offline   Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 07:15 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO