SEQanswers

04-16-2019, 04:15 PM   #1
cmccabe
Senior Member
 
Location: chicago

Join Date: Jul 2012
Posts: 353
filter secondary reads in bam with pysam

I am new to Python and trying to learn. Below is an attempt to filter out secondary reads in all BAM files in a directory using pysam. I ran mark duplicates and got an error on several BAM files due to secondary alignments. I added comments to each line to illustrate my thought process. Thank you.
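For reference, "secondary" here means bit 0x100 of the SAM FLAG field, which is what pysam exposes as `is_secondary`. Supplementary alignments use a separate bit (0x800), so a secondary-only filter leaves them in place. A small sketch of decoding those bits by hand; the helper name `describe_flag` is made up for illustration and is not part of pysam:

```python
# FLAG bit values as defined in the SAM specification.
UNMAPPED = 0x4         # read is unmapped
SECONDARY = 0x100      # secondary ("not primary") alignment
SUPPLEMENTARY = 0x800  # supplementary (e.g. chimeric) alignment


def describe_flag(flag):
    """Classify a SAM FLAG integer (hypothetical helper, for illustration only)."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SECONDARY:
        return "secondary"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    return "primary"
```

For example, `describe_flag(355)` gives `"secondary"` because 355 = 99 + 256, while `describe_flag(99)` gives `"primary"`.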

The specific mark duplicates error:

Value was put into PairInfoMap more than once
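One way to check whether a file is likely to trigger this is to count primary records per read name and mate. This is only a sketch, and it rests on an assumption: that the error appears when the same read name has more than one primary record for a given mate. The function names are invented for illustration; the attributes used (`query_name`, `is_secondary`, `is_supplementary`, `is_read2`) are those of pysam's `AlignedSegment`.

```python
from collections import Counter


def count_primary_records(reads):
    """Count primary records per (read name, mate number).

    `reads` is any iterable of objects with the pysam AlignedSegment
    attributes query_name, is_secondary, is_supplementary and is_read2.
    """
    counts = Counter()
    for read in reads:
        if read.is_secondary or read.is_supplementary:
            continue  # only primary records are counted
        mate = 2 if read.is_read2 else 1
        counts[(read.query_name, mate)] += 1
    return counts


def repeated_primaries(reads):
    """Return the (name, mate) keys that have more than one primary record."""
    counts = count_primary_records(reads)
    return sorted(key for key, n in counts.items() if n > 1)
```

With pysam installed, this could be called as `repeated_primaries(pysam.AlignmentFile(path, "rb").fetch(until_eof=True))`, where `path` is a placeholder for one of the failing BAM files. An empty list would suggest the cause lies elsewhere.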

Code:
 
    #!/usr/bin/python
    ## run as: script.py input.bam output.bam
    import sys  ## import python system functions
    import pysam  ## import module

    in_path, out_path = sys.argv[1], sys.argv[2]  ## input and output bam paths from the command line
    bam = pysam.AlignmentFile(in_path, "rb")  ## open input bam for reading
    out = pysam.AlignmentFile(out_path, "wb", template=bam)  ## open output bam, reusing the input header
    print("Reading BAM file")  ## output message
    for read in bam.fetch(until_eof=True):  ## iterate over every read (until_eof=True needs no index)
        if read.is_secondary:  ## not the primary alignment
            continue  ## skip it
        if read.has_tag('AS') and read.has_tag('XS'):  ## alignment score (AS) and suboptimal score (XS) tags both present
            AS = read.get_tag('AS')  ## read and store AS value
            XS = read.get_tag('XS')  ## read and store XS value
            if AS >= XS:  ## alignment score greater than or equal to XS (these will be written)
                out.write(read)  ## write the primary alignment to the output bam
    out.close()  ## flush and close the output
    bam.close()  ## end processing
Another option:
Code:
#!/usr/bin/python
## run as: script.py input.bam output.bam
import sys  ## import python system functions
import pysam  ## import module

in_path, out_path = sys.argv[1], sys.argv[2]  ## input and output bam paths from the command line
bam = pysam.AlignmentFile(in_path, "rb")  ## open input bam for reading
out = pysam.AlignmentFile(out_path, "wb", template=bam)  ## open output bam, reusing the input header
print("Reading BAM file")  ## output message
for read in bam.fetch(until_eof=True):  ## start loop and iterate over each read (until_eof=True needs no index)
    if read.is_secondary:  ## not the primary alignment
        continue
    if read.has_tag('AS') and read.has_tag('XS'):  ## alignment score (AS) and suboptimal score (XS) tags both present
        AS = read.get_tag('AS')  ## read and store AS value
        XS = read.get_tag('XS')  ## read and store XS value
        if AS < XS:  ## alignment score lower than XS (these are dropped)
            continue
    out.write(read)  ## write remaining primary alignments, including reads without both tags
out.close()  ## flush and close the output
bam.close()  ## end processing
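
Both scripts open a single file, while the goal is every BAM file in a directory. A rough sketch of how the filter could be wrapped in a loop over a directory follows. The function names, the `.primary.bam` output suffix, and the choice to keep reads that lack either tag are all assumptions made for illustration, not settled parts of the approach.

```python
import glob
import os


def keep_read(read):
    """Return True for reads the filter should keep.

    Drops secondary alignments and reads whose AS tag is lower than
    their XS tag; reads lacking either tag are kept (an assumption).
    """
    if read.is_secondary:
        return False
    if read.has_tag("AS") and read.has_tag("XS"):
        return read.get_tag("AS") >= read.get_tag("XS")
    return True


def output_path(in_path, suffix=".primary.bam"):
    """Derive an output name next to the input: x.bam becomes x.primary.bam."""
    root, _ = os.path.splitext(in_path)
    return root + suffix


def filter_directory(directory):
    """Filter every *.bam file in `directory`, writing one new file per input."""
    import pysam  # imported here so the helpers above work without pysam

    for in_path in sorted(glob.glob(os.path.join(directory, "*.bam"))):
        if in_path.endswith(".primary.bam"):
            continue  # skip outputs left by an earlier run
        with pysam.AlignmentFile(in_path, "rb") as bam, \
                pysam.AlignmentFile(output_path(in_path), "wb", template=bam) as out:
            # until_eof=True reads the file sequentially, so no index is needed
            for read in bam.fetch(until_eof=True):
                if keep_read(read):
                    out.write(read)
```

Calling `filter_directory("/path/to/bams")` (a placeholder path) would leave the original files untouched and write a filtered copy beside each one. Keeping the decision in `keep_read` makes it easy to change the rule, for example to also drop supplementary alignments.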

Last edited by cmccabe; 04-17-2019 at 04:44 AM. Reason: added option 2

Tags
pysam, python
