SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
mapping 3'ends frobert Bioinformatics 0 02-14-2012 01:00 PM
unmapped reads assigned to chromosomes efoss Bioinformatics 4 12-01-2011 01:24 PM
Paired-end reads mapping to different chromosomes... rareaquaticbadger Bioinformatics 1 06-27-2011 04:24 AM
Exon-Junction mapping: re-assigning CDS-mapped reads to chromosomes sridharacharya RNA Sequencing 1 10-21-2010 04:07 PM
tophat paired reads mapped to different chromosomes wenhuang Bioinformatics 3 06-23-2010 05:23 AM

Reply
 
Thread Tools
Old 11-03-2011, 09:09 AM   #1
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 310
Default Reads mapping of ends of chromosomes

First, does anybody have a script that will trim reads that map off the ends of chromosomes in SAM files?

Second, in case anyone needs a perl script that will 'take care' of reads that map off the ends of chromosomes in BED files I wrote one. Seems like it works. Maybe trivial to some but not bad for a salty old bench scientist. I think it works but I am no scripting expert. If anyone has reason to believe otherwise I'd like to know.

Code:
#/usr/bin/perl
use strict; use warnings;

#bedEndRepair.pl by Ethan Ford.  Use at own risk!!!!!


#bedEndRepair.pl:  Takes reads that map off the end of chromosomes in a bed file and repositions them as a two nucleotide read at
#					end of the chromosome.

#Usage: 1) make a fasta index file with samtools with the following command: 'samtools faidx <ref.fasta>' or
#			make a 2 column tab deliminted text file. The first column contains the chromosome name written as it is in column 1 of your bed file, e.g. chr1,
#			and the second column has the length of the corresponding chromosome (and integer)

#		2) type: "perl bedEndRepair.pl path/to/your/indexfile.fai path/to/your/bedfile.bed" into the command line.
#		3) Output is saved in your current working dirctory with the name yourbedfile.repaired.bed

my $outfile = $ARGV[1];
$outfile =~ s/\.bed$/\.repaired.bed/;
open(BEDOUT, ">$outfile") or die("Failed to open output file");

open(INDEXFILE, $ARGV[0]) or die "Failed to open bed file";
open(BEDFILE, $ARGV[1]) or die "Failed to open index file";


# turns indexfile.fai into a hash
my %indexhash;

while (<INDEXFILE>) {	
	chomp;
	my ($indexchr, $indexlength) = split("\t", $_);
	$indexhash{$indexchr} = $indexlength;
};

while (<BEDFILE>) {
	chomp;
	(my ($chr), my ($start), my ($stop), my ($c4), my ($c5), my ($strand)) = split("\t");
		
	if ($start < 1) 
			{print BEDOUT $chr, "\t", '1', "\t", '2', "\t", $c4, "\t", $c5, "\t", $strand, "\n";}
			
	elsif ($stop > $indexhash{"$chr"})
			{print BEDOUT $chr, "\t", $indexhash{"$chr"}-1, "\t", $indexhash{"$chr"}, "\t", $c4, "\t", $c5, "\t", $strand, "\n";}
						
	else
			{print BEDOUT $chr, "\t", $start, "\t", $stop, "\t", $c4, "\t", $c5, "\t", $strand, "\n";}
							
						
};	
	
close (BEDFILE); close (INDEXFILE); close (BEDOUT);
exit;
__________________
--------------
Ethan
ETHANol is offline   Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 04:19 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO