SEQanswers





06-06-2012, 10:09 AM   #1
JDinis
Junior Member
 
Location: Wisconsin

Join Date: Mar 2012
Posts: 3
Illumina pair end read extraction.

Hi all,

I am working with a data set of 161x161 paired-end sequences generated on the Illumina MiSeq. In short, my goal is to extract all paired-end reads that overlap a 500 bp window of my assembly (the easy part), so that I can show linkage between polymorphisms within this window.

The problem is that I also want to extract all read pairs that span a specific set of polymorphic sites, for example all pairs that overlap positions 700, 750 and 1005 in my MSA (the hard part).

Is there a good program that can extract reads from MSA, SAM or BAM alignments based on specific nucleotide positions? If so, what are your suggestions?

Thanks all for your time.

JD
06-06-2012, 10:52 AM   #2
SeekAnswers
Member
 
Location: USA

Join Date: Mar 2012
Posts: 21

You can specify the coordinates you would like to extract from your BAM with samtools view and write a smaller BAM file.

Then use something like bam2fastq to pull the reads in that region out as FASTQ.
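
For illustration, here is a minimal Python sketch of the region extraction step described above. It assumes samtools is installed and on the PATH and that the BAM is coordinate-sorted and indexed. The file names, the reference name "contig1" and the helper names are hypothetical. Note that a region query only returns reads that overlap the region themselves, so a mate that aligns entirely outside the window is not included.

```python
import subprocess


def region_view_command(bam_path, reference, start, end, out_path):
    """Build a `samtools view` command for reads overlapping reference:start-end.

    Coordinates are 1-based and inclusive, as in samtools region strings.
    """
    region = f"{reference}:{start}-{end}"
    # -b writes BAM output, -o names the output file.
    return ["samtools", "view", "-b", "-o", out_path, bam_path, region]


def extract_region(bam_path, reference, start, end, out_path):
    """Run the command; needs samtools on PATH and an indexed BAM (assumption)."""
    command = region_view_command(bam_path, reference, start, end, out_path)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file and reference names, shown only to print the command.
    print(" ".join(region_view_command("aln.bam", "contig1", 400, 900, "window.bam")))
```

The smaller BAM written this way can then be passed to a converter such as bam2fastq.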
06-06-2012, 11:04 AM   #3
JDinis
Junior Member
 
Location: Wisconsin

Join Date: Mar 2012
Posts: 3

As far as I know (correct me if I am wrong), that samtools function only works for extracting a range, e.g. all reads from 400 to 900 nt on a reference. That will not work for my application.

My goal is to extract paired reads that overlap coordinates 700 and 750 and 1000. It is similar in concept but gives a very different output. If I am wrong, could you please let me know and explain the samtools syntax for such a search?
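
To make the difference from a plain range query concrete, the following is a rough Python sketch of the "all positions at once" filter. It is not a samtools feature. It reads SAM text (for example the output of samtools view), works out which reference positions each read covers from its POS and CIGAR fields, and keeps the read names whose pair together covers every requested position. The function names are hypothetical, and the sketch assumes a single reference sequence, counts a position inside a deletion or skipped region as not covered, and does not filter secondary or supplementary alignments.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def covered_blocks(pos, cigar):
    """Yield 1-based inclusive (start, end) reference blocks with aligned bases."""
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":
            # Only match/mismatch operations put read bases on the reference.
            yield ref, ref + length - 1
        if op in "MDN=X":
            # These operations advance the reference coordinate.
            ref += length


def pairs_covering(sam_lines, positions):
    """Return sorted read names whose alignments together cover all positions."""
    wanted = set(positions)
    hits = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        name, flag, pos, cigar = fields[0], int(fields[1]), int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read
        for start, end in covered_blocks(pos, cigar):
            for p in wanted:
                if start <= p <= end:
                    hits[name].add(p)
    # Both mates share a read name, so their hits accumulate in one set.
    return sorted(name for name, hit in hits.items() if hit == wanted)
```

Calling pairs_covering(lines, [700, 750, 1000]) on the SAM lines of the 500 bp window would return only the pairs that cover all three sites; those names can then be used to pull the matching records.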

Thanks for the reply.
06-06-2012, 08:00 PM   #4
Dario1984
Senior Member
 
Location: Sydney, Australia

Join Date: Jun 2011
Posts: 166

Use the scanBam function from the R package Rsamtools. You can give it a set of ranges of width 1, using the which parameter of ScanBamParam.