SEQanswers

02-24-2015, 07:50 PM   #1
Alun3.1
Junior Member
 
Location: alberta

Join Date: Feb 2015
Posts: 8
Program for aligning particular set of reads to an entire NGS dataset

Hi,

I have about 100 cDNA sequences (let's call them "ref.") and would like to know how many reads from the original Illumina dataset (10 million reads; let's call them "reads") align to them without gaps, either fully (i.e. the entire ref. sequence is contained in the read; see "read1" below) or partially (see "reads 2, 3, 4" below).

Example:
Code:
ref.                   AGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATC
read1  ATGCGCTAGCTAGCATAGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATCTTGGACCGCATAGCATC
read2              ATTAAGTTCGGCCGCTCACCGCACC
read3                                CCGCACCGTCACGCCATCCAGGCATCATGCGCGATCTCAGC
read4                        GCCGCTCACCGCACC
Is there any "mapping" program to do that?
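
To make the "full" versus "partial" criterion concrete, here is a minimal Python sketch that classifies the four example reads above. It is only an illustration of the definition, not a replacement for a real mapper: the function name `gapless_overlap` and the `min_overlap` threshold are assumptions introduced here, it requires exact matches (no mismatches), it ignores reverse-complement reads, and a brute-force scan like this would be far too slow for 10 million reads.

```python
def gapless_overlap(ref, read, min_overlap=10):
    """Classify how `read` overlaps `ref` with no gaps and no mismatches.

    Returns "full" if the whole reference lies inside the read,
    "partial" if at least `min_overlap` bases overlap exactly,
    and None otherwise. `min_overlap` is an arbitrary illustrative cutoff.
    """
    if ref in read:
        return "full"
    # offset = position of read[0] relative to ref[0]; negative when the
    # read starts before the reference does
    for offset in range(-(len(read) - 1), len(ref)):
        start = max(offset, 0)                   # overlap start, ref coordinates
        end = min(offset + len(read), len(ref))  # overlap end, ref coordinates
        if end - start < min_overlap:
            continue
        # the whole overlapping stretch must match base for base
        if ref[start:end] == read[start - offset:end - offset]:
            return "partial"
    return None


# Sequences copied from the example above
ref = "AGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATC"
reads = {
    "read1": "ATGCGCTAGCTAGCATAGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATCTTGGACCGCATAGCATC",
    "read2": "ATTAAGTTCGGCCGCTCACCGCACC",
    "read3": "CCGCACCGTCACGCCATCCAGGCATCATGCGCGATCTCAGC",
    "read4": "GCCGCTCACCGCACC",
}

for name, seq in reads.items():
    print(name, gapless_overlap(ref, seq))
```

With these inputs, read1 is reported as "full" and reads 2, 3 and 4 as "partial", which matches the intent of the example.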

Can I use Bowtie2 (although it looks a bit complicated to use, given its extensive list of options)? It seems like I would have to input one file containing all the sequences (ref. + reads), which would probably align all the sequences to each other and take ages?
Also, should I use the raw reads (paired-end) or the merged+unmerged reads?

Thanks for your help!
02-26-2015, 10:29 AM   #2
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

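bowtie2 is good. Yes, there are a lot of arguments, but that is because different people want to do different things. For example, in your case you will want to use the non-default '--local' mapping, which allows the ends of a read that overhang the reference to be soft-clipped instead of forcing the whole read to align.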

You will not input just one file. Instead you will create an index file for your reference(s) and then input the R1 and R2 read files separately.
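
As a rough sketch of that workflow, the Python snippet below assembles the two command lines: `bowtie2-build` to index the references, then `bowtie2 --local` with the R1 and R2 files given separately via `-1` and `-2`. All file names and the index prefix are hypothetical placeholders, the helper `bowtie2_commands` is made up for this illustration, and the `--no-unal` and `-p` options are optional additions of mine (suppress SAM records for reads that fail to align, and set the thread count), not something stated above.

```python
import shlex


def bowtie2_commands(ref_fasta, index_prefix, r1, r2, out_sam, threads=4):
    """Return (build_cmd, align_cmd) as argument lists.

    Nothing is executed here; pass each list to
    subprocess.run(cmd, check=True) once bowtie2 is installed.
    """
    # Step 1: build the index from the reference FASTA
    build = ["bowtie2-build", ref_fasta, index_prefix]
    # Step 2: local alignment of the paired-end reads against that index
    align = [
        "bowtie2", "--local", "--no-unal",
        "-p", str(threads),
        "-x", index_prefix,   # index prefix produced by bowtie2-build
        "-1", r1,             # R1 (forward) reads
        "-2", r2,             # R2 (reverse) reads
        "-S", out_sam,        # SAM output file
    ]
    return build, align


# Hypothetical file names, for illustration only
build, align = bowtie2_commands(
    "ref.fasta", "ref_index",
    "reads_R1.fastq", "reads_R2.fastq",
    "ref_vs_reads.sam",
)
print(shlex.join(build))
print(shlex.join(align))
```

Only the ~100 reference sequences are indexed, so the reads are compared against the references and never against each other.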
02-26-2015, 05:08 PM   #3
Alun3.1
Junior Member
 
Location: alberta

Join Date: Feb 2015
Posts: 8

Got it. Thanks, westerman!
02-27-2015, 04:31 AM   #4
archana2287
Junior Member
 
Location: INDIA

Join Date: Feb 2015
Posts: 5

I did mapping using TopHat, where the reference sequences ranged from 150 bp to 50,000 bp in length (I worked on approximately 40,000 reference sequences separately). I mapped the paired-end reads together rather than separately. The two approaches could give slightly or substantially different results; I would expect this to matter most for short references of less than 300 bp (just a hypothetical statement). If R1 and R2 are mapped separately, is the next step to select the reads that mapped in both runs? If so, how is the insert size parameter taken into account? And is there any way to perform local mapping in TopHat?
02-27-2015, 09:46 AM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
Originally Posted by archana2287
I did mapping using TopHat, where the reference sequences ranged from 150 bp to 50,000 bp in length (I worked on approximately 40,000 reference sequences separately). I mapped the paired-end reads together rather than separately. The two approaches could give slightly or substantially different results; I would expect this to matter most for short references of less than 300 bp (just a hypothetical statement). If R1 and R2 are mapped separately, is the next step to select the reads that mapped in both runs? If so, how is the insert size parameter taken into account? And is there any way to perform local mapping in TopHat?
This appears to have limited relevance to this thread, so I suggest you create a new thread to ask the question. And please take your time to phrase it clearly.
All times are GMT -8.

