SEQanswers

01-06-2014, 02:39 AM   #1
Gingeneticist
Junior Member
 
Location: London

Join Date: Jul 2013
Posts: 7
What tool can align a ton of NGS data from a large region to a tiny reference?

Hi all,

I have NGS data from a patient: genomic DNA fragments of 300-600bp, prepared with SureSelect sample prep/target enrichment and sequenced on the MiSeq (2x250bp reads).

The DNA was enriched for a 4Mb region, but I have focussed in on a smaller anomaly and need a way of pulling out the data from one small, specific area. Basically, I want to line up all the reads that contain a certain 37bp sequence and reject everything else.
So far I have tried two bits of software:
NextGENe - great for big alignments, but not suited to this task because I can't specify a region this small to align my data to
Sequencher - should be able to do it in theory, but it can't seem to handle the amount of data I'm asking it to align, and it crashes


Would greatly appreciate all help and suggestions for what software I could try.

Best wishes,
Claire
01-06-2014, 03:11 AM   #2
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245

If you just want an exact match of the 37bp, simply use 'grep'.
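One thing to watch with a plain grep on a FASTQ file: it returns only the matching sequence lines (the read name and quality lines are dropped), and it only finds reads on the same strand as the pattern. Below is a minimal Python sketch of the same exact-match idea that keeps whole records and also checks the reverse complement. It assumes standard 4-line FASTQ with no blank lines; the function names are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

Record = Tuple[str, str, str, str]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fastq_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, plus, quality) from 4-line FASTQ.

    Assumption: every record is exactly four lines and there are no
    blank lines. An incomplete trailing record is silently ignored.
    """
    it = iter(lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        yield (header.rstrip("\n"), seq.rstrip("\n"),
               plus.rstrip("\n"), qual.rstrip("\n"))


def reads_with_target(lines: Iterable[str], target: str) -> Iterator[Record]:
    """Yield records whose sequence contains target on either strand."""
    forward = target.upper()
    reverse = reverse_complement(forward)
    for record in fastq_records(lines):
        seq = record[1].upper()
        if forward in seq or reverse in seq:
            yield record
```

To use it, open the FASTQ file and pass the file handle as `lines`, then write each kept record back out as four lines. For paired reads you would run it on both files and keep a pair if either mate matches, which is a choice this sketch leaves to you.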

If mismatches should be allowed, I think you can do this with bowtie2. Make an index of your 37bp sequence and align the paired reads with the --local option. Be careful though: --local can also report alignments that cover only part of the 37bp sequence. You should probably filter those alignments out.
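The filtering step could be sketched like this in Python, assuming the alignments are in SAM format. It reads the CIGAR string of each alignment, adds up the operations that consume reference bases (M, D, N, = and X), and keeps only alignments that span at least `min_span` reference bases. The function names and the default of 37 are assumptions for this thread's example, not part of bowtie2.

```python
import re
from typing import Iterable, Iterator

# One CIGAR element: a length followed by an operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance along the reference sequence.
_REF_CONSUMING = frozenset("MDN=X")


def reference_span(cigar: str) -> int:
    """Number of reference bases covered by an alignment's CIGAR."""
    return sum(int(length)
               for length, op in _CIGAR_OP.findall(cigar)
               if op in _REF_CONSUMING)


def full_length_alignments(sam_lines: Iterable[str],
                           min_span: int = 37) -> Iterator[str]:
    """Yield SAM alignment lines that span at least min_span reference bases.

    Header lines (starting with '@') and unmapped reads are skipped.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment line
        flag, cigar = int(fields[1]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # read is unmapped
        if reference_span(cigar) >= min_span:
            yield line
```

With 250bp reads against a 37bp reference, a full-length hit would typically look like soft-clipped bases on both sides of a 37-base match (for example `100S37M113S`), while a partial hit has a shorter matched stretch.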

Regards,
Boetsie

Last edited by boetsie; 01-06-2014 at 04:24 AM.
01-06-2014, 03:43 AM   #3
Gingeneticist
Junior Member
 
Location: London

Join Date: Jul 2013
Posts: 7

Thanks boetsie!
01-06-2014, 03:56 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,989

Using "grep" to identify the reads containing the exact sequence of interest, followed by alignment with Sequencher, should work (at least on subsets of the reads, if there are millions).

You may want to de-duplicate the reads after the "grep" selection and before alignment (unless you are interested in counts) to simplify downstream processing and display.
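A minimal sketch of that de-duplication step in Python, assuming the selected reads are already in memory as plain sequence strings. It collapses exact duplicates only (reverse-complement copies are not merged) and keeps the count for each unique sequence in case the counts turn out to matter later. The function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def deduplicate(sequences: Iterable[str]) -> List[Tuple[str, int]]:
    """Collapse identical sequences, keeping first-seen order and counts.

    Returns a list of (sequence, number_of_copies) pairs.
    """
    counts = Counter(sequences)
    # Counter is a dict subclass, so it preserves insertion order
    # (guaranteed from Python 3.7 onwards).
    return list(counts.items())
```

Only the unique sequences then need to go into the alignment, which should make the input much smaller for a tool that struggles with the full data set.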
01-06-2014, 04:35 AM   #5
JamieHeather
@jamimmunology
 
Location: London

Join Date: Nov 2012
Posts: 96

For something like this, you can also use agrep to allow mismatches.
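To show what mismatch-tolerant matching means in practice, here is a small Python sketch that slides the target along a read and accepts a window if it differs at no more than `max_mismatches` positions. This is a simplification and not how agrep itself works: it handles substitutions only, whereas agrep can also allow insertions and deletions. It also searches one strand only. The function names and the default of 2 mismatches are assumptions.

```python
def hamming_within(a: str, b: str, max_mismatches: int) -> bool:
    """True if equal-length strings a and b differ at no more than
    max_mismatches positions (stops counting early once exceeded)."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def find_approximate(read: str, target: str, max_mismatches: int = 2) -> int:
    """Return the first 0-based offset in read where target matches with
    at most max_mismatches substitutions, or -1 if there is none."""
    read, target = read.upper(), target.upper()
    k = len(target)
    for start in range(len(read) - k + 1):
        if hamming_within(read[start:start + k], target, max_mismatches):
            return start
    return -1
```

For a 37bp target and 250bp reads this is roughly 200 window comparisons per read, which is slow in pure Python for millions of reads but fine for trying the idea out on a subset.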
Tags
alignment problem, alignment tool, bioinformatic analysis, ngs, variant discovery
