SEQanswers

Old 09-04-2013, 07:44 AM   #1
rzeng
Member
 
Location: houston

Join Date: Aug 2013
Posts: 19
How to delete all FASTQ reads which include a potential 50bp Illumina Single End

Hi,

I ran FastQC and it flagged a potential 50 bp Illumina Single End PCR Primer 1 sequence in my reads:

AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA (100% over 30bp)

I checked my reads and found that this 50 bp sequence sits at the 5' end of about 0.25% of all reads. (In some of these reads, a short sequence such as GCGCA, GCTCAG, AACCG, or AACAAAAGG comes before the 50 bp sequence.)

Since all my reads are 88 bp long, I do not want to keep these reads even with the 50 bp sequence cut off.

Does anyone know of a tool that can remove every read that contains this 50 bp sequence? Or does anyone have a script or another way to do this?
Old 09-04-2013, 07:48 AM   #2
rzeng
Member
 
Location: houston

Join Date: Aug 2013
Posts: 19

To clarify the question above: I want to get rid of the reads that contain the sequence AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA. Since the reads containing this 50 bp sequence account for only 0.25% of the total, the Fastq toolkit trimmer and other trimming tools cannot help.
Old 09-04-2013, 08:13 AM   #3
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372

Map your reads against this sequence with something like bowtie2?
Old 09-04-2013, 08:25 AM   #4
rzeng
Member
 
Location: houston

Join Date: Aug 2013
Posts: 19

No. I need to remove the reads that contain this 50 bp contaminating sequence from my library before I map them with BWA.
Old 09-04-2013, 08:50 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

If the reads account for only 0.25% of the total, why are you worried about them?
Old 09-04-2013, 09:00 AM   #6
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,178

Use a read filtering/trimming application which includes adapter detection and removal. My choice is Trimmomatic.
Old 09-04-2013, 09:02 AM   #7
dariober
Senior Member
 
Location: Cambridge, UK

Join Date: May 2010
Posts: 311

Quote:
Originally Posted by rzeng View Post
To clarify the question above: I want to get rid of the reads that contain the sequence AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA. Since the reads containing this 50 bp sequence account for only 0.25% of the total, the Fastq toolkit trimmer and other trimming tools cannot help.
For example, I would use cutadapt with the "--discard-trimmed" option. Anyway, as GenoMax suggested, you might simply ignore these reads, which wouldn't align anyway if the adapter makes up a big chunk of the read.
Dario
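
To illustrate the idea of discarding any read in which an adapter is detected, here is a minimal Python sketch. It is not cutadapt's algorithm: it only counts mismatches (no indels) between the primer and the read at each of the first few start positions, which is enough to cover the short GCGCA/GCTCAG/AACCG/AACAAAAGG prefixes mentioned in the question. The function names and the thresholds (max_offset, min_overlap, max_error_rate) are assumptions chosen for illustration, not values from any tool.

```python
# Primer sequence reported by FastQC in the original question.
ADAPTER = "AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA"


def count_mismatches(read, adapter, offset):
    """Compare the adapter to the read starting at `offset`.

    Returns (mismatches, overlap), where overlap is the number of
    bases that were actually compared.
    """
    segment = read[offset:offset + len(adapter)]
    mismatches = sum(1 for a, b in zip(segment, adapter) if a != b)
    return mismatches, len(segment)


def has_adapter(read, adapter=ADAPTER, max_offset=10,
                min_overlap=30, max_error_rate=0.1):
    """Return True if the adapter is found near the 5' end of the read.

    The adapter may start at any position from 0 to max_offset, and a
    fraction of mismatching bases up to max_error_rate is tolerated.
    All thresholds here are illustrative assumptions.
    """
    read = read.upper()
    for offset in range(max_offset + 1):
        mismatches, overlap = count_mismatches(read, adapter, offset)
        if overlap >= min_overlap and mismatches <= max_error_rate * overlap:
            return True
    return False
```

A read for which has_adapter() returns True would be dropped entirely rather than trimmed, which is the behaviour the original poster asked for.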
Old 09-05-2013, 10:13 AM   #8
FroggyFlox
Junior Member
 
Location: Florida, US

Join Date: Feb 2012
Posts: 4

If you don't want to deal with Trimmomatic, you can also have a look at the Galaxy platform. Use the tool called 'Manipulate Fastq', which lets you select all the reads containing your sequence and do whatever you want with them, including deleting them.
https://main.g2.bx.psu.edu/
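
For anyone who prefers a script, the same "select reads containing the sequence and delete them" step can be sketched in a few lines of Python. This is only a rough sketch under two assumptions: the FASTQ file uses exactly four lines per record (no wrapped sequences), and an exact match of the full 50 bp sequence is sufficient. The function names read_fastq and filter_fastq are made up for this example.

```python
# Sequence to filter on, taken from the original question.
UNWANTED = "AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA"


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            # End of file reached.
            return
        yield tuple(line.rstrip("\n") for line in record)


def filter_fastq(in_handle, out_handle, unwanted=UNWANTED):
    """Write every read that does NOT contain `unwanted` to out_handle.

    Returns a (kept, dropped) tuple of read counts.
    """
    kept = dropped = 0
    for header, seq, plus, qual in read_fastq(in_handle):
        if unwanted in seq.upper():
            dropped += 1
            continue
        out_handle.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
        kept += 1
    return kept, dropped
```

To use it, open the input and output files (for example a hypothetical reads.fastq and filtered.fastq) and pass the two handles to filter_fastq(). The dropped count it returns can be checked against the 0.25% figure reported above.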
Old 09-05-2013, 10:43 AM   #9
rzeng
Member
 
Location: houston

Join Date: Aug 2013
Posts: 19

Thank you, everyone.