SEQanswers

03-17-2011, 12:47 AM   #1
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628
Keep large paired-end FASTQ datasets in sync

Hi Folks,

probably a common situation.

I have an Illumina PE dataset (qseq converted to FASTQ), and I'd like to
remove some adapter sequences or clip low-quality ends (e.g. with
the FASTX-Toolkit).

Usually I end up with a dataset where a read in the read1 FASTQ doesn't
always have a mate/counterpart in the read2 FASTQ, because most tools
don't care about pairs. OK.

Are there tools available for e.g. inserting dummy sequences at positions
where the mate/counterpart is missing? Or, vice versa, for removing
a single read if it has no mate/counterpart?

Or do I have to write it on my own? I am just curious ..
Having a bunch of HiSeq lanes makes this task tedious ;-)

Sven
03-17-2011, 02:13 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

There are scripts out there to do this kind of thing, e.g. my "Divide FASTQ file into paired and unpaired reads" tool, which comes with a wrapper for Galaxy; see http://community.g2.bx.psu.edu/. It takes a single mixed FASTQ file and works out the pairs based on the read names.
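To illustrate the idea of working out pairs from read names, here is a minimal sketch in plain Python. It is not the Galaxy tool itself. It assumes four-line FASTQ records and the older Illumina naming convention where mates end in /1 and /2; the function names (read_fastq, pair_key, divide_fastq) are made up for this example. Unmatched reads are held in memory until their mate turns up, so for whole HiSeq lanes you would want something more frugal.

```python
import io


def read_fastq(handle):
    """Yield (title, sequence, quality) tuples from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def pair_key(title):
    """Return (template name, mate number); mate is 0 if there is no /1 or /2 suffix."""
    name = title.split()[0]
    if name.endswith("/1"):
        return name[:-2], 1
    if name.endswith("/2"):
        return name[:-2], 2
    return name, 0


def divide_fastq(handle):
    """Split a mixed FASTQ into a list of (read1, read2) pairs and a list of orphans."""
    pending = {}  # template name -> (mate number, record) still waiting for a partner
    pairs, singles = [], []
    for record in read_fastq(handle):
        template, mate = pair_key(record[0])
        other = pending.get(template)
        if mate and other and other[0] != mate:
            # Partner found: emit the pair with the /1 read first.
            del pending[template]
            first, second = (other[1], record) if other[0] == 1 else (record, other[1])
            pairs.append((first, second))
        elif mate and other is None:
            pending[template] = (mate, record)
        else:
            # No usable suffix, or a duplicate of a read already waiting.
            singles.append(record)
    singles.extend(rec for _, rec in pending.values())
    return pairs, singles


if __name__ == "__main__":
    demo = "@r1/1\nACGT\n+\nIIII\n@r2/1\nGGGG\n+\nIIII\n@r1/2\nTTTT\n+\nIIII\n"
    pairs, singles = divide_fastq(io.StringIO(demo))
    print(len(pairs), "pair(s),", len(singles), "orphan(s)")
```

Newer Illumina headers put the mate number after a space instead of using /1 and /2, so pair_key would need adjusting for those.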
03-17-2011, 03:11 AM   #3
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

This tool isn't in the tools section, is it? I couldn't find it. Thanks.
03-17-2011, 03:17 AM   #4
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

It isn't part of the default install available at http://usegalaxy.org, if that's what you mean. The Galaxy Tool Shed (http://community.g2.bx.psu.edu/) is for people who run their own local Galaxy and want to add extra tools. As written, this particular tool uses the Galaxy library functions for file parsing, so it can't easily be used on its own. It could be modified to use Biopython, for instance.
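For the two-file case from the original question (separate read1 and read2 FASTQ files that have drifted out of sync after filtering), a standalone version might look roughly like the sketch below. It is an illustration under assumptions, not a port of the Galaxy tool: it assumes four-line records, unique read names with /1 and /2 suffixes, and that both files are still in their original relative order. The function names are hypothetical. The hand-written parser is only there to keep the example dependency-free; Biopython's FastqGeneralIterator (in Bio.SeqIO.QualityIO) yields the same (title, sequence, quality) tuples and could be swapped in.

```python
def read_fastq(handle):
    """Yield (title, sequence, quality) tuples from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip stray blank lines
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def template_name(title):
    """Strip a trailing /1 or /2 so both mates map to the same name."""
    name = title.split()[0]
    return name[:-2] if name.endswith(("/1", "/2")) else name


def shared_templates(path1, path2):
    """First pass: collect the names present in both files (names only, not sequences)."""
    with open(path1) as h1, open(path2) as h2:
        names1 = {template_name(title) for title, _, _ in read_fastq(h1)}
        names2 = {template_name(title) for title, _, _ in read_fastq(h2)}
    return names1 & names2


def sync_pair(in1, in2, out1, out2):
    """Second pass: write only reads whose mate survived; return the number of pairs kept."""
    keep = shared_templates(in1, in2)
    for src, dst in ((in1, out1), (in2, out2)):
        with open(src) as fin, open(dst, "w") as fout:
            for title, seq, qual in read_fastq(fin):
                if template_name(title) in keep:
                    fout.write(f"@{title}\n{seq}\n+\n{qual}\n")
    return len(keep)
```

Calling sync_pair("lane1_1.fastq", "lane1_2.fastq", "lane1_1.sync.fastq", "lane1_2.sync.fastq") (file names invented here) would drop the orphans from both files. Each file is read twice and only the read names are kept in memory, which is a trade-off chosen for simplicity; it still needs one set entry per read.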
03-17-2011, 03:28 AM   #5
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

Ah, ok, now I got it ;-) Thanks.