SEQanswers

Old 09-28-2012, 02:45 AM   #1
hanshart
Member
 
Location: Germany

Join Date: Nov 2011
Posts: 27
assemble sequences to

Hi,
FastQC can report duplicated sequences during quality control of FASTQ data. But not all of them are really "unique": some are only partial sequences of a larger parental sequence (e.g. a contaminant). If I put all the duplicated sequences in a text file (or FASTA), is it possible to "merge" them with any software? Thank you.
(I already tried MUSCLE, but it does not seem to be what I want.)
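
To make the "merge" idea concrete, here is a minimal Python sketch, not a recommended tool. It assumes the sequences are plain strings, that "partial" means an exact substring of a longer sequence, and that reverse complements, mismatches and partial overlaps can be ignored. The function name collapse_contained and the example sequences are made up for illustration.

```python
def collapse_contained(seqs):
    """Keep only sequences that are not an exact substring of a longer kept one.

    Assumptions (not from the thread): exact matching only, no reverse
    complements, no mismatches, no joining of partially overlapping sequences.
    """
    # Normalise case, drop empty lines and exact duplicates,
    # then visit the longest sequences first (ties broken alphabetically).
    unique = sorted(
        {s.strip().upper() for s in seqs if s.strip()},
        key=lambda s: (-len(s), s),
    )
    parents = []
    for s in unique:
        # A sequence is dropped when some already-kept sequence contains it.
        if not any(s in p for p in parents):
            parents.append(s)
    return parents


# Hypothetical input: two fragments of one parent, one unrelated sequence,
# and a lower-case duplicate of the parent.
example = [
    "ACGTACGTTTGACCA",
    "GTACGTTTG",
    "TTGACCA",
    "GGGCCCAAATTT",
    "acgtacgtttgacca",
]
print(collapse_contained(example))
```

This only removes fragments that are fully contained in another sequence. Joining sequences that merely overlap at their ends is an assembly problem and needs a real assembler.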
Old 10-07-2012, 12:28 AM   #2
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

Combining your sequences in this way will cause problems later on when de novo assembling or aligning, because it will change the coverage.

Why do you think it is 'contamination' ?
Old 10-07-2012, 12:34 PM   #3
hanshart
Member
 
Location: Germany

Join Date: Nov 2011
Posts: 27

Quote:
Originally Posted by Torst View Post
Combining your sequences in this way will cause problems later on when de novo assembling or aligning, because it will change the coverage.

Why do you think it is 'contamination' ?
It's rRNA. But many sequences occur more than once. Sequences that occur many times could be identified this way and filtered out in subsequent experiments. That's the idea behind the question.
Old 10-07-2012, 07:43 PM   #4
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

So this is RNA-Seq data, not genomic DNA. If you know what species it is, align all the reads to its 5S/16S/23S rRNA sequences, and then pass the UNALIGNED reads to FastQC.
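
One possible way to script those steps is sketched below in Python. The post does not name an aligner, so Bowtie 2 is an assumption here, as are single-end reads. All file names and the helper name rrna_filter_commands are hypothetical. The sketch only builds and prints the command lines; it does not run them.

```python
import shlex


def rrna_filter_commands(rrna_fasta, reads_fastq,
                         index_prefix="rrna_index",
                         unaligned_fastq="non_rrna.fastq"):
    """Return the three command lines for: index rRNA, align reads, run FastQC.

    Assumptions (not from the thread): Bowtie 2 as the aligner and
    single-end reads supplied with -U.
    """
    return [
        # 1. Build a Bowtie 2 index from the rRNA reference sequences.
        ["bowtie2-build", rrna_fasta, index_prefix],
        # 2. Align the reads; --un writes reads that failed to align to a
        #    file, and the alignments themselves are discarded.
        ["bowtie2", "-x", index_prefix, "-U", reads_fastq,
         "--un", unaligned_fastq, "-S", "/dev/null"],
        # 3. Run FastQC on the reads that did not match rRNA.
        ["fastqc", unaligned_fastq],
    ]


# Print the commands so they can be reviewed before running them by hand.
for cmd in rrna_filter_commands("rrna.fasta", "reads.fastq"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

For paired-end data the Bowtie 2 options differ (-1/-2 for input and --un-conc for pairs that fail to align concordantly), so the second command would need adjusting.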
All times are GMT -8.