SEQanswers

Old 11-18-2015, 08:41 AM   #1
akashrestha
Junior Member
 
Location: USA

Join Date: Sep 2013
Posts: 7
Overrepresented sequences in Genomic DNA sequence data from Illumina

Good morning everyone,
I am new to whole-genome sequencing analysis; if there is already a thread on this type of problem, I would be grateful if you could point me to it. I am currently working on a comparative analysis of plant genome (DNA) sequences. We received paired-end sequence data from Illumina, used FastQC to check the quality, and found > 0.20% overrepresented sequences (from TruSeq adapters). So I am looking for answers to some questions about those overrepresented sequences.
1) Do I need to remove those overrepresented sequences from the raw genomic DNA data before proceeding to downstream analysis?
2) If I remove them, the paired files (R1 and R2) may end up with unequal numbers of reads. Removing the unpaired reads would then discard a big chunk of single reads from the R1 and R2 files. Is there any way to use those single reads from both files in downstream analysis, for instance when mapping to a reference genome and in annotation?

Thank you in advance.
akashrestha
Old 11-18-2015, 09:16 AM   #2
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

0.2% is not a lot.

Whether you remove adapters depends on what you are going to do with your data. It matters more if, say, you don't have a reference genome and are going to do de novo assembly.

Depending on how many reads/what level of coverage you have, you can leave out reads that remain unpaired after trimming. Some software may be able to use both the paired and unpaired reads (in separate files).

I like to use trimmomatic

http://www.usadellab.org/cms/?page=trimmomatic

for removing adapters, but there are other programs.
Trimmomatic will separate your trimmed reads into paired and unpaired.
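
To illustrate what "paired and unpaired" means after trimming, here is a minimal Python sketch. It is not Trimmomatic's code; the function name `split_pairs` and the `min_len` cutoff are made up for the example. It assumes the two lists hold the same pairs in the same order, as R1 and R2 files normally do.

```python
def split_pairs(r1, r2, min_len=1):
    """Partition trimmed read pairs into kept pairs and orphans.

    r1, r2: lists of (name, sequence) tuples in matching order, holding the
    reads as they look AFTER trimming. A read "survives" if its trimmed
    sequence is at least min_len bases long (hypothetical cutoff).
    """
    paired_1, paired_2, orphans_1, orphans_2 = [], [], [], []
    for (name_1, seq_1), (name_2, seq_2) in zip(r1, r2):
        keep_1 = len(seq_1) >= min_len
        keep_2 = len(seq_2) >= min_len
        if keep_1 and keep_2:
            # Both mates survived: they stay together and stay in sync.
            paired_1.append((name_1, seq_1))
            paired_2.append((name_2, seq_2))
        elif keep_1:
            orphans_1.append((name_1, seq_1))  # mate from R2 was dropped
        elif keep_2:
            orphans_2.append((name_2, seq_2))  # mate from R1 was dropped
        # If neither survives, the whole pair is discarded.
    return paired_1, paired_2, orphans_1, orphans_2
```

The two paired lists always have the same length, which is what aligners expect from an R1/R2 file pair; the orphans go to separate files.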
Old 11-18-2015, 11:26 AM   #3
akashrestha
Junior Member
 
Location: USA

Join Date: Sep 2013
Posts: 7

Thank you mastal for your reply,

I am going to do a comparative analysis between the sequences to identify structural variations, indels and SNPs.

You mentioned that some software can use paired and unpaired files separately. Could you please give me a link to that software?

Thanks.
Old 11-18-2015, 11:55 AM   #4
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

I was thinking of velvet, for de novo assembly.

Other software will have their own particular requirements.
Old 11-18-2015, 11:59 AM   #5
akashrestha
Junior Member
 
Location: USA

Join Date: Sep 2013
Posts: 7

I am going to align to a reference genome rather than assemble with Velvet. So, is there any software that can use unpaired reads in addition to paired reads when mapping to a reference genome?
Old 11-18-2015, 12:11 PM   #6
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

There is a whole thread on the subject of aligning paired and unpaired reads together with BWA on Biostars.
https://www.biostars.org/p/140318/

The gist is that you are making your life unnecessarily complicated.
Just trim with Trimmomatic, and align the remaining paired reads.

If you absolutely want to align the few unpaired reads remaining after trimming, you can do so following the instructions in the thread posted above. The benefit is dubious, however.
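
A rough sketch of the trim-then-align-pairs route, written as Python helpers that only build the command lines (nothing is executed). The file names, the jar path, the choice of `TruSeq3-PE.fa` as adapter file and the `2:30:10` / `MINLEN:36` values are assumptions for illustration, borrowed from the style of the Trimmomatic quick-start example; adjust them to your own data and check the tool manuals before running anything.

```python
def trimmomatic_pe_cmd(r1, r2, prefix, jar="trimmomatic.jar",
                       adapters="TruSeq3-PE.fa"):
    """Build a Trimmomatic paired-end command (argument list, not run here).

    PE mode takes two inputs and four outputs, in this order:
    R1 paired, R1 unpaired, R2 paired, R2 unpaired.
    jar and adapters are placeholder paths (assumptions).
    """
    return [
        "java", "-jar", jar, "PE", r1, r2,
        f"{prefix}_1P.fq.gz", f"{prefix}_1U.fq.gz",
        f"{prefix}_2P.fq.gz", f"{prefix}_2U.fq.gz",
        f"ILLUMINACLIP:{adapters}:2:30:10",  # example thresholds only
        "MINLEN:36",                         # example length cutoff only
    ]


def bwa_mem_cmd(reference, r1_paired, r2_paired):
    """Build a 'bwa mem' command for the surviving pairs.

    Assumes the reference was already indexed with 'bwa index'.
    bwa mem writes SAM to standard output, so redirect it to a file.
    """
    return ["bwa", "mem", reference, r1_paired, r2_paired]


trim = trimmomatic_pe_cmd("sample_R1.fq.gz", "sample_R2.fq.gz", "sample")
align = bwa_mem_cmd("reference.fa", trim[6], trim[8])  # the two *P files
print(" ".join(trim))
print(" ".join(align), "> sample.sam")
```

Only the two "P" (paired) outputs are passed to the aligner; the "U" (unpaired) files are simply left out.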
Old 11-18-2015, 12:14 PM   #7
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Most mapping programs work with either paired or unpaired reads. With BBMap, for example, you would run the program twice (once for paired reads, once for unpaired reads) and merge the resulting mapped output.
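
As a sketch only, the two-run approach could look like the following Python helper, which builds the command lines without running them. The file names are placeholders, and using samtools to sort and merge the two outputs is an assumption of this example; any tool that merges alignment files would do.

```python
def bbmap_two_pass_cmds(reference, r1, r2, singles, prefix):
    """Return the commands for mapping pairs and singletons separately,
    then merging. All file names are hypothetical placeholders."""
    paired_sam = f"{prefix}_paired.sam"
    single_sam = f"{prefix}_unpaired.sam"
    paired_bam = f"{prefix}_paired.sorted.bam"
    single_bam = f"{prefix}_unpaired.sorted.bam"
    return [
        # Run 1: paired reads (in= and in2= give the two mate files).
        ["bbmap.sh", f"ref={reference}", f"in={r1}", f"in2={r2}",
         f"out={paired_sam}"],
        # Run 2: unpaired reads (single input, no in2=).
        ["bbmap.sh", f"ref={reference}", f"in={singles}",
         f"out={single_sam}"],
        # samtools merge expects inputs sorted the same way, so sort first.
        ["samtools", "sort", "-o", paired_bam, paired_sam],
        ["samtools", "sort", "-o", single_bam, single_sam],
        ["samtools", "merge", f"{prefix}_merged.bam", paired_bam, single_bam],
    ]


for cmd in bbmap_two_pass_cmds("reference.fa", "R1.fq", "R2.fq",
                               "singletons.fq", "sample"):
    print(" ".join(cmd))
```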

However, there is no reason to have singletons left over after adapter-trimming. Adapter-trimming paired reads should yield paired reads of the same length, since if read 1 has adapter at position X, read 2 will also have adapter at position X. If you use BBDuk for trimming as at the top of this thread, you will not end up with any singletons.
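
A toy Python simulation of why the adapter shows up at the same position in both mates. It assumes error-free reads and uses a short adapter string purely for illustration; the function names are made up for this example and this is not how BBDuk is implemented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ADAPTER = "AGATCGGAAGAGC"  # illustrative adapter start, used for both mates


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_pair(insert, read_len, adapter=ADAPTER):
    """Error-free paired reads from one insert.

    Read 1 starts at one end of the insert, read 2 at the other end (on the
    opposite strand). Once the insert runs out, each read continues into
    adapter sequence.
    """
    read_1 = (insert + adapter)[:read_len]
    read_2 = (revcomp(insert) + adapter)[:read_len]
    return read_1, read_2


def adapter_start(read, adapter=ADAPTER, min_overlap=4):
    """Index where the adapter (or a prefix of it at the read end) begins.

    Returns len(read) when no adapter is found.
    """
    for i in range(len(read) - min_overlap + 1):
        tail = read[i:]
        if tail.startswith(adapter) or adapter.startswith(tail):
            return i
    return len(read)


# Insert (12 bp) shorter than the read length (20 bp): both reads hit adapter.
r1, r2 = simulate_pair("ACGTTGCAAGGC", read_len=20)
x1, x2 = adapter_start(r1), adapter_start(r2)
print(x1, x2)                      # same position in both mates
print(len(r1[:x1]), len(r2[:x2]))  # so the trimmed mates have equal length
```

In this idealised case the adapter starts right after the last insert base in both reads, so trimming shortens both mates equally and neither is dropped alone. Real reads contain sequencing errors, which is where trimmers can differ in practice.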