SEQanswers

Old 07-14-2012, 02:53 AM   #1
figo1019
Member
 
Location: germany

Join Date: Jun 2012
Posts: 32
Illumina adapter trimming

Hi All,

I am a total newbie in this field. I have to assemble RNA-seq data, and before that I need to trim the sequences. I have 100 bp Illumina paired-end reads in two files, and I was also given the P5 and P7 adapter sequences:
5-AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATC-(insert)-ACCTTAAGAGCCCACGGTTCCTTGAGGTCAGTGXXXXXXTAGAGCATACGGCAGAAGACGAAC-3

But when I use, for example, grep -c 'AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATC' file_name to count the adapters, I cannot find a single one. I am completely new to this, so I would be grateful if anyone could help me out in detail. I have tried reading the different answers on the forums, but I am confused.

regards
Old 07-14-2012, 05:28 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

You're pretty unlikely to find the entire adapter sequence in any of the reads. You'll want to look into something like cutadapt or trim_galore to make your life easier.
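
One way to see why a grep for the full 55-base sequence comes back empty: a read only picks up adapter when the insert is shorter than the read, and then only as much of the adapter as fits before the read ends. Below is a minimal Python sketch that counts reads containing just the first few bases of an adapter. The adapter shown is an assumed 3' adapter, and the toy records and the 12-base seed length are made up for illustration; none of these values come from this thread.

```python
from typing import Iterable, Iterator


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def count_adapter_hits(lines: Iterable[str], adapter: str, seed_len: int = 12) -> int:
    """Count reads that contain the first `seed_len` bases of `adapter`."""
    seed = adapter[:seed_len]
    return sum(1 for seq in fastq_sequences(lines) if seed in seq)


# Assumed 3' adapter; substitute the sequence documented for your kit.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"

# Toy records: the first read runs 15 bases into the adapter, the second has none.
toy_fastq = [
    "@read1", "ACGTACGTAC" + ADAPTER[:15], "+", "I" * 25,
    "@read2", "ACGTACGTACGGGTTTAAACCCGGG", "+", "I" * 25,
]

full_hits = count_adapter_hits(toy_fastq, ADAPTER, seed_len=len(ADAPTER))
seed_hits = count_adapter_hits(toy_fastq, ADAPTER, seed_len=12)
print(full_hits, seed_hits)  # 0 1
```

For a real file, any iterable of lines works in place of `toy_fastq`, for example an open file handle, or `gzip.open(path, "rt")` for compressed FASTQ.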
Old 07-15-2012, 11:01 AM   #3
figo1019
Member
 
Location: germany

Join Date: Jun 2012
Posts: 32

Quote:
Originally Posted by dpryan View Post
You're pretty unlikely to find the entire adapter sequence in any of the reads. You'll want to look into something like cutadapt or trim_galore to make your life easier.
Hey, thanks dpryan. I tried trim_galore today, but FastQC still reports these overrepresented sequences:

Sequence   Count   Percentage   Possible Source
ATGACACTCAAACAGGCATGCTCCACGGAATACCATGGAGCGCAAGGTGC 1155666 2.5956349017221085 No Hit
AATGACGCTCGAACAGGCATGCCCCTCGGAATACCAAGGGGCGCAATGTG 225179 0.5057538004361837 No Hit
AAGACACTCAAACAGGCATGCCTCTCGGAATACCAAGAGGCGCAAGGTGC 218636 0.4910581711090531 No Hit
GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAA 119619 0.2686652123616139 Illumina RNA PCR Primer (100% over 50bp)
GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA 111925 0.251384428005364 Illumina RNA PCR Primer (100% over 50bp)
AAATGACGCTCAAACAGGCATGCCCTTTGGAATACCAAAGGGCGCAATGT 104210 0.2340564774843778 No Hit
ACAAACCCTTGTGTCGAGGGCTGACTTTCAATAGATCGCAGCGAGGGAGC 71881 0.16144528987673504 No Hit
GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAA 46463 0.10435626248303084 Illumina RNA PCR Primer (100% over 50bp)

So, do I need to remove all of these from my sequences as well? I hope I am not bugging you too much.
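
The "Illumina RNA PCR Primer" hits look like the reverse complement of the 5' adapter from post #1 followed by a run of A's, which would be consistent with adapter read-through on very short inserts. A small Python check of that observation; the function and variable names are illustrative, and the two sequences are copied from the posts above.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


# 5' adapter as posted in #1.
P5_ADAPTER = "AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATC"

# First "Illumina RNA PCR Primer" entry from the FastQC list in this post.
OVERREPRESENTED = "GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAA"

rc = reverse_complement(P5_ADAPTER)
matches = OVERREPRESENTED.startswith(rc)
print(rc)
print(matches)  # True
```

Since the overrepresented sequence starts with the reverse-complemented adapter, that is the orientation a trimmer would need to search for in these reads. The "No Hit" entries do not match FastQC's contaminant list, so this check says nothing about them.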

Regards
Old 11-27-2013, 05:03 PM   #4
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142
Adapter Trimming

Hello.

I have the same question.

FastQC reports which sequences are overrepresented. Does this mean we need to remove them?

How do you trim the adapters? You can use Trimmomatic's ILLUMINACLIP step, but I don't know how to create the adapter.fa file.
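
An adapter file for ILLUMINACLIP is an ordinary FASTA file: a `>name` line followed by the sequence, once per adapter. Here is a minimal Python sketch that writes one. The entry name and output path are placeholders, and the sequence (the reverse complement of the 5' adapter from post #1, as FastQC reported it in post #3) should be replaced with whatever your kit's documentation lists.

```python
from pathlib import Path
from typing import Dict


def write_adapter_fasta(path: Path, adapters: Dict[str, str]) -> None:
    """Write name/sequence pairs as a plain FASTA file, one record per adapter."""
    with path.open("w") as handle:
        for name, seq in adapters.items():
            handle.write(f">{name}\n{seq.upper()}\n")


# Hypothetical entry name; sequence taken from earlier posts in this thread.
adapters = {
    "five_prime_adapter_rc": "GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATT",
}

out_path = Path("adapter.fa")
write_adapter_fasta(out_path, adapters)
print(out_path.read_text())
```

The file is then given as the first field of the step, e.g. `ILLUMINACLIP:adapter.fa:2:30:10`, where the three numbers are the seed mismatches, the palindrome clip threshold and the simple clip threshold. Those values are the ones commonly shown in Trimmomatic's own examples, not a recommendation for any particular data set. Trimmomatic's documentation also describes a naming convention for entries used in its paired-end palindrome mode, which this sketch does not attempt to follow.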

Advice?

But this thread says that if you align with TopHat you don't need to cut the adapters:

http://seqanswers.com/forums/showthread.php?t=19799


"If you ignore the adapters, aligning with TopHat effectively filters them out: the adapters are not in the transcriptome, so when you align your reads to a transcriptome, they will not get aligned."
Old 11-29-2013, 04:13 AM   #5
exo
Member
 
Location: Germany

Join Date: Dec 2012
Posts: 26

I have a relatively dumb question. Doesn't the MiSeq have an integrated adapter trimming option?
Old 12-02-2013, 09:50 AM   #6
microgirl123
Senior Member
 
Location: New England

Join Date: Jun 2012
Posts: 197

The MiSeq has adapter trimming built in if you include it on the sample sheet. We generally do.
Old 04-17-2014, 10:08 AM   #7
cement_head
Senior Member
 
Location: Oxford, Ohio

Join Date: Mar 2012
Posts: 232

Hello,

With the HiSeq 2000, what is the default for adaptor trimming? Is it "on" or "off" when generating FASTQ files?

Thanks
Old 04-17-2014, 11:46 AM   #8
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

To my knowledge, no trimming is performed by the HiSeq 2000. The HiSeq 2000 only calls the bases. Trimming the adapter sequences, if present, is a downstream step.

Our local sequencing centre, with many HiSeq 2000 machines, never trims the adapters at the level of the HiSeq 2000. They do the trimming later, if necessary, with Trimmomatic.
Old 04-18-2014, 11:35 AM   #9
cement_head
Senior Member
 
Location: Oxford, Ohio

Join Date: Mar 2012
Posts: 232

OK, thanks. I called Illumina, and the HiSeq 2000 software can do trimming: it is a CLI flag in the FASTQ generation step.

It turns out the adaptors were not trimmed.

- Regards
Old 04-22-2014, 05:32 AM   #10
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

Good to know that the built-in software can do the trimming. I'd still rather have the raw data, and set the trimming parameters myself though.
Old 04-22-2014, 11:09 AM   #11
kcchan
Senior Member
 
Location: USA

Join Date: Jul 2012
Posts: 182

It's a feature that's been in CASAVA and BCL2FASTQ for a few years, but it's never worked really well.
Old 06-03-2014, 11:06 AM   #12
MalcolmHoutz
Junior Member
 
Location: Stamford, CT

Join Date: Feb 2014
Posts: 4
Trimmomatic: Which supplied Illumina adapter file do I use?

Trimmomatic includes Illumina-supplied adapter fasta files:
NexteraPE-PE.fa
TruSeq2-SE.fa
TruSeq3-PE.fa
TruSeq2-PE.fa
TruSeq3-PE-2.fa
TruSeq3-SE.fa

I don't know which one to use. My data is paired-end. When I asked the Principal Investigator, she gave me this response:

I'm not sure which of the adapter fa files it is. The index sequences are from Epicentre: http://www.epibio.com/docs/default-s...s.pdf?sfvrsn=8 (all are from set 1). As for the adapter sequences, they are from the "ScriptSeq kit".


I have been using TruSeq3-PE.fa, but only because I read this is common for recently sequenced data. I read in another forum TruSeq2-PE.fa is pretty generic, and should work. I am not sure what to do, and would appreciate some guidance. Thanks.
Old 06-03-2014, 11:32 AM   #13
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142

Hi. Okay you are using Trimmomatic.

You first need to know which library prep kit was used on the data. For my experiment we used an Illumina prep kit; the covariate file gives the prep kit name, so you can easily download the list of adapters for that kit from Illumina's website. We used the TruSeq2 prep kit (if I remember correctly).

The key is to understand how trimming works.

Adapter sequences are attached to both ends of each fragment. From the point of view of read 1, the universal adapter sits at the 5' end and the indexed adapter at the 3' end.

When read 1 is sequenced, the sequencing primer anneals to the universal adapter, so the universal adapter itself is skipped and read 1 starts with the first base after it. If the fragment is shorter than the read length, read 1 runs on into the indexed adapter, i.e. <read 1 content><adapter region>.

Since this is paired-end data, read 2 is then sequenced from the other end, and it can end up containing the reverse complement of the universal adapter. So if you know the universal adapter used in the experiment, simply compute its reverse complement and enter that into your adapter FASTA file (e.g. TruSeq2-PE.fa) if it is not already there.


Now, how do you remove the universal adapter?
Read 2 is generated by reading the opposite strand, again 5' --> 3'. This time the sequencing primer anneals to the indexed adapter, so that adapter is skipped. Read 2 therefore contains the fragment content followed by the reverse complement of the universal adapter.

So all you need to do when using Trimmomatic is:
1) make sure that Trimmomatic removes the match and all the content that FOLLOWS it, not just the exact match itself
2) find the sequence common to all the indexed adapters and enter that into the adapter.fa file
3) enter the reverse complement of the universal adapter into the adapter.fa file.

Check the alignment files after trimming.
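
Step 1 in the list above (dropping the adapter match together with everything after it) can be sketched in Python as follows. This is an exact-match toy, not how Trimmomatic is implemented: real trimmers tolerate mismatches and take quality scores into account. The function name, the `min_overlap` default, and the toy adapter and reads are all assumptions for illustration.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 5) -> str:
    """Cut `read` at an exact adapter match, dropping the match and all
    bases after it. Also handles a partial adapter at the end of the read."""
    # Case 1: the whole adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends partway through the adapter. Try the longest
    # adapter prefix first, down to `min_overlap` bases.
    for k in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # No adapter found: return the read unchanged.
    return read


TOY_ADAPTER = "GATCGTCGGACT"  # made-up short adapter

full_match = trim_3prime_adapter("ACGTTGCA" + TOY_ADAPTER + "AAAA", TOY_ADAPTER)
partial_match = trim_3prime_adapter("ACGTTGCAACGT" + TOY_ADAPTER[:7], TOY_ADAPTER)
no_match = trim_3prime_adapter("ACGTTGCAACGTTTGA", TOY_ADAPTER)
print(full_match, partial_match, no_match)
```

Note that the trailing `AAAA` in the first toy read is removed along with the adapter, which is the behaviour step 1 asks for. When trimming paired-end data with a real tool, the quality string has to be cut to the same length and both mates have to be kept in sync.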