SEQanswers

Old 09-29-2014, 08:55 AM   #41
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default Corrected adapter sequences

The following adapter sequences are provided for your convenience:

>TruSeq read 1 universal adapter
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG

>TruSeq read 2 adapter
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
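As a rough illustration of how adapter sequences like these are matched against the 3' end of a read, here is a minimal Python sketch. It is not skewer's algorithm: skewer tolerates mismatches and indels, while this sketch uses exact matching only. The names `find_adapter_start`, `trim_3prime` and `min_overlap` are made up for the example, and the `N` bases in the adapter are treated as wildcards for the index.

```python
R1_ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG"


def base_matches(read_base, adapter_base):
    """An 'N' in the adapter matches any base in the read."""
    return adapter_base == "N" or read_base == adapter_base


def find_adapter_start(read, adapter, min_overlap=5):
    """Return the index in `read` where `adapter` begins, or None.

    The adapter may run off the 3' end of the read, so only the part
    that overlaps the read is compared (at least `min_overlap` bases).
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        if all(base_matches(read[start + i], adapter[i]) for i in range(overlap)):
            return start
    return None


def trim_3prime(read, adapter, min_overlap=5):
    """Cut the read at the first adapter hit; return it unchanged if none."""
    start = find_adapter_start(read, adapter, min_overlap)
    return read if start is None else read[:start]


# Hypothetical read: 10 bases of insert followed by the first 20 adapter bases.
example = "ACGTACGTAC" + R1_ADAPTER[:20]
print(trim_3prime(example, R1_ADAPTER))  # ACGTACGTAC
```

Both sequences above begin with the same 13 bases (AGATCGGAAGAGC), which is why a short overlap at the end of a read cannot tell the two adapters apart.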
Old 10-17-2014, 12:27 AM   #42
travc
Junior Member
 
Location: Davis, CA

Join Date: Aug 2013
Posts: 4
Default

I'm probably doing something dumb, but I'm tired of banging my head against this and will just ask instead...
Are the default adapter sequences for skewer correct for Nextera PE libraries?

I'm attempting to trim Nextera PE 150 sequences for readthrough (and quality).
This data has already been demultiplexed (with CASAVA by the sequencing center), so the leading adapters have already been removed.

I've already trimmed this data with trimmomatic and mapped it, so I know there is a lot of readthrough (very short insert sizes for this particular library). Otherwise, the quality is very good.
Trimmomatic only passed the forward read of >60% of the pairs (it sensibly drops the reverse on readthrough, since that reverse read contains no new data).
Quote:
Input Read Pairs: 13211394 Both Surviving: 4473051 (33.86%) Forward Only Surviving: 8561962 (64.81%) Reverse Only Surviving: 27843 (0.21%) Dropped: 148538 (1.12%)
So the problem is that skewer (with approximately the same quality filtering) is giving:
Quote:
$ skewer -t 30 -l 36 -q 3 -Q 15 foo1.fastq.gz foo2.fastq.gz
....
94344 ( 0.71%) read pairs filtered out by quality control
167205 ( 1.27%) short read pairs filtered out after trimming by size control
95108 ( 0.72%) empty read pairs filtered out after trimming by size control
12854737 (97.30%) read pairs available; of these:
5500721 (42.79%) trimmed read pairs available after processing
7354016 (57.21%) untrimmed read pairs available after processing
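The two summaries can be put on the same footing with a little arithmetic. The sketch below is plain Python with the numbers copied from the two outputs above; the dictionary keys are just labels chosen for this example. It checks that both tools account for all 13211394 input pairs and recomputes the two percentages being compared.

```python
# Counts copied from the Trimmomatic and skewer summaries quoted above.
trimmomatic = {
    "input": 13211394,
    "both": 4473051,
    "forward_only": 8561962,
    "reverse_only": 27843,
    "dropped": 148538,
}
skewer = {
    "quality_filtered": 94344,
    "short": 167205,
    "empty": 95108,
    "available": 12854737,
    "trimmed": 5500721,
    "untrimmed": 7354016,
}


def pct(part, whole):
    """Percentage of `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Every input pair should land in exactly one category for each tool.
trimmomatic_total = sum(v for k, v in trimmomatic.items() if k != "input")
skewer_total = (skewer["quality_filtered"] + skewer["short"]
                + skewer["empty"] + skewer["available"])
print(trimmomatic_total == trimmomatic["input"])  # True
print(skewer_total == trimmomatic["input"])       # True

# Pairs Trimmomatic treated as readthrough vs. pairs skewer trimmed.
print(pct(trimmomatic["forward_only"], trimmomatic["input"]))  # 64.81
print(pct(skewer["trimmed"], skewer["available"]))             # 42.79
```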
I've tried some other adapter sequences, which disturbingly seem to produce pretty much the same results. I guess I don't quite understand how one is supposed to tell skewer the proper sequence. Combine that with the fact that it is very difficult for me to find good documentation of the Nextera adapter sequences, and I'm just getting more and more confused. Another option is that I just don't understand what skewer is outputting.

Would it be possible for you (with help from users, of course) to write up some brief usage examples for skewer for common situations like this one? GitHub has a very nice wiki feature...

Last edited by travc; 10-17-2014 at 12:33 AM.
Old 10-24-2014, 12:57 AM   #43
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default

Hi travc,

Skewer works best when pre-processing raw data, especially data from complex libraries such as Nextera LMPs. To make things clearer, I suggest you ask the sequencing center to send you the raw data without adapter trimming. Then you can do the adapter trimming yourself with skewer.

To detect readthrough, skewer uses the reverse-complement relationship between the two reads of a readthrough pair as well as the trailing adapter sequences. So feeding already-trimmed data to skewer will confuse it.
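The reverse-complement idea can be sketched in a few lines of Python. This is only an illustration of the principle, not skewer's actual implementation: it uses exact matching, and the function names, the `min_len` parameter and the example sequences are all invented for the sketch. When the insert is shorter than the read length, read 1 starts with the insert and read 2 starts with its reverse complement, so the insert length is the longest prefix length at which the two agree.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def insert_length(read1, read2, min_len=10):
    """Return the insert length if the pair looks like readthrough, else None.

    Tries the longest candidate first and accepts the first length n at
    which read1[:n] equals the reverse complement of read2[:n].
    """
    max_len = min(len(read1), len(read2))
    for n in range(max_len, min_len - 1, -1):
        if read1[:n] == reverse_complement(read2[:n]):
            return n
    return None


# Hypothetical 20 bp insert sequenced with longer reads, so both reads
# run through into adapter sequence.
insert = "ACGTTGCAAGGCTTAACCGG"
adapter = "AGATCGGAAGAGC"
raw_read1 = insert + adapter
raw_read2 = reverse_complement(insert) + adapter
print(insert_length(raw_read1, raw_read2))  # 20

# If read 1 has already been cut short by another tool, the prefixes no
# longer line up and this simple check finds nothing.
print(insert_length(insert[:15], reverse_complement(insert)))  # None
```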

We will write a user manual in the near future. Thank you for your suggestion!
Old 10-24-2014, 08:21 AM   #44
travc
Junior Member
 
Location: Davis, CA

Join Date: Aug 2013
Posts: 4
Default

Thanks for that info. I guessed that it might need raw sequences, but wasn't sure.
Old 11-01-2014, 06:06 PM   #45
blsfoxfox
Member
 
Location: auburn

Join Date: Jan 2013
Posts: 12
Default

Quote:
Originally Posted by relipmoc View Post
Now skewer provides an option for it. Please download the updated version from http://sourceforge.net/projects/skewer.
Thanks for the good news! Is it the option "-i, --intelligent For mate-pair mode, whether to redistribute reads based on junction information; (no)"? So just keeping it at the default will avoid getting longer reads, right?
Old 11-13-2014, 07:58 AM   #46
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default

Quote:
Originally Posted by blsfoxfox View Post
Thanks for the good news! Is it the option "-i, --intelligent For mate-pair mode, whether to redistribute reads based on junction information; (no)"? So just keeping it at the default will avoid getting longer reads, right?
Yes, you are right!

BTW: I have found that most users only use skewer for preprocessing Long Mate Pair reads. It seems that few people have noticed that skewer can also trim adapters from small RNA sequencing data.
Old 07-24-2015, 02:44 AM   #47
marghi
Member
 
Location: Germany

Join Date: Mar 2015
Posts: 10
Default

Hello.

First of all thank you for skewer, it's really amazingly fast!

I have some problems in trying to achieve a certain trimming behaviour with skewer, and I am wondering if I am missing something. I am aware that there are plenty of alternative ways to get what I want, but I'd like to know the correct skewer answer nevertheless.

I have 51 bp Illumina reads barcoded with 3 bp barcodes located at the 5' end of each read, and I would like to use skewer to demultiplex the data into the individual experiments. I tried the following command line (e.g. for barcode GAT):

skewer -m head -x GAT -r 0 -d 0 -k 3 small_test.fastq -o GAT_small_test

This works fine as long as there is a GAT at the 5' end of the read. What I wasn't expecting is that when there is no GAT at the beginning of the read but there is one inside the sequence, the read is still trimmed. For example:

before trimming: ACTCAGCNGGAAAACCTCGCCCAGATTCAGGCGTGTAGTATGCCGTCTTCT
trimmed: TCAGGCGTGTAGTATGCCGTCTTCT

Is there any way to avoid this and really restrict the trimming to the 5'end?
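The difference between the two behaviours can be shown with a short Python sketch. It says nothing about how skewer works internally; the function names are invented for the example and the quality string is a placeholder. The anchored version only accepts the barcode at position 0, while the unanchored version cuts at the first match anywhere and reproduces the trimmed read shown above.

```python
def strip_anchored_barcode(seq, qual, barcode):
    """Remove `barcode` only if it sits at the very 5' end of the read.

    Returns (seq, qual) without the barcode, or None if the read does
    not start with the barcode (i.e. it belongs to another sample).
    """
    if not seq.startswith(barcode):
        return None
    n = len(barcode)
    return seq[n:], qual[n:]


def strip_unanchored_barcode(seq, qual, barcode):
    """Cut at the first match anywhere in the read (the unwanted case)."""
    pos = seq.find(barcode)
    if pos < 0:
        return None
    end = pos + len(barcode)
    return seq[end:], qual[end:]


read = "ACTCAGCNGGAAAACCTCGCCCAGATTCAGGCGTGTAGTATGCCGTCTTCT"
qual = "I" * len(read)  # placeholder qualities, not from the original data

print(strip_anchored_barcode(read, qual, "GAT"))       # None
print(strip_unanchored_barcode(read, qual, "GAT")[0])  # TCAGGCGTGTAGTATGCCGTCTTCT
```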
Old 07-25-2015, 08:05 AM   #48
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default

Hi marghi,
Thank you for your feedback! Skewer does not currently handle this use case. We'll add code to support it in the future.

Old 07-27-2015, 12:16 AM   #49
marghi
Member
 
Location: Germany

Join Date: Mar 2015
Posts: 10
Default

Thank you very much!
Old 08-04-2015, 06:52 PM   #50
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default

Hi marghi,
Please download version 0.1.127 and run the following command:
Code:
$ skewer -m ap --barcode -x GAT -r 0 small_test.fastq -o GAT_small_test
You will get what you want.
Old 08-07-2015, 12:19 AM   #51
marghi
Member
 
Location: Germany

Join Date: Mar 2015
Posts: 10
Default

Thank you so much! I will try right away.

Best regards
Old 08-24-2015, 04:17 AM   #52
marghi
Member
 
Location: Germany

Join Date: Mar 2015
Posts: 10
Default

Dear relipmoc,

My apologies for the late reply, but I tried the command you recommended after upgrading to 0.1.127 and I get an error (segmentation fault). Do you have any idea why this is happening?

Best regards
Old 08-24-2015, 05:39 AM   #53
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default

Quote:
Originally Posted by marghi View Post
Dear relipmoc,

My apologies for the late reply, but I tried the command you recommended after upgrading to 0.1.127 and I get an error (segmentation fault). Do you have any idea why this is happening?

Best regards
Sorry for the inconvenience! Could you please email me the command you used, along with a minimal dataset that triggers the segmentation fault?
Old 08-25-2015, 03:31 AM   #54
marghi
Member
 
Location: Germany

Join Date: Mar 2015
Posts: 10
Default

The command I used is simply the one you posted a few messages ago, i.e.:
skewer -m ap --barcode -x GAT -r 0 small_test.fastq -o GAT_small_test

Any of the test sets I tried gives the error, so I assume it's the command itself. If this is not the case I can gladly share a sample set (to which email?).

Regards,
M.

Last edited by marghi; 08-25-2015 at 03:38 AM.
Old 08-25-2015, 06:29 AM   #55
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default

It runs fine on my server. Could you please send small_test.fastq to xxx@xxxxxx (see the skewer paper)? Thank you!

BTW: are you sure that you used version 0.1.127? This version fixed some bugs in previous versions that could cause segmentation faults.


Last edited by relipmoc; 08-26-2015 at 09:20 PM.
Old 08-28-2015, 06:28 AM   #56
marghi
Member
 
Location: Germany

Join Date: Mar 2015
Posts: 10
Default

Hi relipmoc,

Yes, I am sure I am using version 0.1.127 (Last update: August 5, 2015). I am sending via email the test set on which I get the seg fault and what causes it.

Thank you once again for your prompt support!
Old 09-23-2015, 11:06 AM   #57
dkainer
Junior Member
 
Location: Australia

Join Date: May 2015
Posts: 9
Default

Can skewer do the requested operation (i.e. only taking a barcode from the start of the read) in paired-end mode (PE), or only in amplicon mode (AP)?
Old 10-02-2015, 09:12 AM   #58
lunare
Junior Member
 
Location: ottawa

Join Date: Oct 2015
Posts: 1
Default Reads are longer than expected after using skewer with -i

Hello,

I recently started using skewer. Could you provide more information on what the -i option does? The documentation only says that it will "intelligently redistribute" the reads according to junction adapter information.

I am working with Nextera mate pair data, and I used the following command to remove the adapters:

./skewer-0.1.127-linux-x86_64 -m mp -i ~/1_1_1_3kb_R1.fastq ~/1_1_1_3kb_R2.fastq

After this, a subset of my reads are longer than their initial size, and I don't understand how that could be. When I run it without -i, I don't get that problem.

Some clarification would be greatly appreciated.

-----I apologize I found that this was already answered----- I think the documentation should explain this more clearly.

Last edited by lunare; 10-05-2015 at 06:44 AM. Reason: This was already answered
Old 11-24-2015, 03:17 AM   #59
dagarfield
Member
 
Location: Heidelberg, Germany

Join Date: Aug 2010
Posts: 39
Default The -m any option

Hello!

A quick question about your "-m any" option.
In rare instances, the Nextera kits can give you sequences that look a bit as follows:

ATTAAAAATTAAAAAGAAAAGGATTATAACCTTTATAAATGGGGTATGAACCCAGTAGCTTAATTAGCTTATCTTCTGTCTCTTATACACATCTGACGCCTGTCTCTTATACACATCTCCGAGCC

The adaptor sequence we're looking to trim is 'CTGTCTCTTATACACATCT'. Generally, this sequence occurs once, and the '-m tail' (or PE option, I think) works fine. In this (rare) case, annoyingly, the Tn5 seems to have inserted twice. Using the '-m any' option seems to do the trick: both instances are removed along with all the 3' sequence. But what if the adaptor is found closer to the 5' end? Will the sequence trimmed always be 3'?
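For reference, the outcome described here (both adaptor copies and everything 3' of them removed) is what you get by cutting at the first occurrence of the adaptor. The Python sketch below shows just that with exact matching; the function name is invented for the example, and whether skewer's '-m any' follows the same rule when the adaptor sits near the 5' end is for the author to confirm.

```python
ADAPTER = "CTGTCTCTTATACACATCT"
READ = (
    "ATTAAAAATTAAAAAGAAAAGGATTATAACCTTTATAAATGGGGTATGAACCCAGTAGCTTAATTAGCTTATCTT"
    "CTGTCTCTTATACACATCTGACGCCTGTCTCTTATACACATCTCCGAGCC"
)


def trim_at_first_adapter(seq, adapter):
    """Keep only the bases 5' of the first exact adapter occurrence.

    If the adapter is absent the read is returned unchanged. If it sits
    near the 5' end, the result is short (possibly empty) and would have
    to be handled by a separate minimum-length filter.
    """
    pos = seq.find(adapter)
    return seq if pos < 0 else seq[:pos]


print(READ.count(ADAPTER))  # 2
trimmed = trim_at_first_adapter(READ, ADAPTER)
print(ADAPTER in trimmed)   # False
```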

Cheers,

DG
Old 03-19-2016, 07:23 AM   #60
rna_dna
Junior Member
 
Location: Washington, DC

Join Date: Mar 2015
Posts: 2
Default

Hello relipmoc

I am hoping someone is still monitoring this thread--it looks like no one has posted here in about 4 months.

I am having trouble using skewer and I could use some help. I will wait to see if this thread is still active before going into the issue.

Thanks.

Last edited by rna_dna; 03-19-2016 at 07:31 AM.