SEQanswers

Old 06-20-2010, 07:37 PM   #1
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10
software to cut adaptor

Hi all,

I am new here and not familiar with analyzing data from the SOLiD platform.

Yesterday I was told to cut the reads (50 bp) down to 30 bp or shorter before mapping; otherwise I won't get anything, since almost half of each read is adaptor.

So, could you suggest which software can do this, and what adaptor information I should prepare beforehand?

Thanks a lot.

BTW, I am focusing on miRNA sequencing. Yesterday I used PerM to map the reads to the reference genome, but only 0.03% of the reads mapped.
Old 06-20-2010, 08:25 PM   #2
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358

You need to know how the library was constructed to determine where to trim. Ideally whoever generated the library would know this information.
Old 06-20-2010, 11:06 PM   #3
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10

I am the person who constructed the library, and I already have the adaptor sequence. What I don't know is which software can do the trimming. Or should I write a Perl script to do it?
Old 06-21-2010, 04:34 AM   #4
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

If you want to write a script yourself, our HTSeq framework might be useful. It has functions to partially match an adapter sequence to a read and trim the read this way.

Here is roughly how this would look:

Code:
import HTSeq

file_in = "yeast_RNASeq_excerpt_sequence.txt"
file_out = "trimmed.fastq"

# Adapter to remove from the right end of each read
adapter = HTSeq.Sequence( "ACCGTA" )
# Reverse complement of the adapter (not used below)
adapter_rc = adapter.get_reverse_complement()

with open( file_out, "w" ) as fout:
    for read in HTSeq.FastqReader( file_in ):
        # Cut off the part of the read end that matches the adapter
        read = read.trim_right_end( adapter )
        read.write_to_fastq_file( fout )

This is written for Sanger FASTQ, but I guess it should work with CSFASTQ as well.

Simon
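
For readers who want to see what partial matching at the read end involves without installing HTSeq, here is a minimal pure-Python sketch. It is not HTSeq's implementation: the function name trim_right_adapter and the min_overlap parameter are made up for illustration, and only exact matches are considered. The adapter ACCGTA is the placeholder from the snippet above.

```python
def trim_right_adapter(read: str, adapter: str, min_overlap: int = 1) -> str:
    """Remove an adapter, or a prefix of it, from the right end of a read.

    Hypothetical helper for illustration; exact matching only.
    """
    # Case 1: the complete adapter occurs in the read.
    # Everything from its first occurrence onward is discarded.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends inside the adapter, so only a prefix of the
    # adapter is present. Try the longest possible prefix first.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if overlap > 0 and read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]

    # No adapter found: return the read unchanged.
    return read


if __name__ == "__main__":
    for r in ("TTTTACCGTAGG", "TTTTGGACC", "TTTTGGGG"):
        print(r, "->", trim_right_adapter(r, "ACCGTA"))
```

With min_overlap=1, a read that merely ends in the first adapter base loses that base by chance, so a larger min_overlap is the more conservative choice for short inserts such as miRNA.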
Old 06-21-2010, 11:41 PM   #5
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10

Hi, Simon

I was wondering whether I should convert the csfasta file to fastq before trimming. Either way, your suggestion definitely helps me a lot.

Thanks a lot.

Tiffany
Old 06-22-2010, 01:16 AM   #6
Tina
Junior Member
 
Location: oslo

Join Date: Jun 2009
Posts: 1

If you have the adapter sequence, try the program fastx_clipper, which is part of the FASTX-Toolkit.
http://hannonlab.cshl.edu/fastx_tool...mmandline.html

Hope this helps.
Old 06-22-2010, 02:55 AM   #7
zee
NGS specialist
 
Location: Malaysia

Join Date: Apr 2008
Posts: 249

Tiffany, my group are actively developing a new aligner, NovoalignCS, for colorspace and this would be a great feature to have. In fact we already have it in our NT space aligner.

We currently support iterative read trimming in cases where the adaptor is not known.

Are your adaptors in nucleotide space? I would be interested in obtaining some test data if that's possible and we could provide you with a beta version of the working program.
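
Whether the adaptor is given in nucleotide space or color space matters because SOLiD reads are strings of colors, each encoding a pair of adjacent bases. As a rough sketch, a nucleotide adaptor can be translated into colors with the standard two-base code (A=0, C=1, G=2, T=3, and the color is the XOR of the two codes). The function name to_color_space is hypothetical, and ACCGTA is only the placeholder sequence from Simon's snippet, not a real SOLiD adaptor.

```python
# Two-bit codes for the bases; the color of a base pair is the XOR of
# the two codes, which reproduces the SOLiD two-base encoding table.
_BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_color_space(seq: str) -> str:
    """Translate a nucleotide sequence into its string of colors.

    Hypothetical helper for illustration. A sequence of n bases yields
    n - 1 colors, one per pair of adjacent bases.
    """
    seq = seq.upper()
    return "".join(
        str(_BASE_CODE[a] ^ _BASE_CODE[b]) for a, b in zip(seq, seq[1:])
    )


if __name__ == "__main__":
    print(to_color_space("ACCGTA"))
```

Note that the color at the junction between the insert and the adaptor depends on the last base of the insert, so only the colors internal to the adaptor can be searched for in a read.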


Old 06-22-2010, 05:44 AM   #8
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324

If you don't want to use another aligner you could just make a new file like this:
awk '{print substr($0,0,21)}' filename.csfasta > filename.csfasta.20

Otherwise, try bowtie; you can set it either to use a short seed or to trim the read ends.
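
The fixed-length cut can also be sketched in Python. This version differs from the awk one-liner in two ways that are assumptions of mine, not Chipper's: it leaves header lines (starting with ">") and comment lines (starting with "#") untouched, and it makes the number of kept colors explicit, since how many characters the awk call keeps can depend on how the awk implementation handles a start index of 0. The function name truncate_csfasta_lines is hypothetical.

```python
def truncate_csfasta_lines(lines, n_colors=20):
    """Yield csfasta lines with each read cut to n_colors colors.

    Hypothetical helper for illustration. Header lines (">...") and
    comment lines ("#...") are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">") or line.startswith("#"):
            yield line
        else:
            # A csfasta read is one primer base followed by the colors,
            # so n_colors colors correspond to n_colors + 1 characters.
            yield line[: n_colors + 1]


if __name__ == "__main__":
    example = [
        "# example comment line\n",
        ">1_2_3_F3\n",
        "T0123012301230123012301230\n",
    ]
    for out in truncate_csfasta_lines(example, n_colors=20):
        print(out)
```

This is still a blind cut: it removes the same number of colors from every read whether or not the adaptor is actually there.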
Old 06-25-2010, 12:49 AM   #9
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10

Quote:
Originally Posted by Tina
If you have the adapter sequence, try the program fastx_clipper, which is part of the FASTX-Toolkit.
http://hannonlab.cshl.edu/fastx_tool...mmandline.html

Hope this helps.
Tina,

Thanks a lot. I am using this toolkit these days.

Tiffany
Old 06-25-2010, 12:54 AM   #10
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10

Quote:
Originally Posted by zee
Tiffany, my group are actively developing a new aligner, NovoalignCS, for colorspace and this would be a great feature to have. In fact we already have it in our NT space aligner.

We currently support iterative read trimming in cases where the adaptor is not known.

Are your adaptors in nucleotide space? I would be interested in obtaining some test data if that's possible and we could provide you with a beta version of the working program.
zee,

Thanks a lot. But I will try the toolkit first. I will contact you later if it doesn't work.

Tiffany
Old 06-25-2010, 12:58 AM   #11
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10

Quote:
Originally Posted by Chipper
If you don't want to use another aligner you could just make a new file like this:
awk '{print substr($0,0,21)}' filename.csfasta > filename.csfasta.20

Otherwise, try bowtie; you can set it either to use a short seed or to trim the read ends.
Chipper,

But I think that approach is too blind. Are there any other good ways to trim the reads according to the known adaptor sequence?

Nevertheless, thanks a lot for your suggestion.

Tiffany
Old 08-11-2010, 06:00 AM   #12
patternist
Junior Member
 
Location: Maryland

Join Date: Jul 2009
Posts: 5

The mirTools web site provides a Perl script for adaptor trimming.

http://centre.bioinformatics.zj.cn/m...daptortrim.php
Old 08-14-2010, 07:30 AM   #13
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10

Quote:
Originally Posted by patternist
The mirTools web site provides a Perl script for adaptor trimming.

http://centre.bioinformatics.zj.cn/m...daptortrim.php
Thanks, patternist!

I find this info very useful.

Thanks a lot.

Tiffany
Old 09-14-2010, 11:11 AM   #14
mmartin
Member
 
Location: Stockholm

Join Date: Aug 2009
Posts: 75

I see requests for adapter-removal software quite often, and since fastx_clipper does not seem to be able to deal with color space data, I have now made the tool we use in our group available for download. Please have a look at https://code.google.com/p/cutadapt/ and write to me if you have any questions.
Old 09-14-2010, 11:46 PM   #15
tiffany081126
Member
 
Location: Guangzhou, China

Join Date: Jun 2010
Posts: 10

Quote:
Originally Posted by mmartin
I see requests for adapter-removal software quite often, and since fastx_clipper does not seem to be able to deal with color space data, I have now made the tool we use in our group available for download. Please have a look at https://code.google.com/p/cutadapt/ and write to me if you have any questions.
Thanks very much, but I have already solved the problem with a script from a senior colleague.

Thanks again.

Tiffany
Old 09-16-2010, 03:12 PM   #16
RNAseqer
Member
 
Location: London

Join Date: Sep 2010
Posts: 22

I've heard that trimming is not as efficient as alignment in the pipeline and then building up or down from there.
Old 09-16-2010, 03:38 PM   #17
mmartin
Member
 
Location: Stockholm

Join Date: Aug 2009
Posts: 75

Could you elaborate? What do you mean by 'efficient'?
Old 09-16-2010, 09:07 PM   #18
snetmcom
Senior Member
 
Location: USA

Join Date: Oct 2008
Posts: 158

I think the small RNA pipeline will use the adapter sequence to trim reads more efficiently. It's better to run them as 35 bp reads than to trim them and then try to map.
Old 09-17-2010, 12:44 AM   #19
mmartin
Member
 
Location: Stockholm

Join Date: Aug 2009
Posts: 75

cutadapt and most trimmers I know do use the adapter sequence to trim the reads (they align the adapter sequence to the end of the read). The alternative is to unconditionally remove the last x bases (or colors) of each read. I do not consider that to be adapter removal; it is far too inaccurate.
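
To make the contrast concrete, here is a minimal sketch of adapter-aware trimming that tolerates a few mismatches. It is not cutadapt's actual algorithm: the function name, the max_error_rate and min_overlap parameters, and the example adapter ACCGTAGGCT are all assumptions for illustration, and only substitutions (no insertions or deletions) are handled. Because it compares characters, the same logic applies to strings of colors.

```python
def trim_adapter_with_mismatches(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Cut the read at the leftmost position where the adapter fits.

    Hypothetical helper for illustration. At each start position the
    adapter is laid over the rest of the read; the position is accepted
    if the number of mismatches in the overlapping part does not exceed
    max_error_rate times the overlap length (rounded down).
    """
    for start in range(len(read) - min_overlap + 1):
        # The adapter may run past the end of the read (partial adapter).
        overlap = min(len(adapter), len(read) - start)
        window = read[start:start + overlap]
        mismatches = sum(1 for r, a in zip(window, adapter) if r != a)
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    # No acceptable adapter match: keep the read as it is.
    return read


if __name__ == "__main__":
    adapter = "ACCGTAGGCT"  # made-up adapter, for illustration only
    reads = [
        "GGGTTTGGGACCGTAGGCT",  # full adapter
        "GGGTTTGGGACCGTTGGCT",  # adapter with one sequencing error
        "GGGTTTGGGACCG",        # read ends inside the adapter
        "GGGTTTGGGTTT",         # no adapter at all
    ]
    for r in reads:
        print(r, "->", trim_adapter_with_mismatches(r, adapter))
```

An unconditional cut such as read[:-10] would shorten all four example reads by the same amount, removing too little from the first two, too much from the third, and ten genuine bases from the last.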

Tags
50bp reads, adapter trimming, adaptor, mirna, perm, solid
