SEQanswers

Old 09-24-2014, 04:24 PM   #1
Palgrave
Member
 
Location: norway

Join Date: Aug 2011
Posts: 73
Converting collapsed reads to raw fasta

Does anyone have a trick or script to convert collapsed reads back to raw fasta reads? I need them in raw fasta so that I can map them with bowtie.
Maybe a Perl script would do it?
Old 09-24-2014, 04:28 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

What are collapsed reads?
Old 09-24-2014, 04:30 PM   #3
Palgrave
Member
 
Location: norway

Join Date: Aug 2011
Posts: 73

>1-1377297
tgtaaacatcctcgactggaagct
>2-783040
tttggcaatggtagaactcacact
>3-461345
tagcttatcagactgatgttgaca
Old 09-24-2014, 09:59 PM   #4
yueluo
Member
 
Location: Guangzhou China

Join Date: Aug 2013
Posts: 82

Quote:
Originally Posted by Palgrave View Post
>1-1377297
tgtaaacatcctcgactggaagct
>2-783040
tttggcaatggtagaactcacact
>3-461345
tagcttatcagactgatgttgaca
This looks like fasta to me, but I'm still not sure what "collapsed reads" means.
You can add "-f" to your bowtie command for alignment of reads in fasta format.
Old 09-25-2014, 07:58 AM   #5
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,170

Quote:
Originally Posted by Palgrave View Post
>1-1377297
tgtaaacatcctcgactggaagct
>2-783040
tttggcaatggtagaactcacact
>3-461345
tagcttatcagactgatgttgaca
I'm guessing that what you have are miRNA reads, with all identical sequences "collapsed" and your definition line means >[miRNA-id]-[counts].
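
If that guess about the definition line is right, the ID and count can be split off the header with a few lines of Python. This is only a sketch under that assumption; the names `CollapsedRead` and `parse_collapsed_fasta` are made up for illustration and do not come from any tool.

```python
from typing import Iterable, Iterator, List, NamedTuple


class CollapsedRead(NamedTuple):
    read_id: str   # text before the last '-' in the header
    count: int     # text after the last '-', assumed to be the copy number
    sequence: str


def _make_record(header: str, seq_parts: List[str]) -> CollapsedRead:
    # Split on the LAST '-' so IDs that contain '-' themselves still work.
    read_id, _, count = header.rpartition("-")
    return CollapsedRead(read_id, int(count), "".join(seq_parts))


def parse_collapsed_fasta(lines: Iterable[str]) -> Iterator[CollapsedRead]:
    """Yield one record per unique sequence from '>id-count' style FASTA."""
    header = None
    seq_parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield _make_record(header, seq_parts)
            header = line[1:]
            seq_parts = []
        else:
            seq_parts.append(line)
    if header is not None:
        yield _make_record(header, seq_parts)
```

Passing an open file handle (or any list of lines) to `parse_collapsed_fasta` gives one record per unique sequence, with the count available as an integer.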

I'm also guessing that you are asking how to create a file that contains 1,377,297 copies of miRNA-1, 783,040 copies of miRNA-2, etc.

Is that what you want to do? If it is, please don't. It would be a waste of time and CPU cycles to have Bowtie map exactly the same sequence 1,377,297 times. Map each unique sequence just once and then account for sequence abundance in your downstream analysis.
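
One way to account for abundance downstream could look like the sketch below. It assumes the aligner keeps the original read name (for example `1-1377297`) and that you have already pulled (read name, reference name) pairs out of its output by whatever means suits your pipeline. The function names and the reference names `refA` and `refB` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_from_name(read_name: str) -> int:
    """Return the copy number stored after the last '-' in the read name."""
    return int(read_name.rpartition("-")[2])


def weighted_hits(hits: Iterable[Tuple[str, str]]) -> Counter:
    """Sum collapsed-read counts per reference.

    `hits` is an iterable of (read_name, reference_name) pairs, one pair
    per unique sequence that mapped.
    """
    totals: Counter = Counter()
    for read_name, reference in hits:
        # Each unique sequence contributes its full count, not just 1.
        totals[reference] += count_from_name(read_name)
    return totals


if __name__ == "__main__":
    example = [
        ("1-1377297", "refA"),
        ("2-783040", "refA"),
        ("3-461345", "refB"),
    ]
    for reference, total in weighted_hits(example).items():
        print(reference, total)
```

Each unique sequence is mapped only once, but it still contributes its full read count to the reference it hits.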