  • Converting collapsed reads to raw FASTA

    Does anyone have a trick or script to convert collapsed reads back to raw FASTA reads? I need them in raw FASTA format so that I can map them with Bowtie.
    Maybe a Perl script would do it?

  • #2
    What are collapsed reads?



    • #3
      >1-1377297
      tgtaaacatcctcgactggaagct
      >2-783040
      tttggcaatggtagaactcacact
      >3-461345
      tagcttatcagactgatgttgaca



      • #4
        Originally posted by Palgrave:
        >1-1377297
        tgtaaacatcctcgactggaagct
        >2-783040
        tttggcaatggtagaactcacact
        >3-461345
        tagcttatcagactgatgttgaca
        This looks like FASTA to me, but I'm still not sure what "collapsed reads" means.
        You can add "-f" to your bowtie command to align reads in FASTA format.



        • #5
          Originally posted by Palgrave:
          >1-1377297
          tgtaaacatcctcgactggaagct
          >2-783040
          tttggcaatggtagaactcacact
          >3-461345
          tagcttatcagactgatgttgaca
          I'm guessing that what you have are miRNA reads with all identical sequences "collapsed", and that your definition line means >[miRNA-id]-[counts].

          I'm also guessing that you are asking how to create a file that contains 1,377,297 copies of miRNA-1, 783,040 copies of miRNA-2, etc.

          Is that what you want to do? If it is, please don't. It would be a waste of time and CPU cycles to have Bowtie map exactly the same sequence 1,377,297 times. Map each unique sequence just once, and then account for sequence abundance in your downstream analysis.
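
As a rough illustration of carrying the abundance along instead of duplicating reads, here is a minimal Python sketch. It assumes the header format guessed above, `>[id]-[count]`, which the original poster has not confirmed. The names `CollapsedRead`, `parse_collapsed_fasta`, and `expand` are hypothetical helpers made up for this example and are not part of Bowtie or any other tool. `expand` is included only to show what "un-collapsing" would involve; it is lazy on purpose, since writing out millions of identical reads is exactly what is being advised against.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class CollapsedRead(NamedTuple):
    read_id: str
    count: int
    sequence: str


def _make_record(header: str, chunks: list) -> CollapsedRead:
    # ASSUMPTION: the header is "<id>-<count>"; split on the last hyphen.
    read_id, _, count = header.rpartition("-")
    return CollapsedRead(read_id, int(count), "".join(chunks))


def parse_collapsed_fasta(handle: Iterable[str]) -> Iterator[CollapsedRead]:
    """Yield one record per unique sequence, with its count taken from the header."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _make_record(header, chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


def expand(record: CollapsedRead) -> Iterator[str]:
    """Lazily yield `count` FASTA entries for one collapsed read (usually unnecessary)."""
    for i in range(1, record.count + 1):
        yield f">{record.read_id}_{i}\n{record.sequence}"


# The three records posted earlier in this thread.
SAMPLE = """>1-1377297
tgtaaacatcctcgactggaagct
>2-783040
tttggcaatggtagaactcacact
>3-461345
tagcttatcagactgatgttgaca
"""

records = list(parse_collapsed_fasta(io.StringIO(SAMPLE)))
# Map read id to abundance, to be joined with the alignments downstream.
counts = {r.read_id: r.count for r in records}
total_reads = sum(counts.values())
```

With a lookup like `counts`, each unique sequence is aligned once and every alignment is weighted by its count afterwards, for example when computing per-miRNA expression.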
