  • bowtie output to gff3

    Hello all,

    I have finished aligning about 50 samples to a reference genome using bowtie. Now I need the output in GFF3 format. Is there any way to convert bowtie output files to GFF3 without redoing the alignment with other software? I see that the two formats have many columns in common.

    Any help is appreciated!

  • #2
    I'm not aware of a direct converter, but theoretically you could feed a BAM into BEDTools BamToBed and push the BED through GenomeTools bed_to_gff3.
    I would probably write a converter script myself though, since (possibly important) metadata will get lost on the way. I guess you have specific wishes for the GFF format as well.
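A minimal sketch of what such a converter script might look like, assuming the bowtie output is in SAM format. The function name `sam_line_to_gff3`, the `read` feature type, and the choice to store only the read name in the attributes column are illustrative assumptions, not an established tool. The end coordinate is derived from the CIGAR string, and MAPQ is used as the GFF3 score.

```python
import re
from urllib.parse import quote

# CIGAR operations; only M, D, N, = and X advance along the reference.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def sam_line_to_gff3(line, source="bowtie", feature_type="read"):
    """Return one GFF3 line for a SAM record, or None for headers and unmapped reads.

    `source` and `feature_type` are hypothetical defaults chosen for this sketch.
    """
    if line.startswith("@"):
        return None  # SAM header line
    fields = line.rstrip("\n").split("\t")
    qname, flag, rname, pos, mapq, cigar = fields[:6]
    flag = int(flag)
    if flag & 0x4 or cigar == "*":
        return None  # unmapped read, nothing to place on the genome
    start = int(pos)  # SAM POS and GFF3 start are both 1-based
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    end = start + ref_len - 1  # GFF3 end is inclusive
    strand = "-" if flag & 0x10 else "+"
    # Percent-encode characters that are reserved in the GFF3 attributes column.
    attributes = "ID=" + quote(qname, safe="/:")
    return "\t".join(
        [rname, source, feature_type, str(start), str(end), mapq, strand, ".", attributes]
    )
```

Note that everything else in the SAM record (sequence, qualities, mismatch tags, mate information) is dropped here, which is exactly the metadata loss mentioned above. GFF3 IDs should also be unique within a file, so paired-end reads sharing a name would need a suffix.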


    • #3
      Thank you. I was thinking of the second option, as I have some coding experience, but I thought there might be something ready-made.


      • #4
        What kinds of tools produce GFF3 as output?

        Thanks


        • #5
          Originally posted by Amative
          What kind of tools produce gff3 as output ??

          Thanks
          Gene predictors and older short read aligners (pre-SAM). Many tools made within the framework of GMOD use GFF(2/3) to exchange functional information about genomes. You'll find more information at http://gmod.org/wiki/GFF.
          What I like about GFF is that many visualization software packages have basic support for it and that it is fairly easy to parse and produce. The main disadvantage is the lack of standardization, something that GTF seems to be more successful at (though it is less generic than GFF). An advantage over the BED format is that you can fairly simply create a feature hierarchy (with theoretically unlimited depth), though not all visualization tools support this.
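To illustrate the feature hierarchy mentioned above: in GFF3, a child feature points at its parent through the `Parent` attribute, which refers to the parent's `ID`. The sketch below builds a small gene, mRNA, exon hierarchy and measures its depth. The helper names, coordinates, and the `example` source are made up for illustration, and it assumes each feature has at most one parent (GFF3 also allows several, comma-separated).

```python
def gff3_line(seqid, ftype, start, end, strand, feature_id, parent=None, source="example"):
    """Format one GFF3 feature line; `parent` links the feature to its parent's ID."""
    attrs = f"ID={feature_id}"
    if parent is not None:
        attrs += f";Parent={parent}"
    return "\t".join([seqid, source, ftype, str(start), str(end), ".", strand, ".", attrs])


def parent_map(lines):
    """Map each feature ID to its parent ID (single-parent features only)."""
    parents = {}
    for line in lines:
        if line.startswith("#"):
            continue  # directive or comment
        attrs = dict(item.split("=", 1) for item in line.split("\t")[8].split(";"))
        if "Parent" in attrs:
            parents[attrs["ID"]] = attrs["Parent"]
    return parents


def depth(feature_id, parents):
    """Number of Parent links between a feature and its top-level ancestor."""
    d = 0
    while feature_id in parents:
        feature_id = parents[feature_id]
        d += 1
    return d


# Hypothetical three-level hierarchy: gene -> mRNA -> exons.
EXAMPLE = [
    "##gff-version 3",
    gff3_line("chr1", "gene", 1000, 5000, "+", "gene1"),
    gff3_line("chr1", "mRNA", 1000, 5000, "+", "mRNA1", parent="gene1"),
    gff3_line("chr1", "exon", 1000, 1500, "+", "exon1", parent="mRNA1"),
    gff3_line("chr1", "exon", 4000, 5000, "+", "exon2", parent="mRNA1"),
]
```

BED has no equivalent of the `Parent` link; a BED12 line can describe blocks within a single feature, but not arbitrary nesting.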


          • #6
            Thanks, Arvid, for the helpful link.
            I asked the previous question for one reason. As I mentioned in my first post, I have a group of samples (RNA-Seq short reads) that I am analyzing. I am now required to display the alignment results in GBrowse. When I started reading about GBrowse, I read that it accepts the GFF3 format. So now I don't know what to do:
            1) Should I convert the alignment output I got from bowtie?
            2) Should I redo the alignment using another tool that produces GFF3, so that I can use it with GBrowse right away? What are these tools?

            Any other suggestions are appreciated
            Thanks


            • #7
              IMHO you should do neither 1 nor 2, but instead display the BAM in GBrowse directly and/or use a bigWig track to display coverage when zoomed out. Importing GFF will make the database huge and will probably slow down the user interface.

              It is a bit outdated, but this is probably the place to start: http://gmod.org/wiki/GBrowse_NGS_Tutorial. You could probably start at "Tell GBrowse About the SAM Files", since many of the manual steps they do before that aren't applicable anymore (unmunging etc.). Just make sure you have a coordinate-sorted and indexed BAM, and you should be good to go.
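Getting to a coordinate-sorted and indexed BAM could be sketched as below, assuming samtools 1.x is installed and on the PATH (older 0.1.x releases used a different `sort` syntax). The helper names and file names are hypothetical; `samtools sort` sorts by coordinate by default, and `samtools index` writes the index next to the sorted BAM.

```python
import subprocess


def sort_and_index_commands(alignment_path, sorted_bam):
    """Build the samtools commands for coordinate-sorting and indexing one file."""
    return [
        # Coordinate sort is the default order for `samtools sort`.
        ["samtools", "sort", "-o", sorted_bam, alignment_path],
        # Creates the index file alongside the sorted BAM.
        ["samtools", "index", sorted_bam],
    ]


def run(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For around 50 samples, something like `run(sort_and_index_commands("sample1.sam", "sample1.sorted.bam"))` could be called in a loop over the alignment files.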


              • #8
                Thank you, I'll try your suggestion and get back to you!
