  • Indexing with bowtie2

    Hello guys!

    I'm starting to work with bean RNA-seq data, and I want to understand what kind of file I should use to build the index for the reference genome. Should it be the FASTA file or the GTF/GFF file?
    I understand that bowtie2 uses the FASTA file to create the index and that the GTF/GFF file is passed to TopHat, but are both of them the reference genome?

    Thanks for your attention!

  • #2
    Alignment indexes are built from the FASTA file of the reference genome. This is generally a multi-FASTA file, with one record per chromosome or scaffold.
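To make this concrete, here is a minimal Python sketch, not a definitive recipe. It lists the record names in a multi-FASTA stream and assembles the `bowtie2-build` command line (reference FASTA first, then the index prefix). The file name `Pvulgaris_genome.fa`, the prefix `Pvulgaris_index`, and the tiny example sequences are hypothetical placeholders, not data from this thread.

```python
import io


def fasta_record_names(handle):
    """Yield the name of each record in a (multi-)FASTA stream."""
    for line in handle:
        if line.startswith(">"):
            # The record name is the first whitespace-delimited token after '>'.
            tokens = line[1:].split()
            yield tokens[0] if tokens else ""


def bowtie2_build_command(fasta_path, index_prefix):
    """Return the argument list for: bowtie2-build <reference.fa> <index_prefix>."""
    return ["bowtie2-build", fasta_path, index_prefix]


# Hypothetical two-record multi-FASTA, held in memory for illustration only.
example = io.StringIO(">Chr01 example\nACGT\nACGT\n>Chr02\nGGCC\n")
names = list(fasta_record_names(example))

# Hypothetical file name and index prefix.
command = bowtie2_build_command("Pvulgaris_genome.fa", "Pvulgaris_index")

print(names)
print(" ".join(command))
```

The argument list could be passed to `subprocess.run` on a machine where Bowtie 2 is installed. The index is built from the FASTA file alone; no annotation file is involved at this step.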

    If you are just starting out, you may want to switch to HISAT2, a newer aligner from the same group that developed TopHat. They recommend that users switch to it.
    Last edited by GenoMax; 05-03-2016, 01:09 PM.


    • #3
      Thanks, GenoMax, but HISAT2 seems to be meant for mapping next-generation sequencing reads mainly against the general human population.
      My work is with the common bean genome (Phaseolus vulgaris).

      Do you know what the GTF/GFF file is used for? Is it for TopHat?


      • #4
        The wording on the HISAT2 site makes it sound like it is meant for human data, but I don't think that is the case. Since many people (perhaps including the authors of HISAT2) work on human data, that reference may have crept into the description. The paper itself says:

        HISAT supports genomes of any size, including those larger than 4 billion bases.

        A GTF/GFF file describes features (gene models, transcripts). If you are only interested in mapping against a defined transcriptome, you can provide TopHat with a GTF file.
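As a rough illustration of what such a file contains, the sketch below parses one GTF line (nine tab-separated columns) into a feature record and assembles a TopHat command that supplies the annotation through the `-G` option. The example line, the gene and transcript IDs, and the file names `Pvulgaris_genes.gtf`, `Pvulgaris_index`, and `reads.fastq` are hypothetical placeholders.

```python
def parse_gtf_line(line):
    """Split one GTF line into its nine tab-separated columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attributes = fields
    return {
        "seqname": seqname,        # chromosome or scaffold name, as in the FASTA
        "feature": feature,        # e.g. exon, CDS, transcript
        "start": int(start),       # 1-based, inclusive
        "end": int(end),           # 1-based, inclusive
        "strand": strand,
        "attributes": attributes,  # gene_id, transcript_id, ...
    }


def tophat_command(gtf_path, index_prefix, reads_path):
    """Return the argument list for: tophat -G <genes.gtf> <index_prefix> <reads>."""
    return ["tophat", "-G", gtf_path, index_prefix, reads_path]


# Hypothetical annotation line for illustration only.
gtf_line = 'Chr01\texample\texon\t100\t250\t.\t+\t.\tgene_id "g1"; transcript_id "g1.t1";'
record = parse_gtf_line(gtf_line)

# Hypothetical file names; the index prefix is the one given to bowtie2-build.
command = tophat_command("Pvulgaris_genes.gtf", "Pvulgaris_index", "reads.fastq")

print(record["feature"], record["start"], record["end"])
print(" ".join(command))
```

Note that the GTF holds only coordinates and feature labels, no sequence, so it cannot be indexed in place of the FASTA. Its sequence names need to match the record names in the FASTA used to build the index.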


        • #5
          Thanks GenoMax
