  • questions on the three index files for RNAseq analysis

    Hi everyone,
    While preparing the count file, I came across three index files: the .bt2 files from bowtie2-build, the .fa.fai file from samtools faidx, and the .bam.bai file from samtools index.

    My first question is:
    Does anyone know the difference between them and what each one is for?

    According to some tutorials, the first two are used directly in the subsequent commands; for the last one, it is the .bam file, not the .bam.bai file, that is passed to the subsequent command.

    Why can't the last one be used directly in the subsequent command?

    Thanks a lot!!

    Richard

  • #2
    The .bt2 files are used by bowtie2. The .fai is just the index for the reference genome FASTA file. I think tophat2 makes use of that, though I don't recall (it's very small, regardless). If you ever need to quickly extract a region of the genome, you can run "samtools faidx genome.fa chrX:1000-20000", which uses the .fai file internally to know where to seek. The .bai file is similar, but it allows samtools to quickly extract reads from a specified location in a BAM file. If you ever need to run "samtools view foo.bam chrX:1000-20000", then you'll find that you need the .bai file. The .bai file is also used by programs like IGV (for the same reason, in fact).
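
To illustrate the "know where to seek" point, here is a minimal Python sketch of a .fai-style lookup. It records, per sequence, the same four values a .fai file stores (length, byte offset of the first base, bases per line, bytes per line) and uses them to jump straight to a region. The function names `build_fai` and `fetch` are made up for this example, and it assumes every sequence line except the last has the same width, as samtools faidx also requires; it is not how samtools itself is implemented.

```python
def build_fai(fasta_path):
    """Return {name: (length, offset, line_bases, line_width)}, like a .fai file."""
    index = {}
    name = None
    with open(fasta_path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # The first base starts at the byte right after the header line.
                index[name] = [0, fh.tell(), 0, 0]
            elif name is not None:
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:
                    entry[2] = bases      # bases per full line
                    entry[3] = len(line)  # bytes per full line, newline included
                entry[0] += bases
    return {key: tuple(value) for key, value in index.items()}


def fetch(fasta_path, index, name, start, end):
    """Return bases start..end (1-based, inclusive) by seeking, not scanning."""
    length, offset, line_bases, line_width = index[name]
    first = start - 1             # 0-based position of the first wanted base
    last = min(end, length) - 1   # 0-based position of the last wanted base
    # Byte position of a base = offset + full lines skipped + position in line.
    begin = offset + (first // line_bases) * line_width + first % line_bases
    stop = offset + (last // line_bases) * line_width + last % line_bases + 1
    with open(fasta_path, "rb") as fh:
        fh.seek(begin)
        chunk = fh.read(stop - begin)
    # Drop the newline bytes that fall inside the requested span.
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()
```

With the index in hand, `fetch("genome.fa", index, "chrX", 1000, 20000)` reads only the bytes covering that region, regardless of how large the genome file is. The .bai index serves the same purpose for a sorted BAM file, although its internal layout is different.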


    • #3
      Hi Devon,
      Glad to meet you again. Thanks a lot!!
      So, in the subsequent commands, passing the .sam file is correct no matter which program is used, and if a program needs the .bai file, it will find it on its own, right?


      • #4
        They'll find it if you're using the BAM file and they have a need for the index (which needs to be in the same directory as the BAM file to which it belongs). If a program just needs a SAM file (e.g., htseq-count), then it's not doing random access and won't bother with the index.
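
As a rough sketch of the "same directory" rule, the Python snippet below looks for an index next to a BAM file. The helper name `find_bam_index` is hypothetical. It checks the name written by "samtools index" (foo.bam.bai) and the alternative foo.bai naming; the exact lookup order inside any particular program is an assumption here.

```python
from pathlib import Path


def find_bam_index(bam_path):
    """Return the path of an index sitting next to bam_path, or None."""
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # foo.bam.bai, as written by "samtools index"
        bam.with_suffix(".bai"),           # foo.bai, an alternative naming
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

If this returns None for a BAM file, a region query on that file is likely to fail until the index is created or moved next to it.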


        • #5
          Devon, thanks! I got it.
          Another question:
          When I converted the SAM files to BAM files, I passed the .fa.fai reference file to "samtools import" instead of the .fa reference file, because I did it before I got your answer. Do you think that is OK for the conversion of the SAM files, or do I have to convert them again using the .fa reference file?


          • #6
            What version of samtools are you using that it still has the "import" command? That's extremely old and you'd be advised to upgrade. "samtools view -bS foo.sam > foo.bam" doesn't require any index files. I don't recall what the syntax was for the old import command, since it hasn't been used in forever :P
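
One way to see why that conversion needs no index, assuming the SAM file has a header (bowtie2 writes one by default): the @SQ header lines already list each reference name (SN) and length (LN). The Python sketch below just collects them; `reference_lengths` is a name invented for this example.

```python
def reference_lengths(sam_lines):
    """Collect {reference name: length} from the @SQ header lines of a SAM file."""
    refs = {}
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment record
        if line.startswith("@SQ"):
            # Fields after the record type look like "SN:chrX" and "LN:1000".
            fields = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
            refs[fields["SN"]] = int(fields["LN"])
    return refs
```

For example, `reference_lengths(open("foo.sam"))` on a file with a header returns the reference names and lengths. An empty result means the file has no @SQ lines, in which case that information has to come from somewhere else, such as the first two columns of a .fa.fai file.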


            • #7
              I used samtools-0.1.17
