  • Going from transcriptome to genome coordinates with a bam file

    I'm venturing into RNA editing in mouse. One of the most common ways to avoid false positives is to map reads to a transcriptome or a custom set of junction sequences, and then to the genome. That's the easy part.

    Once I have the mapped reads for the transcriptome, does anyone know of a good tool/method to convert those coordinates to genome coordinates while updating CIGAR strings to show split reads?

  • #2
    What you are asking for is essentially another mapping, isn't it? First you would construct a map from your transcriptome to your genome (probably easier if a genome annotation is available), and then use that map to carry your reads from transcriptome coordinates over to genome coordinates. That's the idea I could think of. It seems like a nice problem to spend some time on. I am not aware of any existing tool that does this. I'll work on it a bit and post back if I get anywhere.

    Just one question: did you construct your transcriptome yourself (from an annotation file)? Or do you have a GFF file at all?

    • #3
      Originally posted by cedance
      What you are asking for is essentially another mapping, isn't it? First you would construct a map from your transcriptome to your genome (probably easier if a genome annotation is available), and then use that map to carry your reads from transcriptome coordinates over to genome coordinates. That's the idea I could think of. It seems like a nice problem to spend some time on. I am not aware of any existing tool that does this. I'll work on it a bit and post back if I get anywhere.

      Just one question: did you construct your transcriptome yourself (from an annotation file)? Or do you have a GFF file at all?

      It's essentially a solved problem, since this is what mappers like TopHat do. However, my Python skills are not sufficient to work out how that code works so that I could apply it to a separate BAM file. Any help you could provide would be great.
      I'm using mm9 with a UCSC GTF.

      • #4
        Use TopHat. You will need a GTF file of exons, CDS, etc. TopHat will map to the known transcriptome first, then map to the rest of the genome.

        • #5
          Originally posted by golharam
          Use TopHat. You will need a GTF file of exons, CDS, etc. TopHat will map to the known transcriptome first, then map to the rest of the genome.

          TopHat's output isn't sufficient for what I want to do. One reason is that a common filtering step for RNA edits is based on MAPQ, and the MAPQ values TopHat reports don't correlate with alignment quality.

          • #6
            Excuse the bad terminology. By "mapping" I didn't mean the usual sense of aligning reads to your genome, but rather a mapping in the "function" sense: an association, if you will.

            To rephrase: you'll need an association from every coordinate of your transcriptome to the corresponding coordinate of your genome. Imagine a read aligned to your transcriptome on "Chr1" at position 1500 with the CIGAR string "80M". Imagine also that, had you mapped it to your reference genome, its CIGAR string would be "30M60N50M", meaning the read is spliced at this position across a 60-base intron. To produce that, the only way I can think of right now is to know that positions 1500-1529 of your transcriptome correspond to 1500-1529 of your reference genome, whereas 1530-1579 of your transcriptome correspond to 1530+60 = 1590 through 1639. Hence the need for an association from transcriptome to genome.
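
            The 80M to 30M60N50M example can be sketched in a few lines of Python. This is only a rough illustration, not how TopHat does it: the function name transcript_to_genome and the exon list are made up for the example, coordinates are assumed to be 1-based and inclusive, and only plus-strand transcripts are handled (a minus-strand transcript would also need the CIGAR reversed and the read reverse-complemented).

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MD=X")  # operations that advance along the reference


def transcript_to_genome(exons, t_start, cigar):
    """Map a plus-strand transcript alignment onto the genome.

    exons   -- sorted list of (start, end) genomic intervals, 1-based inclusive
    t_start -- 1-based alignment start within the spliced transcript
    cigar   -- CIGAR string of the transcriptome alignment
    Returns (genomic_start, genomic_cigar).
    """
    # Find the exon that contains t_start and the offset inside that exon.
    remaining = t_start - 1
    idx = 0
    while idx < len(exons):
        length = exons[idx][1] - exons[idx][0] + 1
        if remaining < length:
            break
        remaining -= length
        idx += 1
    else:
        raise ValueError("start lies beyond the end of the transcript")

    g_start = exons[idx][0] + remaining
    pos = g_start  # next genomic base to be consumed
    out = []

    def emit(n, op):
        # Merge with the previous operation when the type is the same.
        if out and out[-1][1] == op:
            out[-1][0] += n
        else:
            out.append([n, op])

    for n, op in ((int(n), op) for n, op in CIGAR_RE.findall(cigar)):
        if op not in REF_CONSUMING:
            emit(n, op)  # insertions and clips carry over unchanged
            continue
        while n > 0:
            if pos > exons[idx][1]:
                # The current exon is used up: skip the intron with an N.
                if idx + 1 >= len(exons):
                    raise ValueError("alignment runs past the last exon")
                emit(exons[idx + 1][0] - exons[idx][1] - 1, "N")
                idx += 1
                pos = exons[idx][0]
            step = min(n, exons[idx][1] - pos + 1)
            emit(step, op)
            pos += step
            n -= step

    return g_start, "".join(f"{n}{op}" for n, op in out)


# Hypothetical exons chosen to reproduce the example in this post:
# exon 1 ends at 1529 and exon 2 starts at 1590, i.e. a 60-base intron.
example_exons = [(1, 1529), (1590, 3000)]
result = transcript_to_genome(example_exons, 1500, "80M")
print(result)
```

            With these made-up exons, the read starting at transcript position 1500 keeps genomic start 1500 and picks up the 60N between its two aligned blocks.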

            Going by this logic, if the GTF/GFF files for your transcriptome and genome share gene IDs (or you know which RNA ID of your transcriptome corresponds to which gene, and its coordinates), then it may well be possible to establish this association. If I have confused you again or misunderstood the question entirely, excuse the mess!
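
            As a rough sketch of how that association could be read out of an annotation: the snippet below collects the exon intervals of each transcript from GTF lines, relying only on the standard nine tab-separated GTF columns and the transcript_id attribute. The function name exons_by_transcript and the two example records are hypothetical; a real UCSC GTF for mm9 would be read from a file instead.

```python
import re

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def exons_by_transcript(gtf_lines):
    """Collect sorted exon intervals per transcript from GTF lines.

    Returns {transcript_id: {"chrom": ..., "strand": ..., "exons": [(start, end), ...]}}
    with 1-based inclusive coordinates, as stored in GTF.
    """
    models = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records define the spliced transcript
        match = TRANSCRIPT_ID_RE.search(fields[8])
        if match is None:
            continue
        model = models.setdefault(
            match.group(1),
            {"chrom": fields[0], "strand": fields[6], "exons": []},
        )
        model["exons"].append((int(fields[3]), int(fields[4])))
    for model in models.values():
        model["exons"].sort()  # ascending genomic order
    return models


# Two made-up exon records for a hypothetical transcript "tx1".
example_gtf = [
    'chr1\tdemo\texon\t1590\t3000\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";',
    'chr1\tdemo\texon\t1\t1529\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";',
]
models = exons_by_transcript(example_gtf)
print(models["tx1"])
```

            The sorted exon list per transcript is exactly the lookup needed to translate a transcript position and CIGAR string into genomic ones, provided the transcript names in the BAM file match the transcript_id values in the GTF.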
