  • Adding XS flag to Novoalign reads

    Hi folks, here is the scenario -

    I have unpaired, unstranded reads that have already been aligned using Novoalign by another lab, and I have the resulting BAM files. Our pipeline uses GSNAP/Cufflinks, and I want to compare DE results for the two pipelines. (I also have the raw reads from the other lab, and have aligned them using GSNAP.)

    Cufflinks wants to see the XS flag on spliced reads, which Novoalign does not generate. It throws an error for such reads and apparently ignores them.

    Nonetheless, the FPKMs found for most genes actually compare very closely for the two pipelines, which is encouraging - but there are also some big discrepancies, so I want to add the XS flag to the Novoalign reads and make sure they are being counted. (I should add that for this test I am comparing the same sample using output from the two different aligners, which used the same reference genome; so in principle all FPKMs should be close.)

    I tried bamutils (-xs option) for this - but it adds the XS tag to ALL Novoalign reads, and also tinkers with some of the other tags - now the FPKMs are wildly different from GSNAP! A quick test where I kept just the unspliced reads from both pipelines shows that either Cufflinks does not know how to handle the XS flag when applied to all reads, or the other changes made by bamutils have screwed things up.

    Does anyone have experience with this, or suggestions for another tool to add the XS flag? I can probably figure it out myself, but I have already invested a lot more time in this than expected (of course that is not unexpected, is it?)

    Thanks!

    Randy

  • #2
    An idea that occurs to me is to use a regex to detect splice junctions, and append the tag to all lines that match it.
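
A minimal sketch of that idea, assuming the alignments have been dumped to SAM text first (for example with `samtools view -h`). It relies on the SAM convention that an `N` operation in the CIGAR string marks a skipped reference region, i.e. a splice junction. The function names `is_spliced` and `append_xs` are made up for illustration, and the strand is passed in by the caller because how to choose it is a separate question. The regex is applied to the CIGAR column only, not the whole line, since the read sequence can itself contain `N`.

```python
import re

# An 'N' operation in the CIGAR string (SAM column 6) marks a skipped
# reference region, i.e. an intron in a spliced alignment.
SPLICED_CIGAR = re.compile(r"\d+N")


def is_spliced(sam_line: str) -> bool:
    """Return True if an alignment line has an N operation in its CIGAR."""
    if sam_line.startswith("@"):  # header lines are never alignments
        return False
    fields = sam_line.rstrip("\n").split("\t")
    # A valid alignment line has at least the 11 mandatory SAM columns.
    return len(fields) >= 11 and SPLICED_CIGAR.search(fields[5]) is not None


def append_xs(sam_line: str, strand: str) -> str:
    """Append XS:A:<strand> to a spliced alignment that lacks an XS tag.

    Headers, unspliced alignments, and lines that already carry XS are
    returned unchanged (minus the trailing newline).
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    line = sam_line.rstrip("\n")
    if not is_spliced(line):
        return line
    optional_fields = line.split("\t")[11:]
    if any(f.startswith("XS:") for f in optional_fields):
        return line
    return line + "\tXS:A:" + strand
```

Because only spliced lines are touched and no existing tag is rewritten, this avoids the two side effects described in the first post (XS on every read, and altered tag types).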



    • #3
      Yes, that is true - I can just look for the spliced alignments, and use the SAM flag to generate the XS tag, trusting that Novoalign made use of the splice signal info.

      I am curious though why the addition by bamutils so completely bollocksed things up. The other changes to the tags look innocuous (data format changes, 'A' instead of 'Z' for a single character chromosome name).
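
As a rough sketch of the flag-to-tag step: bit 0x10 of the SAM FLAG field is set when the read aligned to the reverse strand, so it can be mapped directly to an XS value. The helper names below are hypothetical. Note the assumption baked in here, which is the one stated above and not something verified against Novoalign: for unstranded reads the alignment orientation is only a proxy for the transcript strand, so the result is trustworthy only if the aligner oriented spliced reads according to the splice signals.

```python
REVERSE_STRAND_BIT = 0x10  # SAM FLAG bit: read aligned to the reverse strand


def strand_from_flag(flag: int) -> str:
    """Map the SAM FLAG reverse-strand bit to '+' or '-'."""
    return "-" if flag & REVERSE_STRAND_BIT else "+"


def xs_tag_for(sam_line: str) -> str:
    """Build an XS:A tag from the FLAG column (column 2) of a SAM line.

    Assumption: alignment orientation reflects the transcript strand.
    """
    fields = sam_line.rstrip("\n").split("\t")
    return "XS:A:" + strand_from_flag(int(fields[1]))
```

Comparing the resulting tags against the XS values GSNAP assigned to the same reads would be one way to check whether the assumption holds before running Cufflinks.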

      Thanks Adam!
