
  • #31
    You can avoid matching things like 'TRPS1' when using grep by doing something like

    Code:
    grep -P '\bRPS'
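A minimal sketch of what the word boundary changes, using a made-up list of gene names (the list is illustrative, not taken from any annotation). It assumes a grep built with PCRE support, which `-P` requires.

```shell
# Hypothetical gene names, one per line (illustrative only).
genes=$(printf '%s\n' RPS6 RPS27A TRPS1 MRPS10 RPL3)

# Plain substring match: also returns TRPS1 and MRPS10.
substring_hits=$(printf '%s\n' "$genes" | grep 'RPS')

# Word-boundary match: only names where "RPS" starts a word.
boundary_hits=$(printf '%s\n' "$genes" | grep -P '\bRPS')

echo "substring match:"
echo "$substring_hits"
echo "word-boundary match:"
echo "$boundary_hits"
```

With this toy list the boundary version keeps RPS6 and RPS27A and drops TRPS1, but it also drops MRPS10.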



    • #32
      Very interesting and informative. Thank you for sharing this information.



      • #33
        Originally posted by mastal View Post
        You can avoid matching things like 'TRPS1' when using grep by doing something like

        Code:
        grep -P '\bRPS'
        "\bRPS" does not match MRPS10 and other ribosomal genes that do not begin with "RPS".

        grep matching on gene names can easily fail when your aim is to capture all genes in a particular class:
        1. You may miss some genes of interest that do not match your pattern.
        2. You may match genes that are not of interest, and it's hard to notice if your total list contains thousands of genes.


        If you aim to capture just a few genes, you can probably grep and manually confirm that you have everything you need.

        Visual inspection is not appropriate when you have thousands of genes: that is too many to check by eye, and gene names have synonyms that you will neither recognize by visual inspection nor match with your patterns.
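One way to make both failure modes visible is to compare the grep output against a curated list of the genes you actually want. The sketch below assumes such a list exists; both input lists here are made up for illustration, and the file names are hypothetical.

```shell
# Use byte-wise ordering so that sort and comm agree.
export LC_ALL=C
tmp=$(mktemp -d)

# Hypothetical curated list of wanted genes (illustrative only).
printf '%s\n' MRPS10 RPS27A RPS6 | sort > "$tmp/wanted.txt"
# Hypothetical full set of gene names in the annotation.
printf '%s\n' MRPS10 RPL3 RPS27A RPS6 RPS6KA1 TRPS1 | sort > "$tmp/all.txt"

# What the name pattern actually returns.
grep -P '\bRPS' "$tmp/all.txt" | sort > "$tmp/matched.txt"

# Failure 1: wanted genes that the pattern missed.
missed=$(comm -23 "$tmp/wanted.txt" "$tmp/matched.txt")
# Failure 2: matched genes that are not in the wanted list.
spurious=$(comm -13 "$tmp/wanted.txt" "$tmp/matched.txt")

echo "missed: $missed"
echo "spurious: $spurious"
```

In this toy case the pattern misses MRPS10 and picks up RPS6KA1, which is a kinase gene rather than a ribosomal protein gene. The comparison is only as good as the curated list it is checked against.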



        • #34
          Does TopHat remove rRNA reads automatically?

          Hi,
          I have the opposite problem to mart555. I mapped my RNA-seq reads using a mouse gff3 file downloaded from NCBI. I didn't change the file, so the rRNA sequences are still in it. After mapping with TopHat, I BLASTed the unmapped reads and many of them matched rRNA. The command I ran is:
          tophat -p 8 -G $annotation -o out $database L1_1.fq.gz L1_2.fq.gz

          So my question is:
          (1) Does TopHat automatically filter out rRNA reads even if the gff file has rRNA annotation?
          (2) I tried using only Bowtie2 instead of TopHat. The result is better, but some of the unmapped reads still match rRNA. Does Bowtie2 also filter out some reads when indexing?

          Thank you.



          • #35
            Hi all,

            I manually removed rRNA using the Linux command:

            $ grep -v 'rRNA' genes.gtf > new_genes.gtf
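Note that `grep -v 'rRNA'` drops every line that contains the string anywhere. A hedged alternative is to anchor the match on the biotype attribute. The sketch below assumes an Ensembl-style GTF with a `gene_biotype` attribute (other sources may use `gene_type` or something else); the three records and the `note` attribute are made up for illustration.

```shell
tmp=$(mktemp -d)

# Toy tab-separated GTF records (illustrative only).
printf 'chr1\ttoy\tgene\t100\t200\t.\t+\t.\tgene_id "g1"; gene_biotype "rRNA";\n' > "$tmp/genes.gtf"
printf 'chr1\ttoy\tgene\t300\t900\t.\t+\t.\tgene_id "g2"; gene_biotype "protein_coding"; note "rRNA processing";\n' >> "$tmp/genes.gtf"
printf 'chr1\ttoy\tgene\t950\t990\t.\t-\t.\tgene_id "g3"; gene_biotype "protein_coding";\n' >> "$tmp/genes.gtf"

# Substring filter: removes g1, but also g2 because of its note.
grep -v 'rRNA' "$tmp/genes.gtf" > "$tmp/new_genes_substring.gtf"

# Attribute-anchored filter: removes only records whose biotype is rRNA.
grep -v 'gene_biotype "rRNA"' "$tmp/genes.gtf" > "$tmp/new_genes_attribute.gtf"

wc -l "$tmp/new_genes_substring.gtf" "$tmp/new_genes_attribute.gtf"
```

The anchored pattern would need extending if your annotation uses additional rRNA-related biotype labels.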

            One question though: if you are looking at differential gene expression between conditions (treatment vs. control, for example), shouldn't you remove tRNA as well?

            Cheers
            -G



            • #36
              Technically you don't need to remove rRNA from the annotation; you need to remove it from your library. And if the library is poly-A selected, that will remove tRNA as well. If you do see some differential expression of tRNAs, well, then you do; it should not influence any other genes.



              • #37
                If you have a gtf/gff of the rRNA genes, you can filter alignment results with Cufflinks using the '-M/--mask-file' option.

                http://cole-trapnell-lab.github.io/c...nks/index.html

                Tells Cufflinks to ignore all reads that could have come from transcripts in this GTF file. We recommend including any annotated rRNA, mitochondrial transcripts, and other abundant transcripts you wish to ignore in your analysis in this file. Due to variable efficiency of mRNA enrichment methods and rRNA depletion kits, masking these transcripts often improves the overall robustness of transcript abundance estimates.
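A rough sketch of how such a mask file could be assembled and passed to Cufflinks. It assumes an Ensembl-style `gene_biotype` attribute and a mitochondrial contig named `chrM`; the toy records, the output directory name, and the `out/accepted_hits.bam` path (TopHat's usual output file under the `-o out` directory used earlier in the thread) are assumptions. The Cufflinks call is only printed, not run.

```shell
tmp=$(mktemp -d)

# Toy tab-separated GTF records (illustrative only).
printf 'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1"; gene_biotype "rRNA";\n' > "$tmp/genes.gtf"
printf 'chr1\ttoy\texon\t300\t900\t.\t+\t.\tgene_id "g2"; transcript_id "t2"; gene_biotype "protein_coding";\n' >> "$tmp/genes.gtf"
printf 'chrM\ttoy\texon\t10\t500\t.\t+\t.\tgene_id "g3"; transcript_id "t3"; gene_biotype "protein_coding";\n' >> "$tmp/genes.gtf"

# Keep records that are annotated as rRNA or sit on the mitochondrial contig.
awk -F'\t' '$1 == "chrM" || $9 ~ /gene_biotype "rRNA"/' "$tmp/genes.gtf" > "$tmp/mask.gtf"

# Command you would then run (printed here rather than executed).
echo "cufflinks -M $tmp/mask.gtf -o cufflinks_out out/accepted_hits.bam"
```

Masking this way leaves the alignments untouched, so the same BAM file can be re-quantified later with a different mask.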
