  • Negative RPKM values by EDGE-pro

    Hello,

    I am using your EDGE-pro programs for RNA-seq analysis
    of bacterial species.
    The program runs, but I get negative RPKM values for some of
    the samples.

    I found that rRNA.numberReads was larger than numberReads or
    numberUniqueReads, so I got negative values for
    rpkm.numberReads.
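To illustrate how a negative value could arise, here is a minimal sketch. It assumes (this is not confirmed from the EDGE-pro source) that the RPKM denominator is the total read count minus the rRNA read count. The function name and all numbers are hypothetical.

```python
def rpkm(gene_reads, gene_length_bp, total_reads, rrna_reads):
    """RPKM with rRNA reads excluded from the denominator (assumed formula)."""
    # Assumption: rRNA reads are subtracted from the total before normalising.
    # If rrna_reads > total_reads, this term is negative and so is the result.
    effective_total = total_reads - rrna_reads
    return gene_reads * 1e9 / (gene_length_bp * effective_total)


# Made-up counts: one consistent sample, one where rRNA exceeds the total.
normal = rpkm(500, 1000, 2_000_000, 500_000)
broken = rpkm(500, 1000, 2_000_000, 2_500_000)
print(normal, broken)
```

Under that assumption the sign of the result follows the sign of the denominator, which would match the observation that only samples with rRNA.numberReads above numberReads go negative.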

    I tried to find the reason by reading the Perl script, but the
    read counting is done by a compiled binary.

    I would appreciate it if you could tell me why this happens and
    how to run the program correctly.

    Thanks in advance,
    Hideaki

  • #2
    Why don't you send an email to the address listed on the EDGE-pro webpage?

    • #3
      Hi dpryan,

      Thank you for your advice. Actually I did, but I have not received a reply.
      I would appreciate any ideas.

      Thank you,
      Hideaki

      • #4
        The only thing I can think of is that the RPKM values are log transformed, but the manual doesn't seem to mention that they're log transformed. When did you try emailing them? It's possible that they haven't had a chance to reply yet.
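As a quick illustration of the log-transform idea (which the manual does not confirm): any RPKM value between 0 and 1 becomes negative under a base-2 log. The values below are made up.

```python
import math

# Hypothetical RPKM values; anything below 1.0 has a negative log2.
rpkm_values = [0.25, 1.0, 8.0]
log2_values = [math.log2(v) for v in rpkm_values]
print(log2_values)
```

If the negative values were caused by a log transform, they would show up for weakly expressed genes in every sample, not only in samples where the rRNA count exceeds the total.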

        • #5
          Hi blakeoft,

          Thank you. As you pointed out, I will wait for a reply to my email.

          I tried to read the Perl script, but the counting is done by
          a compiled binary, "count", so I need help from a more experienced person.

          I have only waited 3 days so far, so I will wait longer.

          Thank you,
          Hideaki

          • #6
            You might have to give it a few weeks. Many authors are pretty bad at getting back to people in a timely manner.

            • #7
              Originally posted by hi-koike
              Hi blakeoft,

              Thank you. As you pointed out, I will wait for a reply to my email.

              I tried to read the Perl script, but the counting is done by
              a compiled binary, "count", so I need help from a more experienced person.

              I have only waited 3 days so far, so I will wait longer.

              Thank you,
              Hideaki

              Did you finally solve this? I have the same problem.
              Regards

              • #8
                I am experiencing the same problem: the value in my .numberReads file (total number of reads) is smaller than the value in the rRNA.numberReads file (number of reads mapped to rRNAs). I don't understand how this is possible, as I would interpret 'total number of mapped reads' as including the reads from rRNA.

                Has anyone else had this issue, or found a way to resolve it? I have contacted the EDGE-pro developers and hope to hear back soon.
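To find out which samples are affected before looking at any RPKM values, a small check like the sketch below may help. It takes the counts as plain integers, because the layout of the .numberReads and rRNA.numberReads files is not described in this thread. The function name, sample names and numbers are hypothetical.

```python
def flag_inconsistent_samples(counts):
    """Return the names of samples whose rRNA count exceeds the total count.

    counts maps a sample name to a (total_reads, rrna_reads) tuple.
    """
    return sorted(name for name, (total, rrna) in counts.items() if rrna > total)


# Made-up counts, copied by hand from each sample's output files.
samples = {
    "sampleA": (2_000_000, 500_000),
    "sampleB": (1_800_000, 2_100_000),
}
flagged = flag_inconsistent_samples(samples)
print(flagged)
```

Any sample returned here has the same inconsistency described above and would be expected to produce negative values.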

                • #9
                  Try removing rRNA, tRNA, transposon reads.

                  To minimise the impact of the bug, try removing the reads that match the rRNA, tRNA and transposase genes in the genome of the analysed organism, then rerun the mapping/analysis.

                  Also check the definition of the rRNA regions (how the program gets those read numbers).

                  First make sure you have a multi-FASTA file with the rRNA (full rrn loci) and tRNA sequences for your organism. Then you can use my tool as a simple contaminant remover
                  (assuming adapters have already been clipped):

                  fastq_miseq_trimmer [R1.fastq.gz] [R2.fastq.gz] -base=[output_files_basename] -vs=[rtRNAs.fasta] -nohproc -noclip

                  Other processing options:
                  To remove Illumina adapters, replace -noclip with -ts (Illumina TruSeq adapter clipping). To set a quality clipping threshold, add -epc=0.5 (0.5 = cumulative error probability in a 10 bp sliding window).

                  If your tool expects /1 or /2 as the read-end specifier, and you want ":" in read names replaced with "_",
                  replace -nohproc with -ends=3

                  http://seqanswers.com/forums/attachm...3&d=1463489447

                  • #10
                    hi-koike, was there a resolution to this issue? Or did you use the solution suggested by Markiyan?

                    I'm having the same problem and have emailed the authors but would like to know what, if any, approach worked for you.

                    • #11
                      Checking In

                      Hey, guys -

                      Pretty stumped by this as well: I am in the process of trying to debug 'edge.pl' as well as 'count.cpp'.

                      The code is good but is pretty convoluted and not well documented: pretty standard scientific code, I suppose.

                      Interested in hearing about anyone else's success in figuring out where the bug is!

                      Cheers,

                      Timothy
