
  • runData/file/numberOfReads in 454NewblerMetrics.txt ?

    I don't understand why I get two values for 'numberOfReads' and 'numberOfBases' in the runData/file section of 454NewblerMetrics.txt:

    Code:
    /***************************************************************************
    **
    **      454 Life Sciences Corporation
    **         Newbler Metrics Results
    **
    **      Date of Assembly: 2012/09/18 13:55:53
    **      Project Directory: /home/training/pp-assembly-workshp-bk-noPE
    **      Software Release: 2.6 (20110517_1502)
    **
    ***************************************************************************/
    
    /*
    **  Input information.
    */
    
    runData
    {
    	file
    	{
    		path = "/home/training/Data/Newbler/BAC_Pool_3_SKS_SK35_Reads_RL5.sff";
    
    		numberOfReads = 281603, 237939;
    		numberOfBases = 110946882, 93033055;
    	}
    
    }
    
    /*
    **  Operation metrics.
    */
    
    runMetrics
    {
    	inputFileNumReads  = 281603; 
    	inputFileNumBases  = 110946882; 
    
    	totalNumberOfReads = 237939; 
    	totalNumberOfBases = 93033055; 
    ...
    I read the manual (around page 137 of Manual-v2.6_PartC_Assembly_and_Mapping_May2011.pdf), but it doesn't explain why multiple values would appear here.


    Thanks for any help,
    Dan.
    Homepage: Dan Bolser
    MetaBase the database of biological databases.

  • #2
    I think it's <number of reads/bases>, <number of reads/bases after trimming>.
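One way to test this reading against the file itself is to compare the two values in runData/file with the runMetrics section of the same excerpt. The sketch below is only a rough check, not an official parser: `parse_values` is a hypothetical helper, and the text is copied from the first post rather than read from a real 454NewblerMetrics.txt.

```python
import re

# Excerpt copied from the first post; a real run would read the file from disk.
METRICS_TEXT = """
runData
{
	file
	{
		path = "/home/training/Data/Newbler/BAC_Pool_3_SKS_SK35_Reads_RL5.sff";

		numberOfReads = 281603, 237939;
		numberOfBases = 110946882, 93033055;
	}
}

runMetrics
{
	inputFileNumReads  = 281603;
	inputFileNumBases  = 110946882;

	totalNumberOfReads = 237939;
	totalNumberOfBases = 93033055;
}
"""


def parse_values(text, key):
    """Return the integers assigned to `key` (hypothetical helper, not part of Newbler)."""
    pattern = rf"^\s*{re.escape(key)}\s*=\s*([^;]+);"
    match = re.search(pattern, text, flags=re.MULTILINE)
    if match is None:
        raise KeyError(key)
    return [int(value) for value in match.group(1).split(",")]


reads_first, reads_second = parse_values(METRICS_TEXT, "numberOfReads")
bases_first, bases_second = parse_values(METRICS_TEXT, "numberOfBases")

# Compare each value of the pair with the single-valued runMetrics entries.
checks = {
    "first reads value == inputFileNumReads":
        reads_first == parse_values(METRICS_TEXT, "inputFileNumReads")[0],
    "first bases value == inputFileNumBases":
        bases_first == parse_values(METRICS_TEXT, "inputFileNumBases")[0],
    "second reads value == totalNumberOfReads":
        reads_second == parse_values(METRICS_TEXT, "totalNumberOfReads")[0],
    "second bases value == totalNumberOfBases":
        bases_second == parse_values(METRICS_TEXT, "totalNumberOfBases")[0],
}

for label, ok in checks.items():
    print(f"{label}: {ok}")

# Fraction of the input reads that the second value accounts for.
read_fraction = reads_second / reads_first
print(f"second/first reads: {read_fraction:.3f}")
```

In this excerpt the first value of each pair matches the inputFileNum* entry and the second matches the totalNumberOf* entry, which fits the "before and after trimming" reading, although the file does not label the two values itself.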



    • #3
      Hmm... That's a good point, because I don't know where else the trimming / screening is reported.

      Shame the documentation is lacking.
      Homepage: Dan Bolser
      MetaBase the database of biological databases.



      • #4
        So what does the equivalent set of numbers mean in the case of PE?

        Code:
        pairedReadData
        {
        	file
        	{
        		path = "/home/training/Data/Newbler/SK3911_PairedReadOne_HNFY28401.sff";
        
        		numberOfReads = 469316, 656913;
        		numberOfBases = 160822071, 126727498;
        		numWithPairedRead = 234009;
        	}
        
        }
        Homepage: Dan Bolser
        MetaBase the database of biological databases.



        • #5
          I think it's the same, except that each of the paired-end reads is counted as a different trimmed read.



          • #6
            I don't see how there could be 469k reads but 657k after trimming.
            Homepage: Dan Bolser
            MetaBase the database of biological databases.



            • #7
              I'm not totally sure, but I think read_1 and read_2 of a pair are counted individually in the number of trimmed reads.
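A quick arithmetic check of this idea, using only the numbers from the pairedReadData block in post #4. This is a sketch under a stated assumption, not something the manual confirms: it assumes numWithPairedRead is the number of input reads that were split into two halves, so each of those reads contributes one extra sequence to the second count.

```python
# Values copied from the pairedReadData block in post #4.
sff_reads = 469316              # first numberOfReads value
second_reads = 656913           # second numberOfReads value
sff_bases = 160822071           # first numberOfBases value
second_bases = 126727498        # second numberOfBases value
num_with_paired_read = 234009   # numWithPairedRead

# ASSUMPTION: every read counted in numWithPairedRead is split into two
# halves, and each half is counted separately in the second value.
max_sequences_if_split = sff_reads + num_with_paired_read

# If the assumption holds, the second value cannot exceed this bound;
# any shortfall would be sequences dropped by trimming or screening.
shortfall = max_sequences_if_split - second_reads

# Mean sequence length before and after, as a second sanity check.
mean_len_before = sff_bases / sff_reads
mean_len_after = second_bases / second_reads

print(f"upper bound if pairs are split: {max_sequences_if_split}")
print(f"second value within bound: {second_reads <= max_sequences_if_split}")
print(f"sequences unaccounted for: {shortfall}")
print(f"mean length before: {mean_len_before:.1f}, after: {mean_len_after:.1f}")
```

Under that assumption the numbers are at least self-consistent: the second read count is above the number of input reads but below the split upper bound, while the base count and mean length both go down. It does not prove the interpretation; it only shows that 657k sequences from 469k reads is not impossible.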



              • #8
                Weird! heh.
                Homepage: Dan Bolser
                MetaBase the database of biological databases.



                • #9
                  Check this out for much more info:



                  • #10
                    That's really nice, but I still don't see it described.
                    Homepage: Dan Bolser
                    MetaBase the database of biological databases.



                    • #11
                      With this post, I’ll start going through the output files newbler generates. Some of these will be described in detail as they contain a lot of important information. For today’s post, we’ll start …
