
  • #16
    Originally posted by kmcarr
    I don't know that I can explain it but I can reassure you that what you are seeing is not out of the ordinary.

    1. In a decent run you will not see a significant decrease in the median Q score over only 36 cycles. In recent versions of their base caller Illumina caps the Q score at 34, and it seems the majority of its bases meet this level. (I think they're being a little generous with themselves but that's just one man's opinion.)

    2. I very often see slight deviations in the base call composition over the first 2-3 cycles, so you are not alone. There are two basic possibilities: the abnormal distribution is a true representation of the DNA sequence, meaning that the fragmentation or selection of fragments for library preparation is not strictly random; or the observation is due to some artifact of the sequencing or data analysis. I think the first possibility can be ruled out because I observed the bias in libraries fragmented both by nebulizer and by Covaris. It seems highly unlikely that two completely different fragmentation methods would produce the same non-random result. Also, I found that the degree of bias in the first couple of cycles reduced dramatically in the shift from pipeline v1.3 to pipeline v1.4 (or maybe it was 1.4 to 1.5).

    I used to remove bases from the first 2 cycles as you did but I stopped doing this after the change in the pipeline mitigated the problem. Do you know which pipeline (or RTA) version was used to call the bases?

    Hi kmcarr, thank you for your reply to my questions, and apologies for the delayed response. When I asked the bioinformaticians who initially quality-checked our data which RTA version was used, this is the reply I received:
    "The RTA version used to basecall your sequence was version 1.8.70.0. Pipeline software version 1.4 was last used in October 2009."
    They think the bias might be due to library preparation and the fragmentase used to fragment the sample DNA. Anyhow, I trimmed those first few bases away, which improved subsequent mapping, and I'm glad to see that it is not out of the norm!

    Thanks,
    Laura
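
For anyone who wants to see what trimming the first few cycles amounts to, here is a minimal Python sketch. It is not necessarily how the trimming was done in this thread. The function names, the default of 2 bases, and the example read are all made up for illustration; the only assumption about the data is the standard 4-line FASTQ record layout.

```python
def trim_leading_bases(record, n_trim=2):
    """Drop the first n_trim bases and the matching quality characters.

    record is a (header, sequence, plus_line, quality) tuple for one read.
    """
    header, seq, plus, qual = record
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    return (header, seq[n_trim:], plus, qual[n_trim:])


def trim_fastq_lines(lines, n_trim=2):
    """Yield trimmed records from an iterable of FASTQ lines (4 per read)."""
    stripped = [line.rstrip("\n") for line in lines]
    if len(stripped) % 4 != 0:
        raise ValueError("FASTQ input should contain a multiple of 4 lines")
    for i in range(0, len(stripped), 4):
        yield trim_leading_bases(tuple(stripped[i:i + 4]), n_trim)


# Made-up read, for illustration only.
example = ["@read1", "GATTACAGATTACA", "+", "IIIIIIIIIIHHHH"]
for rec in trim_fastq_lines(example, n_trim=2):
    print("\n".join(rec))
```

Sequence and quality are sliced together so the two strings stay the same length, which downstream mappers generally expect.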



    • #17
      /usr/local/bin/fastq_quality_boxplot_graph.sh [-i INPUT.TXT] [-t TITLE] [-p] [-o OUTPUT]

      Try the "-p" option; it generates a PostScript file, which can be opened with GSview.



      Originally posted by son_nexg
      Thanks a lot, -Q option worked for me!!

      I am now onto the next step and generating the nucleotide distribution graph by doing:

      fastx_nucleotide_distribution_graph.sh \
      -i /PATH/Sample_xx.fastq.stats \
      -o /PATH/Sample_xx.fastq.stats_nuc.png \
      -t Sample_xx


      But getting the following error:

      line 0: undefined variable: vertical

      line 0: undefined variable: invert


      gnuplot> set style histogram rowstacked
      ^
      line 0: expecting 'data', 'function', 'line', 'fill' or 'arrow'


      gnuplot> set style data histograms
      ^
      line 0: expecting 'lines', 'points', 'linespoints', 'dots', 'impulses',
      'yerrorbars', 'xerrorbars', 'xyerrorbars', 'steps', 'fsteps',
      'histeps', 'filledcurves', 'boxes', 'boxerrorbars', 'boxxyerrorbars',
      'vectors', 'financebars', 'candlesticks', 'errorlines', 'xerrorlines',
      'yerrorlines', 'xyerrorlines', 'pm3d'

      line 0: undefined function: xtic


      These are the first 4 lines of my stats file:

      column count min max sum mean Q1 med Q3 IQR lW rW A_Count C_Count G_Count T_Count N_Count Max_count
      1 12192845 2 40 446955197 36.66 38 39 39 1 37 40 1290565 4112034 5828477 943901 17868 12192845
      2 12192845 2 40 441775514 36.23 37 39 39 2 34 40 5245648 1771654 1927735 3247808 0 12192845
      3 12192845 2 40 441662699 36.22 37 39 39 2 34 40 2061208 2422450 2202622 5506565 0 12192845


      Any guesses what might be happening with the input format?

      Cheers!!
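
One way to check the input format independently of gnuplot is to parse the stats table directly. The sketch below is only an illustration in Python: it takes the header and the three rows quoted above verbatim, assumes the columns are whitespace-separated, and computes the per-cycle base fractions from the A_Count ... N_Count columns. The function name is made up.

```python
# Header and first three data rows, copied from the post above.
STATS_TEXT = """\
column count min max sum mean Q1 med Q3 IQR lW rW A_Count C_Count G_Count T_Count N_Count Max_count
1 12192845 2 40 446955197 36.66 38 39 39 1 37 40 1290565 4112034 5828477 943901 17868 12192845
2 12192845 2 40 441775514 36.23 37 39 39 2 34 40 5245648 1771654 1927735 3247808 0 12192845
3 12192845 2 40 441662699 36.22 37 39 39 2 34 40 2061208 2422450 2202622 5506565 0 12192845
"""


def nucleotide_fractions(stats_text):
    """Return {cycle: {base: fraction}} from a whitespace-separated stats table."""
    lines = [line for line in stats_text.splitlines() if line.strip()]
    header = lines[0].split()
    fractions = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError("row has %d fields, header has %d" % (len(fields), len(header)))
        row = dict(zip(header, fields))
        counts = {base: int(row[base + "_Count"]) for base in "ACGTN"}
        total = sum(counts.values())
        fractions[int(row["column"])] = {b: c / total for b, c in counts.items()}
    return fractions


for cycle, fr in sorted(nucleotide_fractions(STATS_TEXT).items()):
    print(cycle, " ".join("%s=%.3f" % (b, fr[b]) for b in "ACGTN"))
```

If every row parses with the same number of fields as the header and the base counts add up to the count column, the stats file itself is probably not the problem.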



      • #18
        Hello,

        I am trying to install FASTX on a computer cluster and am running into a common problem: "No package gtextutil-0.1" found. But when I try to export the path to the PKG_CONFIG_PATH environment variable, I get errors such as "PKG_CONFIG_PATH: Undefined" or errors that export is not defined.

        What am I doing wrong? I am trying to configure the fastx_toolkit-0.0.13.2.tar.bz2 and libgtextutils-0.6.1.tar.bz2 downloads.

        PROBLEM SOLVED!! PLEASE ignore post.
        Last edited by htetre; 07-11-2013, 08:47 AM. Reason: problem solved



        • #19
          Newbie: Per Base Quality Graphs

          I am an NGS newbie as well. I ran FastQC on my raw fastq files; some look perfect and some have small dips in the middle. All stay mostly above 28 (box plot). Are both of these per base quality graphs acceptable? What might cause the dips at 17-19? See attached pdf.
          Attached Files



          • #20
            @Bruss: That sample 35d-1 does indicate problems. Dips in quality near the ends, especially the 3' end, are normal. But that dip in quality at around base 18 and again around base 42 is not good. So you are going to have some reads that are not usable. Depending on your next processing steps these reads may or may not give you problems; e.g., some programs will just ignore them, which would be a good thing.
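
If you want to locate such dips without eyeballing the plot, a rough Python sketch is below. It assumes you already have one median quality value per base position; the list used here is made up for illustration and is not taken from the attached PDF. The threshold of 28 is simply the value mentioned in the question, and the function name is hypothetical.

```python
def find_quality_dips(medians, threshold=28):
    """Return the 1-based positions whose median quality is below threshold."""
    return [pos for pos, q in enumerate(medians, start=1) if q < threshold]


# Made-up per-position medians, for illustration only.
medians = [34] * 16 + [27, 25, 27] + [34] * 5
print(find_quality_dips(medians))
```

This only flags positions; whether the affected reads matter still depends on the downstream tools.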



            • #21
              Originally posted by Bruss
              What might cause the dips at 17-19? See attached pdf.
              That *may* be indicative of some transient issue with the sequencer (e.g. small bubble(s) in the lane with your sample around those cycles).



              • #22
                Thanks. I am running TopHat now after clipping off the barcode tags. I'll report back on how it works.
