  • 100% reads passed chastity filtering?

    Hi, folks.

    I'm doing my first QC analyses, and when I filtered out reads that failed the chastity filter, the program reported that 100% of the reads passed (I used fastq_illumina_filter). I did not do the sequencing myself (mammal, Illumina HiSeq, paired end), but I'm almost sure the data was not processed before it landed in my lap. Could this be a true result, or am I doing something wrong?

    "Processed 12,162,651 reads
    fastq_illumina_filter (--keep N) statistics:
    Input: 12,162,651 reads
    Output: 12,162,651 reads (100%)
    Processed 12,162,651 reads
    fastq_illumina_filter (--keep N) statistics:
    Input: 12,162,651 reads
    Output: 12,162,651 reads (100%)"

    Thanks in advance for the help.

  • #2
    I'm not familiar with fastq_illumina_filter, but I think that only the reads that pass Illumina's filters are written to the fastq files sent to end users. So unless you are a service provider working directly with the sequencer, you probably won't see any reads that failed the purity and chastity filters.

    If the reads haven't been processed further, you can expect them to all be the same length, and some will contain N bases or adapter sequences. You can use FastQC to look at base qualities, etc.
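
    A quick way to check both points is to tally the filter flag in the read headers along with read lengths and N content. The following is only a minimal sketch, assuming Casava 1.8+ style headers (`@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`, where the `filtered` field is `Y` for reads that failed the filter and `N` otherwise) and plain 4-line FASTQ records. The function name `summarize_fastq` and the file name in the usage note are hypothetical, and this is not how fastq_illumina_filter itself is implemented.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FastqSummary(NamedTuple):
    filter_flags: Counter   # counts of "Y", "N", or "missing"
    read_lengths: Counter   # counts per sequence length
    reads_with_n: int       # reads containing at least one N base


def summarize_fastq(lines: Iterable[str]) -> FastqSummary:
    """Tally filter flags, read lengths and N-containing reads.

    Assumes 4-line FASTQ records (no wrapped sequences) and, for the
    flag, Casava 1.8+ headers. Headers without a recognizable flag are
    counted as "missing".
    """
    flags, lengths = Counter(), Counter()
    reads_with_n = 0
    for i, line in enumerate(lines):
        line = line.rstrip("\r\n")
        if i % 4 == 0:  # header line
            parts = line.split(" ", 1)
            fields = parts[1].split(":") if len(parts) == 2 else []
            if len(fields) >= 2 and fields[1] in ("Y", "N"):
                flags[fields[1]] += 1
            else:
                flags["missing"] += 1
        elif i % 4 == 1:  # sequence line
            lengths[len(line)] += 1
            if "N" in line.upper():
                reads_with_n += 1
    return FastqSummary(flags, lengths, reads_with_n)
```

    For a gzipped file this could be called as `summarize_fastq(gzip.open("reads_R1.fastq.gz", "rt"))`. If every header carries `N`, the failed reads were most likely removed before the files were delivered, and a 100% pass rate is expected.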
    Last edited by mastal; 04-26-2016, 08:30 AM.

    • #3
      Use FastQC for the initial QC of your data. It is much more intuitive. Don't be alarmed by a red "X" on the report, since some failed modules simply reflect the type of experiment you are doing.

      Take a look at this site for commentaries on possible results from FastQC.

      • #4
        Thank you for your answers!

        I did run a FastQC analysis, and it returned fair results. I was trying to filter the reads to improve the base quality scores (image below), following the recommendations of Zhou and Rokas (2014). But since the result may well be correct, I guess I'll move on to the next step and start trimming the bad reads and the 3' tail.
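
        For readers unfamiliar with what trimming the 3' tail involves, here is a minimal sketch of the simplest form of it: walk back from the 3' end and drop bases until one meets a quality cutoff. It assumes Phred+33 quality encoding; the function name `trim_3prime` and the default cutoff of 20 are illustrative choices, not taken from this thread or from any particular trimmer. Real tools generally use more robust schemes, such as sliding windows.

```python
from typing import Tuple


def trim_3prime(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Remove trailing bases whose Phred quality is below min_q.

    Assumes Phred+offset ASCII encoding (offset 33 for Sanger and
    Illumina 1.8+). Trimming stops at the first base, counting from
    the 3' end, that meets the cutoff.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]
```

        For example, `trim_3prime("ACGTACGT", "IIIII###")` keeps the first five bases, because `#` encodes Q2 and `I` encodes Q40 under Phred+33. A read made up entirely of low-quality bases comes back empty, so a minimum-length filter is usually applied afterwards.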

        • #5
          I can't comment on the 100% pass rate, but as you mentioned, the reads need 3' end trimming. Please post your QC plots after trimming, if you do not mind.
          Last edited by cpad0112; 04-27-2016, 07:48 AM. Reason: grammar

          • #6
            Originally posted by cpad0112:
            I can't comment on 100% quality reads, but as you mentioned reads need 3' end trimming. Please post your qc plots after trimming, if you do not mind.
            Hi!

            I finally managed to trim my sequences. The per-base quality increased as expected, although I'm still getting a strange pattern in the k-mer content. I'll post the results below (in order: original, adapter trim and quality trim for k-mer content, then adapter trim and quality trim for base quality).


            [Attached images: per_base_quality1.png, per_base_quality2.png, kmer_profiles0.png, kmer_profiles1.png, kmer_profiles2.png]
            Last edited by savernake; 04-29-2016, 09:28 AM.

            • #7
              Generally there is no need to worry about k-mer composition. If you are happy with the trimming results, move forward with the rest of the analysis.
