  • Why trim low quality reads

    Why should we remove low-quality sequences up front when we have the option to filter out low-quality variants from the VCF file at the final stage?
    Thanks,

  • #2
    I would do so:
    - to save time by getting rid of them before the mapping and variant calling steps
    - to reduce noise globally: an erroneous read can get aligned to the wrong position and distort the variant calling process
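
To make the first point concrete, here is a minimal Python sketch (not from the thread) of dropping read pairs before mapping when either mate falls below a mean-quality cutoff. The function names, the cutoff of 20, and the Phred+33 offset are illustrative assumptions; on real data a dedicated trimming tool would normally do this job.

```python
from statistics import mean

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ quality encoding


def mean_quality(quality_string):
    """Mean Phred score of one read's quality string."""
    return mean(ord(ch) - PHRED_OFFSET for ch in quality_string)


def filter_pairs(pairs, min_mean_quality=20):
    """Keep only pairs in which both mates reach the mean-quality cutoff.

    `pairs` is an iterable of ((name, seq, qual), (name, seq, qual)) tuples.
    """
    kept = []
    for mate1, mate2 in pairs:
        if (mean_quality(mate1[2]) >= min_mean_quality
                and mean_quality(mate2[2]) >= min_mean_quality):
            kept.append((mate1, mate2))
    return kept


# Toy data: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
pairs = [
    (("r1/1", "ACGT", "IIII"), ("r1/2", "TGCA", "IIII")),
    (("r2/1", "ACGT", "IIII"), ("r2/2", "TGCA", "####")),
]
kept = filter_pairs(pairs)
print(len(kept), "of", len(pairs), "pairs kept")
```

Dropping both mates together keeps the two FASTQ files in sync, which paired-end aligners generally expect.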



    • #3
      Would it also significantly cut down processing time?



      • #4
        Originally posted by Philcolson:
        "Would it also significantly cut down processing time?"
        This is what I meant by my first point: you reduce the amount of data to process (for instance, you don't have to align those reads).



        • #5
          Thanks Syfo,

          Can you tell me the best tool to trim low quality sequences of Illumina paired end data?
          Thanks,



          • #6
            Hi,

            As syfo mentioned, trimming reads based on quality can be beneficial. However, with Illumina reads you would generally expect very good quality (except at the end of the read), so you might not need a separate read trimming step in your analysis.

            One thing you could do anyway is trim reads while aligning them. If you plan on using BWA for alignment, you can use the bwa aln "-q" parameter. This can easily be done for paired-end data, as explained in the BWA manual.

            To understand what the "q" parameter does, you could read the following post:

            Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


            Hopefully, this way you can skip the separate read trimming phase, which should also address the concern about improper read mapping and the resulting false positives in variant calls.

            I hope this helps.

            Praful
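
As a rough illustration of the kind of 3'-end trimming that the "-q" parameter performs, here is a small Python sketch (not from the thread). It follows my understanding of the rule: walk from the 3' end, accumulate (threshold - quality), and cut where that running sum peaks. The function name is hypothetical, and the 35-base minimum length is an assumption based on my reading of the BWA source, so check both against the BWA manual before relying on them.

```python
def bwa_style_trim_length(qualities, threshold, min_length=35):
    """Return the read length to keep after 3'-end quality trimming.

    Walk from the 3' end toward the 5' end, accumulating
    (threshold - quality). Keep the position where the running sum
    peaks, and stop as soon as the sum turns negative.
    """
    keep = len(qualities)
    if threshold < 1:
        return keep  # trimming disabled
    running = 0
    best = 0
    # Never trim the read below min_length bases (assumed floor).
    for pos in range(len(qualities) - 1, min_length - 1, -1):
        running += threshold - qualities[pos]
        if running < 0:
            break
        if running > best:
            best = running
            keep = pos
    return keep


good_read = [35] * 100                       # high quality throughout
bad_tail = [35] * 50 + [2] * 50              # quality collapses at base 50
bad_middle = [35] * 40 + [2] * 20 + [35] * 40  # low quality only inside

for name, quals in [("good_read", good_read),
                    ("bad_tail", bad_tail),
                    ("bad_middle", bad_middle)]:
    print(name, "->", bwa_style_trim_length(quals, threshold=15))
```

With a threshold of 15, the sketch keeps all 100 bases of the good read, cuts the bad-tail read to 50 bases, and leaves the bad-middle read at 100 bases. The last case shows that this style of trimming only removes a low-quality 3' tail; low-quality bases inside the read are still aligned.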



            • #7
              Thanks aggp11,

              Actually, I have a few Illumina paired-end samples whose quality drops to almost zero from the middle of the read (around base 50). Does the -q 15 option of bwa keep those sequences from aligning?

              Regards,
              Thanks,



              • #8
                I am not sure, but I think it should.

                That said, I would be a little worried about samples whose read quality tanks midway, and I would spend a little more time troubleshooting what might have gone wrong to produce such a scenario.

                Did you happen to use FastQC to see how the sequencing runs look? If so, can you share the "per base quality" graph for any of the samples where the read quality goes down?

                Praful



                • #9
                  Yes, please go through these graphs:
                  [Attached images: per_base_quality.png, per_base_quality1.png, per_base_quality2.png, per_base_quality3.png]
                  Thanks,



                  • #10
                    Hi,

                    These don't look good at all. I don't know what might have caused this, but before thinking about trimming reads, I think you should talk to someone at Illumina and to the people who did the sequencing to try to figure out what might have gone wrong. You could also post this on SEQanswers as a separate post and see if people have any suggestions.



                    • #11
                      These are the first Illumina runs sequenced by people in our own department, and they don't know the cause. But from a bioinformatics point of view, should I trim them or just discard them?
                      Thanks,



                      • #12
                        In that case, I would suggest you just align them with, say, -q 15 and see what the aligner gives you.



                        • #13
                          Hi members,

                          Following up on the original title, I have a question about mapping/alignment that also relates to trimming.

                          During mapping we can specify a Phred score threshold. Am I correct that nucleotides with a low Phred score will therefore not be mapped back to the reference sequence? If so, we would not need to trim the reads (assuming the low-quality nucleotides are right at the end or beginning of a read, which makes them easier to trim).

                          I think this is not true, because I can see low-quality nucleotides in IGV. So my follow-up question is: what is the use of specifying the Phred quality score during mapping if the low-quality nucleotides are mapped as well?



                          Many thanks
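
As background for the thresholds discussed in this thread (not from the thread itself): a Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10). The short Python sketch below applies that standard definition; the function name is only illustrative.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to an estimated error probability."""
    return 10 ** (-q / 10)


for q in (2, 15, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: estimated error probability {p:.4f}")
```

By this definition, a threshold such as 15 corresponds to roughly a 3% chance that a base call is wrong.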
