  • adapter trimming - help

    Hello all,

    I am a newbie to NGS analysis. I recently received raw sequencing data (Illumina) for yeast.
    The read length is 104 bp. I have average knowledge of R and used the ShortRead package for initial processing of the reads. However, I have some questions about adapter trimming. I searched many threads here and learned about a lot of trimming tools. According to many, Trimmomatic is the best, and I want to use it on my data. But this tool requires a FASTA file (the adapter FASTA) containing the adapter sequences. Now,

    1. How do I find out the adapter sequence?
    2. My data is paired-end; how do I proceed in that case?

    How do I create this file?

    I know my questions are too lame/simple for this forum and I am extremely sorry for such noob questions.

    Help me.

    Thank you.

  • #2
    In the best case scenario, the people doing the sequencing experiment will inform you about the sequence of the adapters used during the experiment. However this is not always the case. What you can do if you do not have any information about the adapter sequence is to run a program like FastQC (http://www.bioinformatics.babraham.a...ojects/fastqc/) which will search your fastq files for a number of commonly used adapters. If your sequencing experiment was performed according to some standard protocol then the adapter sequence might very well be included in FastQC's list and if there is substantial adapter contamination in your data then this will be seen in the program report.
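If a candidate adapter sequence is available, a quick sanity check is to count how many reads contain it. The following is a minimal Python sketch of that idea, not what FastQC does internally. The candidate `AGATCGGAAGAGC` is the start of a widely used Illumina adapter, but whether it applies to this particular library is an assumption; the function names and the demo records are made up for illustration.

```python
from typing import Iterable, Iterator


def read_fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of every record holds the bases
            yield line.strip()


def adapter_fraction(sequences: Iterable[str], adapter: str) -> float:
    """Return the fraction of reads containing `adapter` (exact match only)."""
    total = 0
    hits = 0
    for seq in sequences:
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Assumed candidate; replace it with the sequence from the sequencing provider.
    candidate = "AGATCGGAAGAGC"
    # Two made-up FASTQ records, for illustration only.
    demo = [
        "@r1", "ACGTACGTAGATCGGAAGAGCTT", "+", "IIIIIIIIIIIIIIIIIIIIIII",
        "@r2", "ACGTACGTACGTACGTACGTACG", "+", "IIIIIIIIIIIIIIIIIIIIIII",
    ]
    print(adapter_fraction(read_fastq_sequences(demo), candidate))  # prints 0.5
```

For a real file, an open file handle can be passed to `read_fastq_sequences` in place of the list. Exact matching ignores sequencing errors and adapters that are cut off at the end of a read, so a low fraction here does not prove the data are adapter-free.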


    • #3
      Hello Sir,

      Thank you for the reply. I randomly extracted 1000 reads (FASTQ) and ran FastQC on them.
      FastQC reported the k-mer content as failed and listed more than 200 5-bp sequences.
      What do I do now? Do I have to consider all of them as adapter sequences?

      My data also came with this file, barcodes.txt:

      Control_1_1 CACTGT
      Control_1_2 ATTCCG
      Control_1_3 GCTACC
      Control_1_4 CGAAAC
      Mutant_3_1 GATGCT
      Mutant_3_2 AGCTAG
      Mutant_3_3 GGCCAC
      Mutant_3_4 ATTATA

      Are these adapters?

      Again, sorry for the noob question!

      Thank you.


      • #4
        You should probably use many more than 1000 reads for FastQC. The program is pretty fast, so I would use the whole FASTQ files if they are not enormous. In the FastQC output, look at the "Overrepresented sequences" section rather than the k-mer content, which is harder to interpret. If there is adapter contamination, the adapter sequences should probably show up there, and if a sequence is present in FastQC's list of common adapters, its identity will be listed in the "Possible Source" column.

        And these barcode sequences are most likely the sample barcodes used for multiplexing (putting multiple samples into the same sequencing lane), and are not adapter sequences.
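The idea behind an overrepresented-sequences check can be sketched in a few lines of Python: tally identical read prefixes and report the most frequent ones with their share of all reads. This is a rough illustration only, not FastQC's actual algorithm; the function name `top_sequences` and the default prefix length of 50 are assumptions chosen for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_sequences(sequences: Iterable[str], prefix_len: int = 50,
                  n: int = 5) -> List[Tuple[str, float]]:
    """Return the n most common read prefixes with their share of all reads."""
    # Truncate each read so that reads differing only near the 3' end are pooled.
    counts = Counter(seq[:prefix_len] for seq in sequences)
    total = sum(counts.values())
    return [(seq, count / total) for seq, count in counts.most_common(n)]


if __name__ == "__main__":
    # Made-up reads, for illustration only.
    reads = ["AAAA", "AAAA", "CCCC", "AAAA"]
    print(top_sequences(reads, prefix_len=4, n=1))  # prints [('AAAA', 0.75)]
```

A prefix that accounts for a noticeable share of the reads is worth comparing against known adapter sequences.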


        • #5
          I ran FastQC on a larger FASTQ file (>1M reads), and it reported that there are no overrepresented sequences.
          Is it all right to continue with the alignment to a reference genome now?

          Thank you.


          • #6
            Originally posted by gaffa:
            You should probably use many more than 1000 reads for FastQC. The program is pretty fast, so I would use the whole FASTQ files if they are not enormous. In the FastQC output, look at the "Overrepresented sequences" section rather than the k-mer content, which is harder to interpret. If there is adapter contamination, the adapter sequences should probably show up there, and if a sequence is present in FastQC's list of common adapters, its identity will be listed in the "Possible Source" column.

            And these barcode sequences are most likely the sample barcodes used for multiplexing (putting multiple samples into the same sequencing lane), and are not adapter sequences.

            Won't that just look for adapter dimers rather than adapter read-through?
            You will need to ask whoever prepared the libraries for the adapter sequence.
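The distinction matters because read-through puts the adapter at a different position in each read, depending on the insert length, so no single read sequence becomes overrepresented. A minimal Python sketch of how read-through could be located once the adapter is known is shown here. It uses exact matching only, and the function name, the `min_overlap` default, and the example adapter `AGATCGGAAGAGC` are assumptions for illustration; real trimmers such as Trimmomatic tolerate mismatches.

```python
def adapter_start(read: str, adapter: str, min_overlap: int = 3) -> int:
    """Return the index in `read` where the adapter begins, or -1 if none is found.

    A full adapter copy anywhere in the read is reported first. Otherwise the
    3' end of the read is checked for a truncated adapter, i.e. a read suffix
    that equals an adapter prefix of at least `min_overlap` bases.
    """
    full = read.find(adapter)
    if full != -1:
        return full
    # Try the longest possible partial overlap first, then shorter ones.
    longest = min(len(read), len(adapter) - 1)
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return len(read) - k
    return -1


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"  # assumed adapter, for illustration only
    print(adapter_start("ACGTAGATCGGAAGAGCTT", adapter))  # prints 4
    print(adapter_start("ACGTACGTAGATCG", adapter))       # prints 8
    print(adapter_start("ACGTACGT", adapter))             # prints -1
```

Very short overlaps occur by chance, so a small `min_overlap` will flag some reads that contain no adapter at all.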


            • #7
              Hello sir,

              Apparently our samples were "outsourced" for sequencing, and the provider has not given me the adapter sequences. I have emailed them about it. Again, thank you very much for the suggestions.

              I must say Seqanswers - hats off to you!! I am learning so much from this forum.
