  • akashrestha
    Junior Member
    • Sep 2013
    • 7

    Overrepresented sequences in Genomic DNA sequence data from Illumina

    Good morning everyone,
    I am new to whole genome sequencing analysis; if there is already a thread on this type of problem, I would be grateful if you could point me to it. I am currently working on a comparative analysis of plant genome (DNA) sequences. We received paired-end sequence data from Illumina, checked the quality with FastQC, and found > 0.20% overrepresented sequences (from TruSeq adapters). I have some questions about those overrepresented sequences.
    1) Do I need to remove those overrepresented sequences from the raw genomic DNA reads before proceeding to downstream analysis?
    2) If I remove them, there may be an unequal number of reads in the paired files (R1 and R2). Removing the unpaired reads would then discard a large chunk of single reads from the R1 and R2 files. Is there any way to use those single reads from both files in downstream analysis, for instance in mapping to a reference genome and annotation?

    Thank you in advance.
    akashrestha
  • mastal
    Senior Member
    • Mar 2009
    • 666

    #2
    0.2% is not a lot.

    Whether you remove adapters depends on what you are going to do with your data. It is more important if, say, you don't have a reference genome and you're going to do de novo assembly.

    Depending on how many reads/what level of coverage you have, you can leave out reads that remain unpaired after trimming. Some software may be able to use both the paired and unpaired reads (in separate files).

    I like to use Trimmomatic for removing adapters, but there are other programs.
    Trimmomatic will separate your trimmed reads into paired and unpaired.
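
To illustrate what "paired" and "unpaired" output means, here is a minimal Python sketch. It is not Trimmomatic's implementation; the function name `split_paired_unpaired` and the toy read IDs are hypothetical. It assumes the trimmed reads are held in dicts keyed by read ID, and that a read discarded during trimming is simply absent from its dict.

```python
def split_paired_unpaired(r1_reads, r2_reads):
    """Split trimmed reads into still-paired reads and orphans.

    Each argument maps a read ID (without any /1 or /2 suffix) to its
    trimmed sequence. A read dropped by trimming is absent from its dict.
    """
    # IDs present in both files, kept in R1 order so the outputs stay in sync.
    shared = [rid for rid in r1_reads if rid in r2_reads]
    paired_1 = [(rid, r1_reads[rid]) for rid in shared]
    paired_2 = [(rid, r2_reads[rid]) for rid in shared]
    # Reads whose mate did not survive trimming.
    unpaired_1 = [(rid, seq) for rid, seq in r1_reads.items() if rid not in r2_reads]
    unpaired_2 = [(rid, seq) for rid, seq in r2_reads.items() if rid not in r1_reads]
    return paired_1, paired_2, unpaired_1, unpaired_2


# Toy data: read2 lost its R2 mate, read4 lost its R1 mate.
r1 = {"read1": "ACGTACGT", "read2": "GGGTTTAA", "read3": "TTTTCCCC"}
r2 = {"read1": "ACGTACGA", "read3": "GGGGAAAA", "read4": "CCCCAAAA"}

p1, p2, u1, u2 = split_paired_unpaired(r1, r2)
print("paired IDs:", [rid for rid, _ in p1])
print("unpaired R1:", u1)
print("unpaired R2:", u2)
```

The two paired lists contain the same IDs in the same order, which is what paired-end mappers generally expect from the R1 and R2 files.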


    • akashrestha
      Junior Member
      • Sep 2013
      • 7

      #3
      Thank you mastal for your reply,

      I am going to do a comparative analysis of the sequences to identify structural variations, indels and SNPs.

      You mentioned that some software can use paired and unpaired files separately. Could you please provide a link to that software?

      Thanks.


      • mastal
        Senior Member
        • Mar 2009
        • 666

        #4
        I was thinking of Velvet, for de novo assembly.

        Other software will have their own particular requirements.


        • akashrestha
          Junior Member
          • Sep 2013
          • 7

          #5
          I am going to align to a reference genome rather than assemble with Velvet. So, is there any software that can use unpaired reads in addition to paired reads when mapping to a reference genome?


          • blancha
            Senior Member
            • May 2013
            • 367

            #6
            There is a whole thread on the subject of aligning paired and unpaired reads together with BWA on Biostars.


            The gist is that you are making your life unnecessarily complicated.
            Just trim with Trimmomatic, and align the remaining paired reads.

            If you absolutely want to align the few unpaired reads remaining after trimming, you can do so following the instructions in the thread posted above. The benefit is dubious, however.


            • Brian Bushnell
              Super Moderator
              • Jan 2014
              • 2709

              #7
              Most mapping programs work with either paired or unpaired reads. With BBMap, for example, you would run the program twice (once for paired reads, once for unpaired reads) and merge the resulting mapped output.

              However, there is no reason to have singletons left over after adapter-trimming. Adapter-trimming paired reads should yield paired reads of the same length, since if read 1 has adapter at position X, read 2 will also have adapter at position X. If you use BBDuk for trimming as at the top of this thread, you will not end up with any singletons.
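
The point about adapters appearing at the same position in both reads can be checked with a small simulation. This is only a sketch under simplifying assumptions (no sequencing errors, an insert shorter than the read length); the helper names and the insert are made up, and the adapter strings are illustrative placeholders rather than vetted adapter sequences. It is not how BBDuk works internally.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_pair(insert, adapter_r1, adapter_r2, read_len):
    """Simulate error-free paired reads from one fragment.

    Read 1 runs through the insert and into its adapter; read 2 runs through
    the reverse complement of the insert and into the other adapter.
    """
    read1 = (insert + adapter_r1)[:read_len]
    read2 = (revcomp(insert) + adapter_r2)[:read_len]
    return read1, read2


insert = "ACGTTGCAAGGCTTAACCGG"        # hypothetical 20 bp fragment
adapter_r1 = "AGATCGGAAGAGCACACGTC"    # placeholder adapter sequence
adapter_r2 = "AGATCGGAAGAGCGTCGTGT"    # placeholder adapter sequence

read1, read2 = simulate_pair(insert, adapter_r1, adapter_r2, read_len=30)

# Locate the start of the adapter in each read.
pos1 = read1.find(adapter_r1[:10])
pos2 = read2.find(adapter_r2[:10])
print(pos1, pos2, len(insert))

# Trimming both reads at that position leaves mates of equal length.
trimmed1, trimmed2 = read1[:pos1], read2[:pos2]
print(len(trimmed1), len(trimmed2))
```

In this idealized case both adapters start at the insert length, so trimming shortens both mates equally and neither read is lost. Real data with sequencing errors may not behave this cleanly.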

