  • Repetitive reads

    Hi,

    I just got data back from a Solexa run of a ChIP for a histone modification occurring in euchromatic regions of the genome. My tags, though, look like this:

    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 200126
    CACACACACACACACACACACACACACACACACACA 9848
    CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 3245
    TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG 3211
    ACACACACACACACACACACACACACACACACACAC 2684
    GAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGA 1390
    AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG 1226
    GGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAG 1105
    TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTT 905
    CACACACACACACACACACACACACACACACACACC 591
    GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAA 582
    CCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC 513
    GGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTA 446
    TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTTT 406
    CTAACCCTAACCCTAACCCTAACCCTAACCCTAACC 386
    CACACACACACACACACACACACACACACACACACT 373
    CACACACACACACACACACACACACACACACACAGA 322
    GGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG 312
    CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA 299
    TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTTG 294
    GACAGACAGACAGACAGACAGACAGACAGACAGACA 290
    CACACACACACACACACACACACACACACACACAAA 290
    GGGGCAGAAGCTGCCTGAAAGGTGCTTGAGCAACGT 285
    TACACACACACACACACACACACACACACACACACA 268

    And it goes on like that. The majority of my reads are, or almost are, straight runs of a single base or of a dinucleotide. Only 5.5% of my reads mapped to the genome at all, and those that did are not where I biologically expect them to be. BLATing a lot of them shows that they are usually in repetitive elements.

    Chromatin was fragmented by digestion to mononucleosomes, but the Bioanalyzer trace shows a large peak at ~300 bp, which is larger than the expected peak at ~180 bp.

    So I'm trying to figure out what went wrong with the run, so this can be avoided in the future. It looks like something's contaminating my sample, but BLASTing these reads doesn't show any obvious answers. Thoughts?
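
One way to put a number on the pattern described above (reads that are exact or near-exact runs of a single base or a short repeat unit) is to look for the smallest repeat period of each tag. The sketch below is only an illustration: the function names `smallest_period` and `tally_by_period`, the period limit of 6 and the tolerance of 2 mismatches are assumptions made here, not something used by the poster. The example tags and counts are copied from the listing above.

```python
from collections import Counter


def smallest_period(seq, max_period=6, max_mismatches=2):
    """Return the smallest p (1..max_period) such that seq repeats with
    period p, allowing up to max_mismatches positions where
    seq[i] != seq[i - p]. Return None if no such period exists."""
    for p in range(1, max_period + 1):
        mismatches = sum(1 for i in range(p, len(seq)) if seq[i] != seq[i - p])
        if mismatches <= max_mismatches:
            return p
    return None


def tally_by_period(tag_counts, **kwargs):
    """Sum read counts per repeat period; the key None collects tags that
    do not look like a short tandem repeat."""
    totals = Counter()
    for seq, count in tag_counts:
        totals[smallest_period(seq, **kwargs)] += count
    return totals


# A few (tag, count) pairs taken from the listing in the post.
tags = [
    ("A" * 36, 200126),                              # homopolymer
    ("CA" * 18, 9848),                               # dinucleotide repeat
    ("GGTTAG" * 6, 1105),                            # 6-base repeat unit
    ("TG" * 17 + "TT", 905),                         # dinucleotide, 1 mismatch
    ("GGGGCAGAAGCTGCCTGAAAGGTGCTTGAGCAACGT", 285),   # no short period
]
totals = tally_by_period(tags)
for period, reads in totals.items():
    print(period, reads)
```

Run over the full tag table, a tally like this shows how much of the lane is low-complexity sequence before any alignment is attempted. The thresholds are arbitrary and would need tuning for other read lengths.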

  • #2
    It appears there are one or more problems with the sequencing quality. I believe that with Solexa, "A"s will be called when the flow cell is over-illuminated or too oily and so on (user ScottC on this forum can probably explain this better!). That is why you have a large proportion of poly-A reads. You probably expect these in some proportion, but too many of them suggests a quality control issue.

    The repetitive dimers look very suspicious too. I don't have any explanations for those, though.


    • #3
      We saw similar behavior in a recent ChIP-seq run. Less than 10% of reads aligned to the actual reference, while 1% or so aligned to human DNA, 1% or so formed contigs that align to some bacteria, etc., and 1% aligned to adapter/primer sequence, but we have no clue about the rest!

      Using 25 bp reads for the ChIP-seq data sounded more reasonable in our case, but a huge number of reads are still not accounted for.
      --
      bioinfosm
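
The adapter/primer category mentioned in this post can be estimated without an aligner by scanning each read for the start of the adapter. The sketch below is a rough illustration under stated assumptions: the `ADAPTER` string is simply the tag `GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAA` from the opening post with its trailing `AAA` removed, and is only assumed to be adapter sequence. The function name `adapter_offset` and its thresholds are hypothetical.

```python
# Assumption: this sequence is adapter. It is the tag from the opening post
# minus its trailing "AAA"; check it against your own library prep kit.
ADAPTER = "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG"


def adapter_offset(read, adapter=ADAPTER, min_overlap=12, max_mismatches=1):
    """Return the first offset in `read` at which the adapter begins,
    comparing the rest of the read against the start of the adapter and
    tolerating up to max_mismatches. Return None if nothing matches.
    At least min_overlap bases of the read must overlap the adapter."""
    for start in range(0, len(read) - min_overlap + 1):
        window = read[start:start + len(adapter)]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= max_mismatches:
            return start
    return None


# The tag from the opening post starts with the assumed adapter.
print(adapter_offset("GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAA"))
# A short insert followed by adapter read-through.
print(adapter_offset("ACGTACGTAC" + ADAPTER[:26]))
# A poly-A read contains no adapter-like stretch.
print(adapter_offset("A" * 36))
```

An offset of 0 suggests an adapter dimer, while a larger offset suggests an insert shorter than the read length. A dedicated trimming tool would handle indels and quality scores, which this sketch ignores.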


      • #4
        poly A artefact near the edge of a tile

        I came across this info in the release note of Maq 0.6.4.: "It is important to note that Illumina/Solexa sequencing may produce many false polyA at the edges of a tile. These polyA artefacts may greatly increase the running time of maq. Users are advised to remove these artefacts with their own scripts before alignment. For the moment maq does not provide a general functionality for filtering polyA."

        Does anyone know about the source of this artefact?
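
The Maq note quoted above leaves poly-A removal to the user's own scripts. A minimal sketch of such a filter follows. It assumes the reads are in plain 4-line FASTQ, which may not match the output of a given pipeline version, and the names `is_poly_a` and `filter_fastq` and the 90% cutoff are illustrative choices, not part of Maq.

```python
import io


def is_poly_a(seq, min_fraction=0.9):
    """True if at least min_fraction of the bases in seq are A."""
    return bool(seq) and seq.upper().count("A") / len(seq) >= min_fraction


def filter_fastq(lines, min_fraction=0.9):
    """Yield (header, seq, plus, qual) for every 4-line FASTQ record whose
    sequence is not poly-A. `lines` is any iterable of text lines."""
    it = iter(lines)
    # zip over four references to the same iterator reads 4 lines at a time.
    for header, seq, plus, qual in zip(it, it, it, it):
        if not is_poly_a(seq.strip(), min_fraction):
            yield header, seq, plus, qual


# Small in-memory demo: one poly-A read and one ordinary read.
demo = io.StringIO(
    "@r1\n" + "A" * 36 + "\n+\n" + "I" * 36 + "\n"
    "@r2\nGGGGCAGAAGCTGCCTGAAAGGTGCTTGAGCAACGT\n+\n" + "I" * 36 + "\n"
)
kept = [record[0].strip() for record in filter_fastq(demo)]
print(kept)
```

For real data, `lines` would be an open file handle and the kept records would be written to a new file before running the aligner.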


        • #5
          Does SOLiD sequencing generate these artificial repetitive reads near the edge of a slide? We observed similar behavior in our SOLiD ChIP-seq project. We have no clue what went wrong.


          • #6
            Hi jlli, I don't know the reason for SOLiD, but I did find an explanation for Solexa given by SillyPoint in the following thread. Maybe you have similar problems.
            Bridged amplification & clustering followed by sequencing by synthesis. (Genome Analyzer / HiSeq / MiSeq)


            • #7
              Assuming you had a control lane on the flow cell (FC), how did it look? If you saw the same poly-A problem there, this would argue for the oil theory. Were one or more libraries constructed for the original sample? If the same result appeared in more than one library, this would eliminate amplification bias.
