  • Illumina solexa 75bp format problem

    Every read in my data ends with 22 Ns, and I don't know why. Can anyone explain?

    @HWI-EAS241:5:1:10:83#0/1
    GCCCCGTCCATCACTTCTGCGATGCCGCGAATGCCCAATGGCAAGCCGNCGGGNNNNNNNNNNNNNNNNNNNNNN
    +HWI-EAS241:5:1:10:83#0/1
    [a``_`X_O\Q\YQ[Z\O[a\WXNXZZBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    @HWI-EAS241:5:1:10:1808#0/1
    TGCTGCGGCCCAATGGAGCCACGTTGCCCTGGTGCTTGCCCTTGGGATNGTGGNNNNNNNNNNNNNNNNNNNNNN
    +HWI-EAS241:5:1:10:1808#0/1
    [aaaaaaa\UX_aaa\U__`a`a`a_^Ua``P\a_aa_\TWa`BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    @HWI-EAS241:5:1:10:1866#0/1
    TGGCCGCCTGCGTCACGCCGATTGTCAGCGCCGTGGGCCATGAAACCGNCGTGNNNNNNNNNNNNNNNNNNNNNN

  • #2
    There is something else funny in those records: the spaces in both the sequences and the quality strings. Are those spaces real, or some cut & paste corruption, or a quirk of the forum editor?
    Last edited by maubp; 08-28-2009, 05:11 AM. Reason: fixed typo


    • #3
      When you say "every read" do you literally mean EVERY read? Is it the entire flow cell, one lane, part of lane? Did anything happen to the instrument between cycles 53-54, such as reagents being refilled or software restarted?
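One way to check whether literally every read is affected is to tally the length of the trailing run of Ns across the file. The following is only a minimal sketch in plain Python, assuming unwrapped four-line FASTQ records; the function name `trailing_n_counts` and the demo reads are made up for illustration.

```python
from collections import Counter
from itertools import islice


def trailing_n_counts(lines):
    """Tally how many reads end in a run of 0, 1, 2, ... Ns.

    Assumes plain four-line FASTQ records (sequences not wrapped).
    """
    counts = Counter()
    # The sequence is the second line of every four-line record.
    for seq in islice(lines, 1, None, 4):
        seq = seq.strip()
        counts[len(seq) - len(seq.rstrip("N"))] += 1
    return counts


# Made-up reads shaped like the ones posted above: 9 bases, then 22 Ns.
demo = [
    "@read1", "ACGTNACGG" + "N" * 22, "+read1", "a" * 9 + "B" * 22,
    "@read2", "ACGTACGTA" + "N" * 22, "+read2", "a" * 9 + "B" * 22,
]
for run_length, n_reads in sorted(trailing_n_counts(demo).items()):
    print(run_length, n_reads)
```

An open file handle can be passed in place of the list. If a single run length accounts for all reads, the problem affects the whole file; a spread of lengths would point to something more localized.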


      • #4
        Yes, every read. I am using Velvet to assemble the genome, and I don't know whether the Ns will affect the assembly. Should I remove them first?


        • #5
          The only time an N gets put into the sequence is when the base caller cannot match a cluster in the current tile. Typically this happens at the edge of the image, when clusters "wander" on and off it. Since your read quality went kaput in the last 20-odd bases, I would guess one of the reagents (most likely the incorporation mix) ran out or was bad, and you got no cluster illumination.

          You do need to trim the Ns out before you put the sequences into Velvet. The easiest way is probably to rerun GERALD with the USE_BASES parameter set to Y52n*.

          Edit: although it does occur to me the N followed by 4 called bases **might** indicate a laser issue -- highly unlikely, in my opinion, but you might want to discuss it with your FAS.
          Last edited by dcjamison; 08-31-2009, 06:53 AM.
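If rerunning the pipeline is not an option, the trailing Ns can also be stripped from the FASTQ text directly. This is a rough sketch only, assuming unwrapped four-line records; `trim_trailing_ns` and `trim_fastq` are hypothetical helper names, not part of any pipeline or library.

```python
def trim_trailing_ns(seq, qual):
    """Drop the run of Ns at the 3' end and the matching quality characters."""
    kept = len(seq.rstrip("N"))
    return seq[:kept], qual[:kept]


def trim_fastq(lines):
    """Yield the lines of each four-line FASTQ record with trailing Ns removed.

    Assumes unwrapped records; an incomplete final record is silently dropped.
    """
    stripped = (line.rstrip("\n") for line in lines)
    # Zipping one iterator with itself four times groups lines into records.
    for title, seq, plus, qual in zip(stripped, stripped, stripped, stripped):
        seq, qual = trim_trailing_ns(seq, qual)
        yield from (title, seq, plus, qual)


# Made-up record: an isolated N, four called bases, then a run of Ns.
record = ["@read1", "ACGTNACGG" + "N" * 5, "+read1", "a" * 9 + "B" * 5]
print("\n".join(trim_fastq(record)))
```

Only the run at the end of the read is removed; the isolated N earlier in the read stays, and a read made up entirely of Ns comes out empty, so such reads may need filtering afterwards.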


          • #6
            If you want to just edit the FASTQ file, here is a tiny Biopython script to do this for you (take just the first 52 bases of each read):

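            Code:
            from Bio import SeqIO

            # Keep only the first 52 bases (and their qualities) of each read.
            in_handle = open("original.fastq")
            out_handle = open("trimmed.fastq", "w")
            trimmed = (rec[:52] for rec in SeqIO.parse(in_handle, "fastq"))
            SeqIO.write(trimmed, out_handle, "fastq")
            out_handle.close()
            in_handle.close()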
            That should work on Biopython 1.51 or later (and probably 1.50 from memory).
