  • JimC
    Member
    • Nov 2008
    • 10

    Mosaik Aligning with Solexa Reads

    I've used the Mosaik tools quite successfully in the past, but I've run into an issue that I'd like to ask for help with.

    I have _sequence.txt files that were sent to me from another lab for analysis, so I don't have access to the raw data or QC data. The reads are from an enriched library (Nimblegen, I believe; exons from a region of interest).

    For a given lane of Solexa data (36 nt, not paired-end), I have 11.6 M reads. I use MosaikBuild to create the .dat file, and then I align these reads to an artificial sequence that represents all the exons from the enrichment region. My aligner parameters are:

    MosaikAligner -in lane4.dat -ia chip.dat -out lane4.align -hs 15 -mm 3 -p 7 -a all -m all -mhp 100

    The resulting output is the conundrum. Why would I be losing almost 60% of the reads to a hash failure, and another 30% to filtering? I'm losing 90% of my sequence in this step. I've tried several samples from 2 different Solexa runs and gotten the same result.

    All thoughts and comments are welcome!

    Jim

    *******************
    - Using the following alignment algorithm: all positions
    - Using the following alignment mode: aligning reads to all possible locations
    - Using a maximum mismatch threshold of 3
    - Using a hash size of 15
    - Using 7 processors
    - Setting hash position threshold to 100

    Hashing reference sequence:
    100%[==========================================================================================] 621,565.7 ref bases/s in 5 s

    - loading reference sequence... finished.

    Aligning read library (11573312):
    100%[==============================================================================================] 12,524.9 reads/s in 15:24

    Alignment statistics:
    ===================================
    # failed hash: 6818036 (58.9 %)
    # filtered out: 3537110 (30.6 %)
    # unique: 343500 ( 3.0 %)
    # non-unique: 874666 ( 7.6 %)
    ---------------------------------------------
    total: 11573312
    total aligned: 1218166 (10.5 %)
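As a quick sanity check on the statistics above, the four categories can be summed and converted back into percentages. This is only a minimal sketch: the counts are copied from the log, while the dictionary layout and the `percentages` helper are illustrative names and not part of Mosaik.

```python
# Counts copied from the MosaikAligner log above.
counts = {
    "failed hash": 6818036,
    "filtered out": 3537110,
    "unique": 343500,
    "non-unique": 874666,
}
total_reads = 11573312


def percentages(counts, total):
    """Return each count as a percentage of total, rounded to one decimal."""
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


pct = percentages(counts, total_reads)
aligned = counts["unique"] + counts["non-unique"]
lost = counts["failed hash"] + counts["filtered out"]

# The four categories should account for every read in the library.
assert sum(counts.values()) == total_reads

for name, value in pct.items():
    print(f"{name:>12}: {value:5.1f} %")
print(f"{'aligned':>12}: {100.0 * aligned / total_reads:5.1f} %")
print(f"{'lost':>12}: {100.0 * lost / total_reads:5.1f} %")
```

The categories add up to the full library, so no reads are unaccounted for; hash failures and filtering together remove about 89.5 % of the reads.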
  • bioinfosm
    Senior Member
    • Jan 2008
    • 483

    #2
    Did you try a different hash size? Do you have Solexa's Summary report (Eland's error values)?
    --
    bioinfosm

    • seqfast
      Member
      • Aug 2008
      • 16

      #3
      I've played with Mosaik a bit, and while this may already be known: a hash failure basically means that the read has no seeds/alignments at the hash length. Lowering the hash size will certainly get you fewer hash failures, but it seems there are larger issues here. "Filtered out" refers to the percentage of reads that did pass the hash step (i.e. aligned) but have more than 3 errors over your read length.

      I'd try another aligner as well, to make sure the reads aren't in bad shape.

      Good luck.
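To illustrate why hash size matters for 36 nt reads, here is a rough sketch in Python. It assumes a simplified model in which a seed is an exact match of `hs` consecutive bases, so a read can only be seeded from a window that contains no mismatch; Mosaik's actual hashing may differ in its details. The function name and the mismatch positions are hypothetical, chosen only for illustration.

```python
def surviving_seed_windows(read_len, hash_size, mismatch_positions):
    """Count hash_size-long windows of a read that contain no mismatch.

    Positions are 0-based offsets within the read. Under the simplified
    exact-seed model, a read with zero surviving windows cannot be seeded.
    """
    n_windows = read_len - hash_size + 1
    return sum(
        1
        for start in range(n_windows)
        if not any(start <= p < start + hash_size for p in mismatch_positions)
    )


# A perfect 36 nt read offers 22 possible 15-mer seeds.
print(surviving_seed_windows(36, 15, []))

# Two badly placed mismatches are enough to destroy every 15-mer seed,
# even though the read is within the -mm 3 mismatch threshold.
print(surviving_seed_windows(36, 15, [14, 29]))

# With a hash size of 11, the same read still has usable seed windows.
print(surviving_seed_windows(36, 11, [14, 29]))
```

Under this model a smaller hash size rescues reads with a few sequencing errors, but it cannot help reads that simply do not come from the reference.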

      • MQ-BCBB
        Member
        • May 2009
        • 25

        #4
        JimC, I am having a similar problem. Did you figure out why you were losing most of your reads?
        Thanks!

        • shahid.manzoor
          Junior Member
          • Jun 2009
          • 8

          #5
          I have bacterial Illumina data in SCARF format, which I converted to FASTQ format, but when I run the MosaikBuild command on it I get the following error:
          parsing paired-end/mate-pair FASTQ files:
          ERROR: The number of qualities (127) do not match the number of bases (75) in HWUSI-EAS1688_9337_FC618BE_1_1_1112_15990#CGATGT/1.

          Can anybody help me understand what this error means and how to get rid of it?
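One way to see how widespread the problem is before running MosaikBuild is to scan the converted file for records whose quality string and base string differ in length. The sketch below is in Python and assumes plain four-line FASTQ records (header, bases, `+` line, qualities); the function name is hypothetical and the in-memory example stands in for a real file.

```python
import io


def find_length_mismatches(handle):
    """Yield (read_name, n_bases, n_quals) for records whose lengths differ.

    Assumes strict four-line FASTQ records with no wrapped sequence lines.
    """
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        bases = handle.readline().rstrip("\r\n")
        handle.readline()  # the '+' separator line
        quals = handle.readline().rstrip("\r\n")
        if len(bases) != len(quals):
            yield header[1:].strip(), len(bases), len(quals)


# Tiny made-up example: the second record has one quality too many.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIIII\n")
for name, n_bases, n_quals in find_length_mismatches(example):
    print(f"{name}: {n_bases} bases but {n_quals} qualities")
```

For a real file, pass `open("reads.fastq")` (a placeholder name) instead of the `StringIO` object. If every record is reported, the conversion step is the likely culprit rather than individual damaged reads.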

          • alig
            Member
            • Sep 2008
            • 44

            #6
            Was this problem ever solved? I have converted Solexa reads using Maq sol2sanger, and it adds an extra quality to each read. So then, of course, Mosaik complains that the number of qualities (66) does not match the number of bases (65).

            Can anyone help?

            Thank you alig
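For reference, a Solexa-to-Sanger quality conversion maps each quality character to exactly one output character, so the length of the quality string should not change. The sketch below is in Python and uses the standard Solexa-to-Phred formula; it is not Maq's code, and it assumes the input uses Solexa-scaled scores with an ASCII offset of 64. The function name is hypothetical.

```python
import math


def solexa_to_sanger(qual):
    """Convert a Solexa quality string (ASCII offset 64) to Sanger (offset 33).

    Uses Q_phred = 10 * log10(10 ** (Q_solexa / 10) + 1), rounded to the
    nearest integer, one output character per input character.
    """
    out = []
    for ch in qual:
        q_solexa = ord(ch) - 64
        q_phred = 10 * math.log10(10 ** (q_solexa / 10) + 1)
        out.append(chr(round(q_phred) + 33))
    return "".join(out)


solexa_quals = "h@;"  # Solexa scores 40, 0 and -5
sanger_quals = solexa_to_sanger(solexa_quals)
print(sanger_quals)
print(len(solexa_quals) == len(sanger_quals))
```

Since a character-wise conversion preserves length, an extra quality value after conversion suggests that something other than a quality score (for example a stray trailing character in the input) is being converted along with the real scores. That is only a guess, though, and worth checking against the input file.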
