  • still got Maq problems

    I am still having difficulty running Maq.

    I tried to map the reads in 'Typhi_CT18_solexa.fastq' (from Sanger's ftp site) to the finished S. Typhi CT18 sequence, stored in a fasta file, using Maq version 0.6.8.

    But I get the following output, in which none of the reads seem to map to the reference sequence.

    maq.pl easyrun -d outdir S_typhiCT18.fasta Typhi_CT18_solexa.fastq
    -- CMD: /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq fasta2bfa
    /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/S_typhiCT18.fasta outdir/ref.bfa
    2> /dev/null
    -- CMD: /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq fastq2bfq -n 2000000
    /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/Typhi_CT18_solexa.fastq
    outdir/read1
    -- finish writing file 'outdir/[email protected]'
    -- 1879809 sequences were loaded.
    -- CMD: (cd outdir; /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq map -n
    2 -e 70 -u [email protected] [email protected] ref.bfa [email protected] 2> [email protected])
    -- CMD: (cd outdir; mv [email protected] all.map)
    -- CMD: (cd outdir; /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq mapcheck
    ref.bfa all.map > mapcheck.txt)
    [ma_mapcheck] processing Salmonella...
    -- CMD: (cd outdir; /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq assemble
    -N 2 -Q 60 consensus.cns ref.bfa all.map 2> assemble.log)
    -- CMD: /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq cns2fq
    outdir/consensus.cns > outdir/cns.fq
    -- CMD: /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq cns2snp
    outdir/consensus.cns > outdir/cns.snp
    -- CMD: /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq cns2win
    outdir/consensus.cns > outdir/cns.win
    -- CMD: /Users/jozefgecz/Documents/Alison/Maq/maq-0.6.8/maq indelsoa
    outdir/ref.bfa outdir/all.map > outdir/cns.indelse
    -- CMD: (cd outdir; touch unmap.indel)
    -- CMD: /usr/local/bin/maq.pl SNPfilter -q 40 -w 5 -N 2 -f outdir/cns.indelse -d
    3 -D 256 -n 20 outdir/cns.snp > outdir/cns.final.snp
    -- 0 potential soa-indels pass the filter.
    -- CMD: (cd outdir; ln -s cns.final.snp cns.filter.snp)
    -- CMD: /usr/local/bin/maq.pl statmap outdir/*.map.log

    -- == statmap report ==

    -- # single end (SE) reads: 1879809
    -- # mapped SE reads: 0 (/ 1879809 = 0%)
    -- # paired end (PE) reads: 0
    -- # mapped PE reads: 0 (/ 0 = NA%)
    -- # reads that are mapped in pairs: 0 (/ 0 = NA%)
    -- # Q>=30 reads that are moved to meet mate-pair requirement: 0 (/ 0 = NA%)
    -- # Q<30 reads that are moved to meet mate-pair requirement: 0 (NA%)

    What am I doing wrong? I just want to trial the Maq program in preparation for when we receive our sequence.

    thanks

    alig
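For anyone comparing several runs, the statmap report above can be checked automatically. The following is a minimal sketch that assumes the report lines keep the exact layout shown in this post; the function names `parse_statmap` and `nothing_mapped` are made up for illustration and are not part of Maq.

```python
import re

# Matches statmap counter lines such as
#   "-- # mapped SE reads: 0 (/ 1879809 = 0%)"
# The "(/ total = pct%)" part is optional, as some lines carry only a count.
STAT_LINE = re.compile(
    r"^\s*--\s*#\s*(?P<label>.+?):\s*(?P<count>\d+)"
    r"(?:\s*\(/\s*(?P<total>\d+)\s*=)?"
)

def parse_statmap(text):
    """Return {label: (count, total_or_None)} for each statmap counter line."""
    stats = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line)
        if m:
            total = m.group("total")
            stats[m.group("label")] = (
                int(m.group("count")),
                int(total) if total else None,
            )
    return stats

def nothing_mapped(stats):
    """True when reads were loaded but no SE or PE read was mapped."""
    loaded = stats["single end (SE) reads"][0] + stats["paired end (PE) reads"][0]
    mapped = stats["mapped SE reads"][0] + stats["mapped PE reads"][0]
    return loaded > 0 and mapped == 0
```

Feeding it the report from this post would flag the run as "nothing mapped", which is the symptom being asked about; it says nothing about the cause.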

  • #2
    Alig,

    Is there any reason why you couldn't use a different tool for short read mapping? Novoalign and Bowtie both support maq's output .map format.
    Versions of these tools are available for Mac.



    • #3
      I'm not sure, but that fastq file could be using Solexa quality scores, whereas maq requires Sanger quality scores.
      Try this:

      maq sol2sanger Typhi_CT18_solexa.fastq Typhi_CT18_solexa.sanger.fastq
      maq.pl easyrun -d outdir S_typhiCT18.fasta Typhi_CT18_solexa.sanger.fastq
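To make the difference between the two quality scales concrete, here is a small sketch of the standard relation between Solexa and Phred (Sanger) scores, Q_phred = 10 * log10(10^(Q_solexa/10) + 1). The ASCII offsets (64 for Solexa FASTQ, 33 for Sanger FASTQ) are the usual conventions; whether `maq sol2sanger` rounds in exactly this way is an assumption, and the helper names are invented for illustration.

```python
import math

def solexa_to_phred(q_solexa):
    """Convert one Solexa-scaled quality value to the Phred scale."""
    return 10.0 * math.log10(10.0 ** (q_solexa / 10.0) + 1.0)

def convert_quality_string(qual, in_offset=64, out_offset=33):
    """Re-encode a Solexa FASTQ quality string as a Sanger one.

    Assumes Solexa scores stored with ASCII offset 64 and writes
    Phred scores with ASCII offset 33, rounded to the nearest integer.
    """
    out = []
    for ch in qual:
        q_phred = int(round(solexa_to_phred(ord(ch) - in_offset)))
        out.append(chr(q_phred + out_offset))
    return "".join(out)

def has_sanger_only_chars(qual_strings):
    """Heuristic: characters below ';' (ASCII 59) cannot occur in
    offset-64 Solexa encoding, whose lowest score is -5.

    A False result is inconclusive, since high-quality Sanger data
    can also stay above that threshold.
    """
    return any(ord(c) < 59 for q in qual_strings for c in q)
```

If `has_sanger_only_chars` returns True for a file, running a Solexa-to-Sanger conversion on it would be the wrong step; if it returns False, the encoding cannot be decided from the characters alone.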



      • #4
        What does the [email protected] file look like?

        To zee:

        I used to study various aligners and am also fascinated by how fast better alignment algorithms and software are appearing. I even wrote another aligner that is much faster than maq. However, whenever I want to do something serious I usually return to maq. Maq has proved acceptable in several large-scale projects. I do not know how the other tools generally perform on real data. Maybe someone (not me) needs to carefully benchmark these various programs on both simulated and real data. This would benefit the whole community and would convince me that maq is not the way to go. Actually, I see this as more important than having a new algorithm.
        Last edited by lh3; 12-10-2008, 09:14 AM.



        • #5
          Thanks, I tried
          maq sol2sanger Typhi_CT18_solexa.fastq Typhi_CT18_solexa.sanger.fastq

          to convert from Solexa quality scores to Sanger, but it made no difference.

          As for what the [email protected] looks like:

          more [email protected]
          -- maq-dummy
          [ma_load_reads] loading reads...
          [ma_load_reads] set length of the first read as 51.
          [ma_load_reads] 1879809*2 reads loaded.
          [ma_longread2read] encoding reads... 3759618 sequences processed.
          [ma_match] set the minimum insert size as 52.
          [match_core] round 1/3...
          [match_core] making index...
          [match_core] processing sequence Salmonella (4809037 bp)...
          [match_core] round 2/3...
          [match_core] making index...
          [match_core] processing sequence Salmonella (4809037 bp)...
          [match_core] round 3/3...
          [match_core] making index...
          [match_core] processing sequence Salmonella (4809037 bp)...
          [match_core] sorting the hits and dumping the results...
          [ma_load_reads] loading reads...
          [ma_load_reads] 1879809*2 reads loaded.
          [mapping_count_single] 0, 0, 0, 0
          [maq_indel_pe] the indel detector only works with short-insert mate-pair reads.
          -- Dumping unmapped or poorly mapped reads
          [match_data2mapping] 0 out of 3759618 raw reads are mapped with 0 in pairs.
          -- (total, isPE, mapped, paired) = (1879809, 0, 0, 0)


          Regarding "Novoalign and Bowtie both support maq's output .map format":
          I've spent a lot of time trying to get Maq up and running, and at the moment I'm not getting any Maq .map output to put into other programs.

          Any help gratefully received
          thanks
          alig



          • #6
            That is weird. I have never seen such a log file. Does the input fastq contain a lot of Ns?
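One way to answer the question about Ns is to count them directly. This is a rough sketch that assumes a plain four-line-per-record FASTQ file (no wrapped sequences); the function name `n_content` is made up for illustration.

```python
def n_content(handle):
    """Summarise how many reads and bases in a FASTQ stream are 'N'.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    total_reads = reads_with_n = total_bases = n_bases = 0
    for i, line in enumerate(handle):
        if i % 4 != 1:          # only the second line of each record is sequence
            continue
        seq = line.strip().upper()
        count = seq.count("N")
        total_reads += 1
        total_bases += len(seq)
        n_bases += count
        if count:
            reads_with_n += 1
    return {
        "reads": total_reads,
        "reads_with_n": reads_with_n,
        "bases": total_bases,
        "n_bases": n_bases,
    }
```

It could be used as `with open("Typhi_CT18_solexa.fastq") as fh: print(n_content(fh))`, with the file name taken from the first post.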



            • #7
              If you post links to where you got the ref & data from, I'll give them a go on my box.



              • #8
                Maq problems

                Hi

                The input fastq doesn't appear to contain a lot of N.


                Here are links to where I got the fastq file from

                ftp://ftp.era.ebi.ac.uk/ERA000001/

                I downloaded this file

                Typhi_CT18_solexa.fastq (236M, Jul 01 08:31)


                For the fasta file, I copied and pasted the sequence of AL513382 from NCBI into a BBEdit file.




                Thanks very much

                alig



                • #9
                  Dear Li Heng,

                  I have a problem with maq match, as follows. Could you help me solve it?

                  [ma_load_reads] loading reads...
                  [ma_load_reads] set length of the first read as 116.
                  *** glibc detected *** ./maq-0.7.1/maq: free(): invalid next size (normal): 0x000000000053fc90 ***
                  ======= Backtrace: =========
                  /lib64/libc.so.6[0x2b0ec72c331e]
                  /lib64/libc.so.6(__libc_free+0x6c)[0x2b0ec72c4d7c]
                  ./maq-0.7.1/maq[0x402ee5]
                  ./maq-0.7.1/maq[0x403d0e]
                  ./maq-0.7.1/maq[0x40a4ba]
                  /lib64/libc.so.6(__libc_start_main+0xf4)[0x2b0ec7275184]
                  ./maq-0.7.1/maq(__gxx_personality_v0+0x81)[0x401c09]
                  ======= Memory map: ========
                  00400000-00436000 r-xp 00000000 09:08 4898947 /scratch/huy/NGS/maq-0.7.1/maq
                  00536000-00539000 rw-p 00036000 09:08 4898947 /scratch/huy/NGS/maq-0.7.1/maq
                  00539000-0055a000 rw-p 00539000 00:00 0 [heap]
                  2b0ec6bb3000-2b0ec6bce000 r-xp 00000000 09:01 1625186 /lib64/ld-2.4.so

                  The FASTQ file looks like this:
                  @SRR013305.1 :5:1:249:548 length=56
                  AGAATTTAAACTGTAAATCGATTTTGTAAGTTTAAAGATCGGAAGAGCTCGCATGC
                  +SRR013305.1 :5:1:249:548 length=56
                  IIIIIIIIIIIIIIIIIIIIIII'IIIIIIIIIII6I>IIII1=IIC*84I*9<CI
                  @SRR013305.2 :5:1:247:609 length=56
                  AGTGAAGTTGGGATTTTTGTTGTTTGTGATTGTATTTATTTGTTTTTTGATTTTGT
                  +SRR013305.2 :5:1:247:609 length=56
                  IIIIIIIIIIII/III;IIIIIIIIIIIII+5IIDIIIII&'III&I>8+';A-%%
                  @SRR013305.3 :5:1:183:608 length=56
                  AAGATTTAAAATCGTAAACGGATTTTAAAGGCGTAAGAATTGTTTTTTTGTTGGAG
                  +SRR013305.3 :5:1:183:608 length=56
                  IIIIIIIIIIIIIIIIIIIII>IIII9I6IIII%*/I7#;083!I@AI.&.0+#("
                  @SRR013305.4 :5:1:205:590 length=56
                  AGAGAATTAAAAGTAGAGGAAGTATGAGATTTTAATTTCGTGGGTTATAATTGGAG
                  +SRR013305.4 :5:1:205:590 length=56
                  IIIIIIIIIIIIIIIIIIIIIIIIIICIEIIII8+III;EFH211=%I(-$A2&")
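The log above reports the first read length as 116, while the records shown here are labelled length=56. Whether that discrepancy is related to the crash is not established in this thread, but a quick tally of read lengths can help narrow it down. The sketch below assumes four-line FASTQ records; `read_length_report` is a hypothetical helper, not a Maq command.

```python
from collections import Counter

def read_length_report(handle):
    """Tally sequence lengths and count records whose sequence and
    quality strings differ in length.

    Assumes strict 4-line FASTQ records; blank lines between records
    are skipped.
    """
    lengths = Counter()
    mismatched = 0
    while True:
        header = handle.readline()
        if not header:              # end of file
            break
        if not header.strip():      # tolerate stray blank lines
            continue
        seq = handle.readline().rstrip("\r\n")
        handle.readline()           # the '+' separator line
        qual = handle.readline().rstrip("\r\n")
        lengths[len(seq)] += 1
        if len(seq) != len(qual):
            mismatched += 1
    return lengths, mismatched
```

A file in which every record is 56 bases long with matching quality strings would give a single entry in `lengths` and `mismatched == 0`; anything else would be worth reporting alongside the backtrace.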



                  • #10
                    Hi Group,
                    Hello to everyone!
                    I am a beginner at analysing short sequence reads generated by the Illumina/Solexa platform, and have just started running Maq on a 64-bit Ubuntu 9.04 system (I am also a beginner with Linux).
                    When I ran the Maq example (calib-36.dat), an error appeared. The details are as follows:
                    ~/Desktop/example$ maq.pl demo ref.fasta calib-36.dat
                    -- CMD: mkdir -p maqdemo
                    -- CMD: /usr/local/bin/maq simulate -N 1000000 maqdemo/r1.fq maqdemo/r2.fq ref.fasta calib-36.dat > maqdemo/true.snp
                    -- 1 sequences, total length: 4130
                    -- CMD: /usr/local/bin/maq fasta2bfa ref.fasta maqdemo/ref.bfa
                    -- 1 sequences have been converted.
                    -- CMD: (cd maqdemo; /usr/local/bin/maq.pl easyrun -p -d easyrun ref.bfa r1.fq r2.fq)
                    -- CMD: ln -sf /home/shanyuan/Desktop/example/maqdemo/ref.bfa easyrun/ref.bfa
                    -- CMD: /usr/local/bin/maq fastq2bfq -n 2000000 /home/shanyuan/Desktop/example/maqdemo/r1.fq easyrun/read1
                    -- finish writing file 'easyrun/[email protected]'
                    -- 1000000 sequences were loaded.
                    -- CMD: /usr/local/bin/maq fastq2bfq -n 2000000 /home/shanyuan/Desktop/example/maqdemo/r2.fq easyrun/read2
                    -- finish writing file 'easyrun/[email protected]'
                    -- 1000000 sequences were loaded.
                    -- CMD: (cd easyrun; /usr/local/bin/maq map -n 2 -e 70 [email protected] ref.bfa [email protected] [email protected] 2> [email protected])
                    -- CMD: (cd easyrun; mv [email protected] all.map)
                    -- CMD: (cd easyrun; /usr/local/bin/maq mapcheck ref.bfa all.map > mapcheck.txt)
                    [ma_mapcheck] processing NC_009497_Mycoplasma...
                    -- CMD: (cd easyrun; /usr/local/bin/maq assemble -N 2 -Q 60 consensus.cns ref.bfa all.map 2> assemble.log)
                    -- CMD: /usr/local/bin/maq cns2fq easyrun/consensus.cns > easyrun/cns.fq
                    -- CMD: /usr/local/bin/maq cns2snp easyrun/consensus.cns > easyrun/cns.snp
                    -- CMD: /usr/local/bin/maq cns2win easyrun/consensus.cns > easyrun/cns.win
                    -- CMD: /usr/local/bin/maq indelsoa easyrun/ref.bfa easyrun/all.map > easyrun/cns.indelse
                    -- CMD: /usr/local/bin/maq indelpe easyrun/ref.bfa easyrun/all.map > easyrun/cns.indelpe
                    -- CMD: /usr/local/bin/maq.pl SNPfilter -q 40 -w 5 -N 2 -f easyrun/cns.indelse -F easyrun/cns.indelpe -d 3 -D 256 -n 20 -Q60 easyrun/cns.snp > easyrun/cns.final.snp
                    -- 1 potential soa-indels pass the filter.
                    -- 3 potential pe-indels pass the filter.
                    -- CMD: (cd easyrun; ln -s cns.final.snp cns.filter.snp)
                    -- CMD: /usr/local/bin/maq.pl statmap easyrun/*.map.log

                    -- == statmap report ==

                    -- # single end (SE) reads: 0
                    -- # mapped SE reads: 0 (/ 0 = NA%)
                    -- # paired end (PE) reads: 2000000
                    -- # mapped PE reads: 1999994 (/ 2000000 = 99.99%)
                    -- # reads that are mapped in pairs: 1999602 (/ 1999994 = 99.98%)
                    -- # Q>=30 reads that are moved to meet mate-pair requirement: 0 (/ 1999602 = 0%)
                    -- # Q<30 reads that are moved to meet mate-pair requirement: 0 (0%)

                    -- CMD: (cd maqdemo; /usr/local/bin/maq simustat easyrun/all.map > eval.simustat)
                    *** buffer overflow detected ***: /usr/local/bin/maq terminated
                    ======= Backtrace: =========
                    /lib/libc.so.6(__fortify_fail+0x37)[0x7fd2139ea2c7]
                    /lib/libc.so.6[0x7fd2139e8170]
                    /usr/local/bin/maq[0x423e7e]
                    /usr/local/bin/maq[0x424425]
                    /lib/libc.so.6(__libc_start_main+0xe6)[0x7fd2139095a6]
                    /usr/local/bin/maq[0x401da9]
                    ======= Memory map: ========
                    00400000-00438000 r-xp 00000000 08:03 12468332 /usr/local/bin/maq
                    00637000-00638000 r--p 00037000 08:03 12468332 /usr/local/bin/maq
                    00638000-0063b000 rw-p 00038000 08:03 12468332 /usr/local/bin/maq
                    0141f000-01440000 rw-p 0141f000 00:00 0 [heap]
                    7fd2138eb000-7fd213a53000 r-xp 00000000 08:03 9961511 /lib/libc-2.9.so
                    7fd213a53000-7fd213c53000 ---p 00168000 08:03 9961511 /lib/libc-2.9.so
                    7fd213c53000-7fd213c57000 r--p 00168000 08:03 9961511 /lib/libc-2.9.so
                    7fd213c57000-7fd213c58000 rw-p 0016c000 08:03 9961511 /lib/libc-2.9.so
                    7fd213c58000-7fd213c5d000 rw-p 7fd213c58000 00:00 0
                    7fd213c5d000-7fd213c73000 r-xp 00000000 08:03 9961533 /lib/libgcc_s.so.1
                    7fd213c73000-7fd213e73000 ---p 00016000 08:03 9961533 /lib/libgcc_s.so.1
                    7fd213e73000-7fd213e74000 r--p 00016000 08:03 9961533 /lib/libgcc_s.so.1
                    7fd213e74000-7fd213e75000 rw-p 00017000 08:03 9961533 /lib/libgcc_s.so.1
                    7fd213e75000-7fd213ef9000 r-xp 00000000 08:03 9961544 /lib/libm-2.9.so
                    7fd213ef9000-7fd2140f8000 ---p 00084000 08:03 9961544 /lib/libm-2.9.so
                    7fd2140f8000-7fd2140f9000 r--p 00083000 08:03 9961544 /lib/libm-2.9.so
                    7fd2140f9000-7fd2140fa000 rw-p 00084000 08:03 9961544 /lib/libm-2.9.so
                    7fd2140fa000-7fd2141eb000 r-xp 00000000 08:03 12290468 /usr/lib/libstdc++.so.6.0.10
                    7fd2141eb000-7fd2143eb000 ---p 000f1000 08:03 12290468 /usr/lib/libstdc++.so.6.0.10
                    7fd2143eb000-7fd2143f2000 r--p 000f1000 08:03 12290468 /usr/lib/libstdc++.so.6.0.10
                    7fd2143f2000-7fd2143f4000 rw-p 000f8000 08:03 12290468 /usr/lib/libstdc++.so.6.0.10
                    7fd2143f4000-7fd214407000 rw-p 7fd2143f4000 00:00 0
                    7fd214407000-7fd21441e000 r-xp 00000000 08:03 9961629 /lib/libz.so.1.2.3.3
                    7fd21441e000-7fd21461d000 ---p 00017000 08:03 9961629 /lib/libz.so.1.2.3.3
                    7fd21461d000-7fd21461e000 r--p 00016000 08:03 9961629 /lib/libz.so.1.2.3.3
                    7fd21461e000-7fd21461f000 rw-p 00017000 08:03 9961629 /lib/libz.so.1.2.3.3
                    7fd21461f000-7fd21463f000 r-xp 00000000 08:03 9961491 /lib/ld-2.9.so
                    7fd21482c000-7fd21482f000 rw-p 7fd21482c000 00:00 0
                    7fd21483a000-7fd21483e000 rw-p 7fd21483a000 00:00 0
                    7fd21483e000-7fd21483f000 r--p 0001f000 08:03 9961491 /lib/ld-2.9.so
                    7fd21483f000-7fd214840000 rw-p 00020000 08:03 9961491 /lib/ld-2.9.so
                    7fff1c82a000-7fff1c83f000 rw-p 7ffffffea000 00:00 0 [stack]
                    7fff1c9fe000-7fff1c9ff000 r-xp 7fff1c9fe000 00:00 0 [vdso]
                    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
                    Aborted
                    ** fail to run command '(cd maqdemo; /usr/local/bin/maq simustat easyrun/all.map > eval.simustat)' at /usr/local/bin/maq.pl line 842.

                    It would be nice to have your suggestions on how to fix the bug. Your kind reply will be much appreciated.

                    Best,
                    jay
