  • Lilach
    Member
    • Sep 2011
    • 20

    Using MAQ with Illumina HiSeq results

    Hi,
    I want to try MAQ (for the first time) for analysis of Illumina HiSeq human whole exome results, and I have two questions:

    1. Is it OK to use hg19_chromFa.tar (from UCSC) as a reference, or should I run it against each chromosome individually?

    2. Is the text file I got from the HiSeq (which is actually fastq) OK as input for the maq fasta2bfa command, or should I change its fastq format? I think that HiSeq output is already in the Sanger fastq format, but I'm not sure? I copied the beginning of the txt file below.

    Thanks!



    @ILLUMINA-FFC6C4_0005:7:1:1941:1087#0/1
    CACATTGGATTGATCGGTCTCATTGGCCCCCCGGGAGAAGCTGGGGAGAAAGGAGATCAGGGGGTGCCAGGCGT
    +ILLUMINA-FFC6C4_0005:7:1:1941:1087#0/1
    faf\f_ccfffffSedaRe\dcdYffffcggg`g^ae`ca^RbaJ_VWW_Z\\Na`d``]a`bGM[UYVa`]`B
    @ILLUMINA-FFC6C4_0005:7:1:2045:1092#0/1
    GTGTGAATTTCATTTCCACATAAATTTTCTGAGCTGCATCACGGGAGATCCAGTTTGTACGAAGCCAGTTGTTT
    +ILLUMINA-FFC6C4_0005:7:1:2045:1092#0/1
    fffff_ffff[a\feaacafffffgfggcff]f]ae`beffadffafcf^[ac^dWd^abe`[d`_be_BBBBB
    @ILLUMINA-FFC6C4_0005:7:1:2497:1094#0/1
    GACGCTCACTCTCTCTGGTATAACTTCACCATCATTCATTTGCCCAGACATGGGCAACAGTGGTGTGAGGTCCA
    +ILLUMINA-FFC6C4_0005:7:1:2497:1094#0/1
    cbbd][dbdffffcbcccRa^ab^cff[d^Qaa^f[fefg_gcec`_f[Y^a^_ccaa_aZ`dYa[`Y``Y^Z[
    @ILLUMINA-FFC6C4_0005:7:1:2730:1091#0/1
    GGAACACACAGCTTCCCAGCTTTGGACAGTTGGTACAGCCTGAGGATGAGGGAAGCCAAGAACAAAAAACACCA
    +ILLUMINA-FFC6C4_0005:7:1:2730:1091#0/1
    ddaa`da`aacZa^BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    Originally posted by Lilach
    I think that HiSeq output is already in the Sanger fastq format, but I'm not sure?
    That string of BBBBBBBBB qualities in the fourth read looks like the PHRED Q2 marker for bad signal, which means this is in the old Illumina encoding, not the Sanger encoding used in the more recent Illumina pipelines. See:

    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
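To make the reasoning above concrete, here is a minimal sketch of how one might guess the quality offset from the characters in the quality lines. The function name and the cutoffs are my own assumptions, not part of MAQ or BWA: anything below ';' (ASCII 59) cannot occur in a +64 encoding, and anything above 'J' (ASCII 74) is unlikely in raw Sanger-encoded Illumina reads. It is a heuristic, so treat the answer as a hint.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) used by FASTQ quality strings.

    Returns 33, 64, or None when the characters seen do not decide it.
    Heuristic only: the cutoffs below are assumptions, not a standard.
    """
    codes = [ord(ch) for q in quality_strings for ch in q]
    if not codes:
        return None
    if min(codes) < 59:   # below ';', impossible in any +64 encoding
        return 33
    if max(codes) > 74:   # above 'J' (Q41 at offset 33), so most likely +64
        return 64
    return None


# Shortened form of the fourth read's quality line from the first post.
sample = ["ddaa`da`aacZa^" + "B" * 10]
print(guess_phred_offset(sample))  # 64
```

With an offset of 64, 'B' (ASCII 66) decodes to Q2, which fits the "bad signal" reading of the long run of B characters.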


    • dariober
      Senior Member
      • May 2010
      • 311

      #3
      Originally posted by Lilach
      Hi,
      I want to try MAQ (for the first time) for analysis of Illumina HiSeq human whole exome results, and I have two questions:
      Do you have any particular reason to use MAQ? I think it is a bit outdated, and nowadays most people use bwa, bowtie (1 or 2) or similar. I think MAQ was designed when sequence files held a few million reads; with the output of a HiSeq (hundreds of millions of reads) it might take ages and/or require a lot of memory.

      1. Is it OK to use hg19_chromFa.tar (from UCSC) as a reference, or should I run it against each chromosome individually?
      Not sure, but in general you don't want to split the reference sequence; otherwise you can't tell whether a read aligns equally well to different chromosomes (well, you can, but it would mean more work downstream of the alignment, which I don't think pays off). To save time and parallelize, you can split the read files instead, unless you have RNA-seq data.

      Hope I'm not misunderstanding your question...

      Best
      Dario
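As a rough sketch of the "split the read files" idea, the generator below cuts a FASTQ stream into chunks of whole records that could each be aligned separately against the full reference. It assumes strict 4-line records (no wrapped sequence lines), and `fastq_chunks` is a name made up for this example. For paired-end data both mate files would need the same chunk size so the pairs stay in step.

```python
from itertools import islice


def fastq_chunks(lines, reads_per_chunk):
    """Yield lists of FASTQ lines, each holding up to reads_per_chunk records.

    Assumes every record is exactly 4 lines long.
    """
    it = iter(lines)
    while True:
        chunk = list(islice(it, 4 * reads_per_chunk))
        if not chunk:
            return
        if len(chunk) % 4:
            raise ValueError("truncated FASTQ record at end of input")
        yield chunk


# Three toy records split into chunks of two reads each.
toy = []
for i in range(3):
    toy += ["@read%d" % i, "ACGT", "+", "ffff"]

for n, chunk in enumerate(fastq_chunks(toy, 2)):
    print(n, len(chunk) // 4, "reads")
```

In practice each chunk would be written to its own file and the per-chunk alignments merged afterwards.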


      • swbarnes2
        Senior Member
        • May 2008
        • 910

        #4
        MAQ's problem is that it's not a fast aligner. You want one of the Burrows-Wheeler Transform algorithms. That means Bowtie or bwa.

        And yes, in general, you want to align to the whole reference in one go. These algorithms will always try to fit your reads to the reference they are given, so you want to give the program your whole reference. If a read aligns to Chr 6 perfectly, you don't want the software telling you it aligns to Chr 1 with two errors, but if you only give the software Chr 1 to align to, that's what it will do.
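A toy illustration of the Chr 1 versus Chr 6 point, using a brute-force ungapped search rather than anything resembling how bwa or Bowtie actually work. The sequences, names, and the `best_hit` helper are invented for the example.

```python
def mismatches(read, ref, pos):
    """Count mismatches of read placed ungapped at ref[pos:]."""
    return sum(a != b for a, b in zip(read, ref[pos:pos + len(read)]))


def best_hit(read, references):
    """Return (name, position, mismatches) of the lowest-mismatch placement."""
    best = None
    for name, seq in references.items():
        for pos in range(len(seq) - len(read) + 1):
            mm = mismatches(read, seq, pos)
            if best is None or mm < best[2]:
                best = (name, pos, mm)
    return best


# Made-up sequences: the read is exact on "chr6", two mismatches on "chr1".
references = {
    "chr1": "TTTTACGTAGGTTCATTTT",
    "chr6": "GGGGACGTACGTACAGGGG",
}
read = "ACGTACGTACA"

print(best_hit(read, references))                         # ('chr6', 4, 0)
print(best_hit(read, {"chr1": references["chr1"]}))       # ('chr1', 4, 2)
```

Given only chr1, the search still reports a placement; it just cannot know that a better one exists elsewhere.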


        • Lilach
          Member
          • Sep 2011
          • 20

          #5
          Thank you for the answers!
          So I read a little, and it seems to be Illumina 1.5 fastq, because of the BBBBBBB strings.
          Can I use BWA aln and sampe directly on these files, or should I reformat them to Sanger fastq?

          Regarding MAQ - I wanted to compare its results to BWA. I already used BWA, but now I'm afraid the fastq qualities were not interpreted well?


          • cam.jack
            Member
            • Jun 2011
            • 11

            #6
            I would have thought MAQ was totally outdated now. Does it even output to SAM?

            With Illumina 1.3+ to 1.7 you need to use the -I flag with BWA.
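For tools that have no such switch, the other route raised earlier in the thread is to re-encode the qualities as Sanger fastq. A minimal sketch, assuming the input really is Phred+64 (Illumina 1.3 to 1.7) with strict 4-line records; both function names are made up for this example.

```python
def illumina64_to_sanger(quality):
    """Re-encode one Phred+64 quality string as Phred+33 (Sanger)."""
    converted = []
    for ch in quality:
        score = ord(ch) - 64
        if score < 0:
            raise ValueError("character %r is not valid Phred+64" % ch)
        converted.append(chr(score + 33))
    return "".join(converted)


def convert_fastq_lines(lines):
    """Yield FASTQ lines with every fourth line (the qualities) re-encoded."""
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield illumina64_to_sanger(line) if i % 4 == 3 else line


record = ["@read1", "ACGT", "+", "ffBB"]
print(list(convert_fastq_lines(record)))  # ['@read1', 'ACGT', '+', 'GG##']
```

The Q2 marker 'B' becomes '#' after conversion. Converted files should then be aligned without the -I flag, since applying both would shift the qualities twice.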
