
  • Using MAQ with Illumina HiSeq results

    Hi,
    I want to try MAQ (for the first time) for analysis of Illumina HiSeq human whole exome results, and I have two questions:

    1. Is it OK to use hg19_chromFa.tar (from UCSC) as a reference, or should I run it against each chromosome individually?

    2. Is the text file I got from the HiSeq (which is actually FASTQ) OK as input for the maq fasta2bfa command, or should I change its FASTQ format? I think the HiSeq output is already in Sanger FASTQ format, but I'm not sure. I copied the beginning of the txt file below.

    Thanks!



    @ILLUMINA-FFC6C4_0005:7:1:1941:1087#0/1
    CACATTGGATTGATCGGTCTCATTGGCCCCCCGGGAGAAGCTGGGGAGAAAGGAGATCAGGGGGTGCCAGGCGT
    +ILLUMINA-FFC6C4_0005:7:1:1941:1087#0/1
    faf\f_ccfffffSedaRe\dcdYffffcggg`g^ae`ca^RbaJ_VWW_Z\\Na`d``]a`bGM[UYVa`]`B
    @ILLUMINA-FFC6C4_0005:7:1:2045:1092#0/1
    GTGTGAATTTCATTTCCACATAAATTTTCTGAGCTGCATCACGGGAGATCCAGTTTGTACGAAGCCAGTTGTTT
    +ILLUMINA-FFC6C4_0005:7:1:2045:1092#0/1
    fffff_ffff[a\feaacafffffgfggcff]f]ae`beffadffafcf^[ac^dWd^abe`[d`_be_BBBBB
    @ILLUMINA-FFC6C4_0005:7:1:2497:1094#0/1
    GACGCTCACTCTCTCTGGTATAACTTCACCATCATTCATTTGCCCAGACATGGGCAACAGTGGTGTGAGGTCCA
    +ILLUMINA-FFC6C4_0005:7:1:2497:1094#0/1
    cbbd][dbdffffcbcccRa^ab^cff[d^Qaa^f[fefg_gcec`_f[Y^a^_ccaa_aZ`dYa[`Y``Y^Z[
    @ILLUMINA-FFC6C4_0005:7:1:2730:1091#0/1
    GGAACACACAGCTTCCCAGCTTTGGACAGTTGGTACAGCCTGAGGATGAGGGAAGCCAAGAACAAAAAACACCA
    +ILLUMINA-FFC6C4_0005:7:1:2730:1091#0/1
    ddaa`da`aacZa^BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

  • #2
    Originally posted by Lilach
    I think that Hiseq output is already in the Sanger fastq format, but I'm not sure?
    That string of BBBBBBBBB qualities in the fourth read looks like the PHRED Q2 marker for bad signal, which means this is in the old Illumina encoding, not the Sanger encoding used in the more recent Illumina pipelines. See:

    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
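
To illustrate the point about the B characters, here is a minimal Python sketch that guesses the quality offset from a few quality strings. The function names (`guess_phred_offset`, `decode_quality`) and the cut-off values are illustrative assumptions based on the commonly quoted ASCII ranges for each encoding, not part of any aligner; treat the result as a hint rather than a guarantee.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) used by FASTQ quality strings.

    Heuristic (assumed ranges): characters below ';' (ASCII 59) do not occur
    in the older Illumina/Solexa encodings, and characters above 'J'
    (ASCII 74) are not expected in Sanger-encoded Illumina output.
    Returns None when the data are ambiguous.
    """
    codes = [ord(ch) for q in quality_strings for ch in q]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        return 33  # Sanger / Illumina 1.8+
    if lowest >= 64 and highest > 74:
        return 64  # Illumina 1.3+ / 1.5+
    return None  # Solexa or too little data to decide


def decode_quality(quality, offset):
    """Convert a quality string to a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


if __name__ == "__main__":
    # Start of the quality line of the fourth read in the first post.
    sample = "ddaa`da`aacZa^BBBBB"
    offset = guess_phred_offset([sample])
    print(offset, decode_quality(sample, offset)[-5:])
```

With an offset of 64, each trailing B decodes to a score of 2, which matches the Q2 marker described above.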



    • #3
      Originally posted by Lilach
      Hi,
      I want to try MAQ (for the first time) for analysis of Illumina HiSeq human whole exome results, and I have two questions:
      Do you have any particular reason to use MAQ? I think it is a bit outdated, and nowadays most people use bwa, bowtie (1 or 2) or similar. I think MAQ was designed when sequence files held a few million reads; with HiSeq output (hundreds of millions of reads) it might take ages and/or require a lot of memory.

      1. Is it OK to use hg19_chromFa.tar (from UCSC) as a reference, or should I run it against each chromosome individually?
      Not sure, but in general you don't want to split the reference sequence; otherwise you can't tell whether a read aligns equally well to different chromosomes (well, you can, but it would mean more work downstream of the alignment, which I don't think pays off). To save time and parallelize, you can split the read (sequence) files instead, unless it is RNA-seq data you have.

      Hope I'm not misunderstanding your question...

      Best
      Dario



      • #4
        MAQ's problem is that it's not a fast aligner. You want one of the Burrows-Wheeler Transform algorithms. That means Bowtie or bwa.

        And yes, in general, you want to align to the whole reference in one go. These algorithms will always try to fit your reads to the reference they are given, so you want to give the program your whole reference. If a read aligns to Chr 6 perfectly, you don't want the software telling you it aligns to Chr 1 with two errors, but if you only give the software Chr 1 to align to, that's what it will do.
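
The Chr 1 versus Chr 6 situation can be reproduced with a toy brute-force search. This is only a sketch: the sequences are made up, `best_hit` is a hypothetical helper, and real aligners such as bwa or Bowtie use an index and handle gaps and mapping quality, none of which is modelled here.

```python
def best_hit(read, reference):
    """Return (chrom, pos, mismatches) for the best ungapped placement.

    `reference` maps chromosome names to sequences. Every position on every
    chromosome is scored by counting mismatches (Hamming distance), and the
    first placement with the fewest mismatches wins.
    """
    best = None
    for chrom, seq in reference.items():
        for pos in range(len(seq) - len(read) + 1):
            window = seq[pos:pos + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if best is None or mismatches < best[2]:
                best = (chrom, pos, mismatches)
    return best


if __name__ == "__main__":
    # Made-up sequences: the read matches chr6 exactly, chr1 only roughly.
    chr1 = "ACCTTGCAAGGCT"
    chr6 = "TTACGTAGCATGG"
    read = "ACGTAGCA"

    # Given only chr1, the search still reports its least-bad placement.
    print(best_hit(read, {"chr1": chr1}))
    # Given the whole reference, the exact chr6 match is found instead.
    print(best_hit(read, {"chr1": chr1, "chr6": chr6}))
```

With only chr1 available the read is placed there with two mismatches; with both chromosomes it lands on chr6 with none.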



        • #5
          Thank you for the answers!
          So I read a little, and it seems to be Illumina 1.5 FASTQ because of the BBBBBBB strings.
          Can I use BWA aln and sampe directly on these files, or should I reformat them to Sanger FASTQ?

          Regarding MAQ: I wanted to compare its results to BWA. I already used BWA, but now I'm afraid the FASTQ qualities were not interpreted correctly.



          • #6
            I would have thought MAQ was totally outdated now. Does it even output to SAM?

            With Illumina 1.3+ to 1.7 you need to use the -I flag with BWA.
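
For tools that have no equivalent of that flag, the other option raised in the thread is to reformat the qualities to Sanger FASTQ first. A minimal Python sketch of that conversion follows. It assumes plain four-line FASTQ records (no wrapped sequence lines) and Phred+64 input; the function names are illustrative, not from any existing package.

```python
def illumina64_to_sanger(quality):
    """Re-encode one Phred+64 quality string as Phred+33 (Sanger)."""
    converted = []
    for ch in quality:
        if ord(ch) < 64:
            # A character this low cannot be Phred+64 encoded.
            raise ValueError(f"unexpected quality character {ch!r}")
        converted.append(chr(ord(ch) - 64 + 33))
    return "".join(converted)


def convert_fastq_lines(lines):
    """Yield FASTQ lines with every fourth line (qualities) re-encoded.

    Assumes exactly four lines per record: header, sequence, '+', qualities.
    """
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index % 4 == 3:
            yield illumina64_to_sanger(line)
        else:
            yield line


if __name__ == "__main__":
    record = ["@read1", "ACGT", "+", "ffBB"]
    print("\n".join(convert_fastq_lines(record)))
```

After conversion, the Q2 marker B becomes # in the Sanger encoding.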
