This topic is closed.
  • muol
    Member
    • Jun 2012
    • 10

    #16
    Brian,

    I ran into a minor issue with BBNorm when trying to use separate input and output files for a PE library, like this:

    Code:
    bbnorm.sh in1=R1.fastq.gz in2=R2.fastq.gz out1=R1.bbnorm.fastq.gz out2=R2.bbnorm.fastq.gz prefilter=t tossbadreads=t ecc=t fixspikes=t qin=33 -Xmx72g target=40
    I receive this error during pass 2:

    Code:
    Exception in thread "main" java.lang.AssertionError: Please do not set 'interleaved=true' with dual input files.
    	at stream.ConcurrentGenericReadInputStream.<init>(ConcurrentGenericReadInputStream.java:132)
    	at stream.ConcurrentGenericReadInputStream.getReadInputStream(ConcurrentGenericReadInputStream.java:661)
    	at stream.ConcurrentGenericReadInputStream.getReadInputStream(ConcurrentGenericReadInputStream.java:641)
    	at kmer.KmerCount7MTA.countFastq(KmerCount7MTA.java:355)
    	at kmer.KmerCount7MTA.makeKca(KmerCount7MTA.java:222)
    	at jgi.KmerNormalize.runPass(KmerNormalize.java:1006)
    	at jgi.KmerNormalize.main(KmerNormalize.java:736)
    Setting interleaved=false doesn't change that. Outputting to a single, interleaved file (in1=xxx in2=xxx out=xxx) on the other hand works fine. Any ideas?

    Olaf

    • Brian Bushnell
      Super Moderator
      • Jan 2014
      • 2709

      #17
      Olaf,

      Currently, BBNorm uses single interleaved files for temporary storage when running multiple passes, and I have not implemented any way to specify dual files in the intermediate stages, since everyone at JGI uses interleaved files for everything.

      You have two options.
      1) You could set "passes=1", which is faster, but I don't recommend it because it doesn't give as good results as 2-pass normalization.
      or
      2) You could specify only a single output file, which will get interleaved reads:

      bbnorm.sh in1=R1.fastq.gz in2=R2.fastq.gz out=R12.bbnorm.fastq.gz prefilter=t tossbadreads=t ecc=t fixspikes=t qin=33 -Xmx72g target=40

      ...Then, if you need to, de-interleave it afterward:

      reformat.sh in=R12.bbnorm.fastq.gz out1=R1.bbnorm.fastq.gz out2=R2.bbnorm.fastq.gz
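For readers unsure what "interleaved" means here: read 1 and read 2 of each pair alternate within one file. The sketch below splits such a file with plain awk, purely as an illustration. It assumes strict 4-line FASTQ records, and the function name and demo reads are made up; reformat.sh remains the tool to use on real data.

```shell
# Split an interleaved FASTQ (R1 record, R2 record, R1 record, ...) into two files.
# Illustration only: assumes every record is exactly 4 lines.
deinterleave() {
  # $1 = interleaved input, $2 = read 1 output, $3 = read 2 output
  : > "$2"
  : > "$3"
  awk -v o1="$2" -v o2="$3" '{
    # Lines 1-4 of each 8-line block belong to read 1, lines 5-8 to read 2.
    if ((NR - 1) % 8 < 4) print > o1; else print > o2
  }' "$1"
}

# Tiny demo with two hypothetical read pairs.
demo_dir="$(mktemp -d)"
printf '%s\n' \
  '@pair1/1' 'ACGT' '+' 'IIII' \
  '@pair1/2' 'TTGA' '+' 'IIII' \
  '@pair2/1' 'GGCA' '+' 'IIII' \
  '@pair2/2' 'CCAT' '+' 'IIII' > "$demo_dir/interleaved.fq"

deinterleave "$demo_dir/interleaved.fq" "$demo_dir/R1.fq" "$demo_dir/R2.fq"
```

After the demo runs, R1.fq holds the two "/1" records and R2.fq the two "/2" records, in the same order.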

      Sorry for the inconvenience! I'll try to fix that by the next release, though unlike documenting the "qin" flag, this will take more work so no guarantees. Thanks for bringing it to my attention. FYI, the flag "interleaved" has no effect on output, only input.

      -Brian
      Last edited by Brian Bushnell; 06-23-2014, 06:05 PM.

      • muol
        Member
        • Jun 2012
        • 10

        #18
        Thanks for the info Brian, it wasn't a big issue.

        Olaf

        • Brian Bushnell
          Super Moderator
          • Jan 2014
          • 2709

          #19
          Olaf,

          This has been fixed in the latest release, 33.04.

          • muol
            Member
            • Jun 2012
            • 10

            #20
            Excellent, just did a test run. This is very useful software!

            Olaf

            • sdmoore
              Member
              • Jun 2014
              • 12

              #21
              Possible to add Read Group in BBmap header?

              *sorry, probably wrong thread, I found more activity in the release announcement thread*

              Hello,
              I used BBDuk to process my read pairs and then mapped them with BBMap, converted the SAM output to BAM, and sorted it.
              I plan to use an alternative to mpileup to process this set (for comparison of the outputs), so I am trying to use GATK tools.

              When I run a GATK tool, it reports an error that the read group is not found in the header. With other mappers, this is an option (like -R for BWA). I found a method to manually add read group information to the header (such as here), but I have limited Linux skills and get errors when trying that approach (command "header" not found). I am also concerned that if I put in the wrong RG info, I may pooch a downstream tool.

              Is there a way to make the BBmap output compatible with GATK?
              Last edited by sdmoore; 07-05-2014, 09:34 AM. Reason: wrong thread?

              • Brian Bushnell
                Super Moderator
                • Jan 2014
                • 2709

                #22
                sdmoore,

                BBMap does not have an option for setting the readgroup, since I never encountered a situation where I needed it. But if it's useful, I can add it to the next release. The solution in your linked thread looks reasonable and I'm not sure why it didn't work for you; I will let you know if I find a better solution.

                • dpryan
                  Devon Ryan
                  • Jul 2011
                  • 3478

                  #23
                  @sdmoore: You actually just want AddOrReplaceReadGroups from Picard tools. The command I expect you were going for is "samtools reheader", though that won't really do what you want since read group information is also added to each alignment.
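A minimal sketch of the Picard approach, written as a dry run that only prints the command. The file names, the single sample label reused for every read-group field, and the location of picard.jar are assumptions for illustration; older Picard releases shipped a separate AddOrReplaceReadGroups.jar instead of one combined jar.

```shell
# Print (not run) a Picard AddOrReplaceReadGroups command.
# Remove the leading "echo" to actually execute it.
add_read_groups() {
  # $1 = input BAM, $2 = output BAM, $3 = label reused for all read-group fields
  echo java -jar picard.jar AddOrReplaceReadGroups \
    I="$1" O="$2" RGID="$3" RGLB="$3" RGPL=illumina RGPU="$3" RGSM="$3"
}

# Hypothetical file names and sample label.
add_read_groups mapped.sorted.bam mapped.sorted.rg.bam sample1

# Once the real command has run, the header can be checked with:
#   samtools view -H mapped.sorted.rg.bam | grep '^@RG'
```

Picard writes the @RG header line and tags each alignment with the matching RG field, which is why editing the header alone is not enough.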

                  @Brian: It would be great if you could add read group support. That'll be needed by anyone doing SNP calling.

                  • sdmoore
                    Member
                    • Jun 2014
                    • 12

                    #24
                    Thanks Brian and dpryan.
                    I had to give up on BBMap for now, though not because of this problem (I found the AddOrReplaceReadGroups tool later and edited the SAMs or the BAMs). Rather, the VCF that mpileup produced from the BBMap alignments was "all over the place" (and took forever to process too). I don't know how else to describe it: large insert calls for a bunch of positions. Viewing the file was no help (tons of inserts/asterisks displayed). Same mess from FreeBayes. BWA-MEM and Bowtie2 alignments don't show this, and I can easily identify known errors in the reference file with either mpileup or FreeBayes. The alignment looked more like what I got from CUSHAW2 (which I also dropped). We are at a stage now where we will Sanger sequence a few loci to clear things up (e.g., BWA never shows a collection of mutations that Bowtie2 does). I was hoping to have a third aligner "take sides", but I think it's faster for us to sequence and be sure.

                    • Brian Bushnell
                      Super Moderator
                      • Jan 2014
                      • 2709

                      #25
                      sdmoore,

                      By default, BBMap will look for much longer indels than BWA/Bowtie2, over 16000bp. You can limit this with the maxindel flag (e.g. "maxindel=40"). Soft-clipping (via "local" flag) can also reduce erroneous variation calls from chimeric or low-quality reads.
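The two flags mentioned above could be combined in one call, sketched here as a dry run that only prints the command. The reference and read file names are placeholders, and maxindel=40 is just the example value from the post, not a recommendation for every dataset.

```shell
# Print (not run) a BBMap command that caps indel length and enables soft-clipping.
# Remove the leading "echo" to actually execute it.
map_conservative() {
  # $1 = reference FASTA, $2 = read 1, $3 = read 2, $4 = output SAM
  echo bbmap.sh ref="$1" in="$2" in2="$3" out="$4" maxindel=40 local=t
}

# Hypothetical file names.
map_conservative ref.fa R1.fastq.gz R2.fastq.gz mapped.sam
```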
