  • ester
    Member
    • Jun 2008
    • 11

    Splitting NuGen barcodes from paired-end sequences

    Hi all,

    Does anybody know of software for splitting NuGen barcodes that supports PAIRED-END reads?

    Thanks,

    Ester
  • axgraf
    Junior Member
    • Apr 2011
    • 7

    #2
    Hi Ester,
    I wrote a small program for that issue.

    The tool filters the reads by searching for the barcode in the first read only. If the barcode is found, it is removed and the read is written to the output. Note that the input files must be in order.
    Small example:

    java -Xmx4g -jar DemultiplexNUGEN.jar -i1 laneX_1.fastq laneY_1.fastq ... -i2 laneX_2.fastq laneY_2.fastq ... -b ATTG -o1 ATTG_demultiplex_1.fastq -o2 ATTG_demultiplex_2.fastq -s

    Hope I could help. Please keep me informed if it works.

    Alex
    Last edited by axgraf; 08-10-2011, 06:18 AM.
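
The filtering step described in this post can be sketched in a few lines. This is a minimal Python sketch, not the DemultiplexNUGEN source: it assumes the barcode sits at the very start of read 1 and must match exactly, and that the records arrive already paired as (header, sequence, plus line, qualities) tuples. The name demultiplex_pairs is made up for illustration.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: (header, sequence, plus line, quality string).
Record = Tuple[str, str, str, str]


def demultiplex_pairs(
    pairs: Iterable[Tuple[Record, Record]], barcode: str
) -> Iterator[Tuple[Record, Record]]:
    """Keep pairs whose read 1 starts with `barcode` and trim it from read 1.

    Assumption: the barcode is at the 5' end of read 1 and matches exactly.
    """
    n = len(barcode)
    for (head1, seq1, plus1, qual1), mate in pairs:
        if seq1.startswith(barcode):
            # Trim sequence and qualities together so their lengths stay equal.
            yield (head1, seq1[n:], plus1, qual1[n:]), mate
```

Read 2 is passed through unchanged, which is why both input files have to list the reads in the same order.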


    • ester
      Member
      • Jun 2008
      • 11

      #3
      Splitting NuGen barcodes from paired-end sequences

      Hi Alex,

      Thanks for your help.

      I tried to run your program with the following command:

      java -Xmx4g -jar DemultiplexNUGEN.jar -i1 s_7_1_sequence.txt -i2 s_7_2_sequence.txt -b ACCC -o1 test.1
      -o2 test.2 -s

      and got the following error:

      de.genzentrum.lafuga.NotFastqFormatException: Read1 has not the same identifier as read2
      at de.genzentrum.lafuga.trimmer.Demultiplex.iterateFastqPairedEnd(Demultiplex.java:96)
      at de.genzentrum.lafuga.main.MainPairedEnd.main(MainPairedEnd.java:70)

      The input files:

      >head s_7_1_sequence.txt
      @HWI-ST611_0176:7:1:1226:2054#0/1
      NGTACTCGTCCACGTCGTTCTCAGAGAGAATATTCTCTCTCCACACATCAGCAGTTAAGGAGGATGTGAAGACAATCTTTTCAACACTATCGGTCTGAGC
      +HWI-ST611_0176:7:1:1226:2054#0/1
      BYWYW[ZZZZcccccc_cccccccccccccc_ccccccccccccccccc\ccc_ccc\cccc_\cccccVccac______YUcUc\^^^\^^XZ^[X\\\
      @HWI-ST611_0176:7:1:1161:2111#0/1
      GAGTAGGCCACGCNTTCACGGTTCGTATTCGTGCTGGAAATCAGAATCAAACGAGCTTTTACCCTTTTGTTCCACACGAGATTTCTGTTCTCGTTGAGCT
      +HWI-ST611_0176:7:1:1161:2111#0/1
      gggeggggggcccBccccccggggfdgeggdbdddgfgfgdgggggeefgegeggbeegedea[gfedaagZeed]]bb`eedfegXgggabaddYaeca
      @HWI-ST611_0176:7:1:1197:2111#0/1
      GAGCCGCCCGCTCTCTGCTTTCCAAGCCTTTGCGATCTGCTTAAGCAGCTTTGACACCAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTC
      arkady Melon_2011/data> head s_7_2_sequence.txt
      @HWI-ST611_0176:7:1:1226:2054#0/2
      CAAATGGTGGATTTGGAGGTTAGAGGAACAATTAATGTCGTCGAGGCTTGTGCTCAGACCGATAGTGTTGAAAAGATTGTCTTCACATCCTCCTTAACTG
      +HWI-ST611_0176:7:1:1226:2054#0/2
      gggggggegggggggggggggggdgggggggggggggggggggggggge^cd`cddfeeffbe`d`dddd]eee_XddacaddW[aca`cadcbeMdcbT
      @HWI-ST611_0176:7:1:1161:2111#0/2
      GGTGGGCCGATCCGGGCGGAAGACATTGTCAGGTGGGGAGTTTGGCTGGGGGCGGCACATCTGTTAAAAGATAACGCAGGTGTTCTAAGATGAGCTCAAC
      +HWI-ST611_0176:7:1:1161:2111#0/2
      fhdgbgggddfefffegfggfbggggddegeea^eedd^deeebecee^cadUXd\TV]`a[]bdfeeda\VadaabcdcK^V\E]U[TY]Ybbbdb[d\
      @HWI-ST611_0176:7:1:1197:2111#0/2
      GGTGTCAAAGCTGCTTAAGCAGATCGCAAAGGCTTGGAAAGCAGAGAGCGGGCGGCTCAGATCGGAAGGGCGTCGTGTAGGGAAAGAGGGGAGATTTCGG


      Can you help with this?

      Thanks again,
      Ester


      • axgraf
        Junior Member
        • Apr 2011
        • 7

        #4
        Hi Ester,
        The tool compares the identifiers of the two reads and stopped because the names weren't identical.
        I missed the fact that paired-end reads can have
        a "/1" and "/2" at the end of the identifier; these suffixes aren't present in our reads.

        I changed the code so that it should work for your files.

        Alex
        Attached Files
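
The identifier check described in this post could look roughly like the following. This is a minimal Python sketch (the tool itself is Java, and its actual fix is not shown in the thread). It assumes the only difference between the two mates is a trailing "/1" or "/2"; pair_key and same_fragment are hypothetical names.

```python
def pair_key(header: str) -> str:
    """Return the read identifier without a trailing /1 or /2 mate suffix."""
    name = header.strip()
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def same_fragment(header1: str, header2: str) -> bool:
    """True if both headers name the same fragment once the suffix is ignored."""
    return pair_key(header1) == pair_key(header2)
```

With the headers from the files above, "@HWI-ST611_0176:7:1:1226:2054#0/1" and "@HWI-ST611_0176:7:1:1226:2054#0/2" reduce to the same key, while headers without a suffix are compared unchanged.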


        • ester
          Member
          • Jun 2008
          • 11

          #5
          Hi Alex,

          Still having problems:


          java.lang.NullPointerException
          at java.io.File.<init>(Unknown Source)
          at de.genzentrum.lafuga.trimmer.Demultiplex.iterateFastqPairedEnd(Demultiplex.java:74)
          at de.genzentrum.lafuga.main.MainPairedEnd.main(MainPairedEnd.java:70)


          Thanks,

          Ester


          • axgraf
            Junior Member
            • Apr 2011
            • 7

            #6
            Did you use the same parameters as in your last post?
            It seems to me that the -o2 switch was not set.

            If I use the same parameters and the same sequences as in your last post, I can run it successfully.

            If you copied the command from your last post, the second line
            ("-o2 test.2 -s") may have been left out.

            That could have caused the exception.

            Otherwise I need the exact parameters you used.

            Alex
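
One way to guard against this kind of mistake is to check the required switches before any file is opened. The sketch below uses Python's argparse and reuses the switch names from the commands in this thread. Treating -s as a plain on/off flag is an assumption, since the thread does not say what it does, and this is not how DemultiplexNUGEN parses its arguments.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line parser that refuses to run when an output switch is missing."""
    parser = argparse.ArgumentParser(prog="demultiplex_sketch")
    parser.add_argument("-i1", nargs="+", required=True, help="read 1 FASTQ file(s)")
    parser.add_argument("-i2", nargs="+", required=True, help="read 2 FASTQ file(s)")
    parser.add_argument("-b", required=True, help="barcode sequence")
    parser.add_argument("-o1", required=True, help="output file for read 1")
    parser.add_argument("-o2", required=True, help="output file for read 2")
    # Assumption: -s is a simple flag; its meaning is not documented in the thread.
    parser.add_argument("-s", action="store_true")
    return parser
```

If -o2 is left out, parse_args() exits with a usage message that names the missing switch, instead of failing later when a null path reaches the file-opening code.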


            • ester
              Member
              • Jun 2008
              • 11

              #7
              Hi Alex,

              You are right. It was my mistake.
              The program ran, but the output file is missing the read name after the +:

              arkady Melon_2011/data> more test.1
              @HWI-ST611_0176:7:1:2764:2469#0/1
              AGGAGTCCGGTATTGTTATTTATTGTCACTGCCTCCCCGTGTCAGGATTGGGTAGATCGGAAGAGCGGTTCTGCAGGAATGCCGAGACCGATACCG
              +
              gggfggggggdggggggggggggggggegeTedcdeggdfgccZegada`ecXabZX_``\`bMYY`aM^\ZX[S^dabXbBBBBBBBBBBBBBBB
              @HWI-ST611_0176:7:1:5412:2350#0/1
              CCGGGTGACGGAGAATTAGGGTTCGATTCCGGAGAGGGAGCCTGAGAAACGGCTACCACATCCAAGGAAGGCAGCAGGCGCGCAAATTACCCAATC
              +
              gggfg_gegggggegggdggfggggegggggeggaggd\eefcdbdd[edd`ddeX\\aBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

              Thanks again,

              Ester


              • eslondon
                Member
                • Jul 2009
                • 21

                #8
                You can also use novobarcode, part of Novoalign, to split reads in "buckets" based on barcodes.
                --------------------------------------
                Elia Stupka
                Co-Director and Head of Unit
                Center for Translational Genomics and Bioinformatics
                San Raffaele Scientific Institute
                Via Olgettina 58
                20132 Milano
                Italy
                ---------------------------------------


                • axgraf
                  Junior Member
                  • Apr 2011
                  • 7

                  #9
                  You are right.
                  Sorry for that. Until now this tool has only been used here at our institute.
                  I changed it.
                  Hope everything is fine now.

                  Alex
                  Attached Files


                  • ester
                    Member
                    • Jun 2008
                    • 11

                    #10
                    Now it works fine.
                    Thanks a lot,
                    Ester


                    • senpeng
                      Member
                      • Sep 2011
                      • 10

                      #11
                      Originally posted by axgraf:
                      Hi Ester,
                      The tool compares the identifiers of the two reads and stopped because the names weren't identical.
                      I missed the fact that paired-end reads can have
                      a "/1" and "/2" at the end of the identifier; these suffixes aren't present in our reads.

                      I changed the code so that it should work for your files.

                      Alex
                      Dear Alex,
                      we ran into a similar problem. Our input is in CASAVA 1.8 format, which differs slightly from the earlier one (in the position of the "1" and "2").
                      Our input looks like this:
                      @HWUSI-EAS174:6:FC:1:1:1153:945 1:Y:0:
                      GGGAGGTCGAGGCTGTAGTGAGCTGGGATCGTACCATTTCTCTCATTACGAGATCGGAAGAGCGTGGTGTTGGGACTGAGTGTAGATCTCGGTGGGCGGC
                      +
                      25+=70.6;1@@;,;A?=?:19)7;*+++5+?=;+.7;<)3>61*?=;:=BD?B@?222=?8+BB###################################
                      @HWUSI-EAS174:6:FC:1:1:1288:931 1:Y:0:
                      GAGGTCGGCTTGGAGTCAGAAAGCTCGGGGCATTGTCTCAGGTCTGTTGCTTCCTAGGAGTGTGAACGATGAGGAAGTTCCTGCATCGCTGAGGACTCAG
                      +
                      ?+@=6;2;@==B;54;=;=+:785+--/77B?B?D#################################################################
                      @HWUSI-EAS174:6:FC:1:1:1305:938 1:Y:0:
                      GGGTTCGCTCGGTGAACTGCACGCCCTTTGAAATGTCTCCTCTCGATTTGGGTGTTTTACTTGATTTTTCTTATATCTTACATCTTTTCTTTAGTCTGTC
                      +
                      ####################################################################################################
                      @HWUSI-EAS174:6:FC:1:1:1528:951 1:Y:0:
                      CGCAAGGACAAAAAACCAAATACTGCATGTTCTCAATCATAGGTGGGAATTGAACAATGAGAACACAGGGACACAGGAACACTCAGATCGGAAGAGCGTC
                      +
                      IIIHIHBBIGIIIIIIIIHHBIHIIIIEBIGIIIIDGGBGGGGDGGADGEIIIIGDGEGIHFGHI<IHDGE@HHBIFF@FIFBHHIEG@@HEDDEE>B>3
                      @HWUSI-EAS174:6:FC:1:1:1551:943 1:Y:0:
                      GGAGGCTGCTTTTAGGCCTACTATGGGTGTTAAATTTTTTACTCTCTCAAACACCGGGCAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCG
                      +
                      ED=D4FEEE?8BB@B4FBFEE4BFD/:0:4B?;8B*45402921;+86=4CCE?EDDB+DA<ACAD@<GB<0><6?>:4??C>1?###############
                      @HWUSI-EAS174:6:FC:1:1:1588:935 1:Y:0:
                      CCGTGATAGTTTTTAGGTGTTAGACACCCCACCTTAAGCTTGTACCTGAAAGCTTTATCTCGTTATAAATAATTCACTGTAATTTAGGGGAGGTATGTCC
                      +
                      2+85::1:77)::1:=+9=@@32,@=3<;99@@@F=@4B8B?7C:B?CAB=??8E734282B==77241@##############################

                      So when I run the Java program, it still shows "Read 1 has not the same identifier as read 2". Could you please help me solve that?

                      Thanks so much
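
In CASAVA 1.8 headers like the ones above, the mate number sits in a separate field after a space rather than in a "/1" or "/2" suffix, so one possible approach is to compare only the part before the first whitespace. A minimal Python sketch under that assumption follows; fragment_id is a hypothetical helper, not part of DemultiplexNUGEN.

```python
def fragment_id(header: str) -> str:
    """Return the part of a FASTQ header that is shared by both mates.

    Handles CASAVA 1.8 headers ("@name 1:Y:0:") by dropping everything after
    the first whitespace, and older headers ("@name/1") by dropping the suffix.
    """
    fields = header.split()
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name
```

For example, "@HWUSI-EAS174:6:FC:1:1:1153:945 1:Y:0:" and the matching read 2 header ending in "2:Y:0:" both reduce to "@HWUSI-EAS174:6:FC:1:1:1153:945", so a comparison on fragment_id would treat them as a pair.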
