
  • Error while mapping paired-end reads in MAQ

    Hi,

    This is my first post.
    I am running maq map on paired-end SOLiD reads, and I've gone through all the initial steps of building the .bfq files and the ref.csbfa sequence.

    When I run the command:
    maq map -c aln.cs.map ref.csbfa in.read1.bfq in.read2.bfq 2>aln.log

    I always get the following error:
    maq: read.cc:61: longreads_t* ma_load_reads(void*, int, void*, int): Assertion `strncmp(name, lr->name[j], tl-1) == 0' failed.

    Could someone please tell me what this means? Which strings are being compared? I checked the read names in both the read1 and read2 fastq files, and they match, ending in /1 and /2 respectively for each read pair.

    Also, the read lengths are the same in both files.

    Any help will be appreciated.

    Thanks,
    N
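
Judging only from the expression in the assertion, `strncmp(name, lr->name[j], tl-1)` appears to compare the two mate names over everything except the final character. The sketch below applies the same kind of check directly to the two FASTQ files. `check_mate_names` is a hypothetical helper, not part of MAQ, and it assumes plain 4-line FASTQ records in both files.

```shell
# Hypothetical helper: report the first record whose mate names differ in
# anything other than the final character (e.g. ".../1" vs ".../2").
# Assumes plain 4-line FASTQ records in both files.
check_mate_names() {
    awk -v mates="$2" '
        NR % 4 == 1 {
            rec = (NR + 3) / 4
            if ((getline mate < mates) <= 0) {
                printf "record %d: mate file ended early\n", rec
                bad = 1
                exit
            }
            # Skip the sequence, "+" and quality lines of the mate record.
            for (i = 0; i < 3; i++) getline skipped < mates
            if (substr($0, 1, length($0) - 1) != substr(mate, 1, length(mate) - 1)) {
                printf "record %d: %s vs %s\n", rec, $0, mate
                bad = 1
                exit
            }
        }
        END { exit bad }
    ' "$1"
}
```

Usage would be something like `check_mate_names read1.fastq read2.fastq`, where the file names are placeholders. It prints nothing and returns 0 when every pair of names lines up.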

  • #2
    maq + solid

    yeah, i'm getting the same error. i've not been able to figure it out yet but in the meantime i ran single ends and got 70% of the reads mappable. did you figure it out? i'd be interested in hearing about your problems.

    btw, to introduce myself, i'm der_eiskern and at the moment i'm doing whole genome sequencing with both SOLiD and Illumina platforms.

    cheers.


    • #3
      Originally posted by der_eiskern
      yeah, i'm getting the same error. i've not been able to figure it out yet but in the meantime i ran single ends and got 70% of the reads mappable. did you figure it out? i'd be interested in hearing about your problems.

      btw, to introduce myself, i'm der_eiskern and at the moment i'm doing whole genome sequencing with both SOLiD and Illumina platforms.

      cheers.
      Did you check the length of the quality strings? MAQ's conversion script can output a quality string that is too long if there are "-1" qualities. Fixing that solved the problem for me.


      • #4
        The quality strings in my fastq files are 50 characters long and my read length is 50 bp. I didn't generate the files myself; my task is just to run them. I haven't written a bash script to check the length of every single read, though. Would a single aberrant length stop maq completely before it begins?

        Any other ideas, nils? I'm scratching my head because maq still gives me the error suggesting the two reads are of different lengths. I'm hoping that my 70% mappability will increase once I get paired-end assignment working.

        thanks.
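
Checking every record rather than a sample does not need much code. The following is a minimal sketch, assuming plain 4-line FASTQ records in which the sequence and quality lines are expected to have the same length. `fastq_length_mismatches` is a made-up name.

```shell
# Print every FASTQ record whose quality string length differs from its
# sequence length. Assumes plain 4-line records (no wrapped sequences).
fastq_length_mismatches() {
    awk '
        NR % 4 == 1 { name = $0 }
        NR % 4 == 2 { seqlen = length($0) }
        NR % 4 == 0 && length($0) != seqlen {
            printf "%s\tseq=%d\tqual=%d\n", name, seqlen, length($0)
        }
    ' "$1"
}
```

Running `fastq_length_mismatches in.read1.fastq | head` (the file name is a placeholder) lists the first offending reads. Empty output means every record is consistent.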


        • #5
          hi der_eiskern,

          Yeah, repeating what nilshomer mentioned ...

          Yes, I figured out what the problem is. I'm assuming it would be the same problem for you.

          The *.qual files containing the qualities for both the F3 and R3 reads have negative values, mainly -1. When solid2fastq.pl creates the fastq files, it does not handle these negative values correctly (it treats the "-" and the "1" as separate entities), so the quality string ends up longer than the read string.

          You would have to change the script a bit to handle this problem.

          hope this helps.

          N
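
To illustrate the kind of change involved, here is a sketch of a conversion that treats each whitespace-separated value as one number, so "-1" yields a single character. It is not the actual solid2fastq.pl code. `qual_to_sanger` is a hypothetical helper, and replacing negative values with 1 is an assumption that simply mirrors the sed substitution suggested later in this thread.

```shell
# Convert lines of space-separated integer qualities (read from stdin) to
# Sanger (Phred+33) strings, one character per value. Negative values such
# as -1 are replaced with 1; that choice is an assumption, not MAQ's rule.
qual_to_sanger() {
    awk '
        {
            out = ""
            for (i = 1; i <= NF; i++) {
                q = $i + 0
                if (q < 0) q = 1
                out = out sprintf("%c", q + 33)
            }
            print out
        }
    '
}
```

For example, `printf '20 -1 30\n' | qual_to_sanger` prints `5"?`: three characters for three quality values.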


          • #6
            Originally posted by nisha
            hi der_eiskern,

            Yeah, repeating what nilshomer mentioned ...

            Yes, I figured out what the problem is. I'm assuming it would be the same problem for you.

            The *.qual files containing the qualities for both the F3 and R3 reads have negative values, mainly -1. When solid2fastq.pl creates the fastq files, it does not handle these negative values correctly (it treats the "-" and the "1" as separate entities), so the quality string ends up longer than the read string.

            You would have to change the script a bit to handle this problem.

            hope this helps.

            N
            This is what I mentioned above. I have emailed Heng Li (MAQ's author) about the problem; the fix should be a one-liner in his code.


            • #7
              Thanks. This is part of the output from printing every quality string:

              (*)&-"-"-",&&'&''*)',&&,-"1&+&))))1-"&&)/-")&',&(&-)-"-"'
              <;>=-"-"-"7=><1=>?:<>;>A-"=;<88=:=6-"=0:<-">8?(;9;8,-"-",
              5&,;-"-"-":&/,/(($-8,)5/-"1)((&+&,'-")$($-"/&0''(/28-"-"$
              $##&-"-"-"%&'##'###%%###-"#1####&##-"##$#-"$#####$#$-"-"#
              /,&&-"-"-",5<,*,<&1)',+/-",5,/&'/),-"//,4-"&&)7+&),)-"-"&
              $/&&-"-"-"81/@*,>),)3)(,-"<(>/'/)-",&-"&,<1).&&8-"-"2
              :?>8-"-"-">?8;0>6;:.>9=6-";98>/6$%9-")%47-"+#1;/)'.7-"-".
              =<A?-"-"-"==><<0>1;@89>=-"579;A==>3-"1<79-":<=37)55;-"-"<
              8;<<-"-"-":;8<86526<8,;<-"891:76,,9-"7037-"5.+:;1;65-"-"9
              <::A-"-"-"9A@:9<<><8==@5-"8;;5:89;6-"<05:-"9<2=)8>68-"-"=
              #&$#-"-"-"$''%##$'$'$-(&-"$&&#'#-#'-"#%&.-"$''#%%&(%-"-"*

              Yep, there are all those improperly translated "-1" values. Is there a way I can correct these files without reconverting everything to fastq?

              nisha, what was your way around this if you didn't change the script?

              thanks.


              • #8
                Originally posted by der_eiskern
                Yep, there are all those improperly translated "-1" values. Is there a way I can correct these files without reconverting everything to fastq?

                nisha, what was your way around this if you didn't change the script?

                thanks.
                If it is in the .bfq format, you will have to convert it back to fastq (since the .bfq is gzip compressed).

                You can always modify the input "qual" files using "sed":

                Code:
                sed -i 's_-1_1_g' <QV file>


                • #9
                  Originally posted by nilshomer
                  If it is in the .bfq format, you will have to convert it back to fastq (since the .bfq is gzip compressed).

                  You can always modify the input "qual" files using "sed":

                  Code:
                  sed -i 's_-1_1_g' <QV file>
                  Thanks! Unfortunately, we don't have the original qual files. Can I apply this command to the fastq files I have, or are they beyond help?


                  • #10
                    Originally posted by der_eiskern
                    Thanks! Unfortunately, we don't have the original qual files. Can I apply this command to the fastq files I have, or are they beyond help?
                    You can try to modify the fastq files. The only problem is that a -1 quality ends up as -" in the fastq file, and both - and " are also valid Sanger ASCII quality characters on their own (I believe). So some fraction of the time, -" will come not from one -1 quality but from two independent qualities. That makes it fairly tricky, unless you match up the -1 qualities with the missing colors (which is usually where they occur). This is starting to sound like a lot of work!

                    Did you delete the original qual files? How did you get the fastq file in the first place?
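
One heuristic that avoids some of the guesswork is to use the length excess itself. This is a sketch under stated assumptions, not something tested on real SOLiD data. Each mis-split -1 adds exactly one extra character, so if a quality line is longer than its sequence line by exactly the number of -" pairs it contains, every pair must be an artifact and can be collapsed to a single " (the character for quality 1, matching the sed substitution above). Lines where the counts disagree stay ambiguous and are left alone. `repair_fastq_quals` is a hypothetical helper that assumes plain 4-line FASTQ records with intact sequence lines.

```shell
# Write a repaired copy of a 4-line FASTQ file to stdout. A quality line is
# repaired only when its excess length equals the number of -" pairs it
# contains, so every pair must have come from a mis-split -1. Other
# over-long lines are left unchanged and reported on stderr.
repair_fastq_quals() {
    awk '
        NR % 4 == 1 { name = $0 }
        NR % 4 == 2 { seqlen = length($0) }
        NR % 4 == 0 && length($0) > seqlen {
            fixed = $0
            pairs = gsub(/-"/, "\"", fixed)
            if (pairs == length($0) - seqlen) {
                $0 = fixed
            } else {
                print "ambiguous: " name > "/dev/stderr"
            }
        }
        { print }
    ' "$1"
}
```

A run might look like `repair_fastq_quals in.read1.fastq > fixed.read1.fastq 2> ambiguous.txt`, where all file names are placeholders. Reads listed in the second file would still need to be matched against the missing colors by hand or dropped.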


                    • #11
                      Originally posted by nilshomer
                      So some fraction of the time, -" will come not from one -1 quality but from two independent qualities. That makes it fairly tricky, unless you match up the -1 qualities with the missing colors (which is usually where they occur). This is starting to sound like a lot of work!

                      Did you delete the original qual files? How did you get the fastq file in the first place?
                      Our SOLiD data came from offsite. They did their own SNP calling using the Corona Lite pipeline and gave us the converted qual files in fastq format for us to run MAQ on. Email communication has been slow... it's looking like I'll have to pay them a visit to get all this straightened out.


                      • #12
                        Originally posted by der_eiskern
                        Our SOLiD data came from offsite. They did their own SNP calling using the Corona Lite pipeline and gave us the converted qual files in fastq format for us to run MAQ on. Email communication has been slow... it's looking like I'll have to pay them a visit to get all this straightened out.
                        I am a bit confused. So they aligned the reads and made variant calls using corona-lite, and then gave you the raw color data (fastq)? Why don't they just give you the variant calls and alignments?

                        I would definitely ask for the *csfasta and *qual files in this case and do your own alignment and SNP calling...


                        • #13
                          Originally posted by nilshomer
                          I am a bit confused. So they aligned the reads and made variant calls using corona-lite, and then gave you the raw color data (fastq)? Why don't they just give you the variant calls and alignments?

                          I would definitely ask for the *csfasta and *qual files in this case and do your own alignment and SNP calling...
                          yeah, that's what i've been trying to do with MAQ and have been rather successful with the "homozygous" calls (using the flawed data they gave us) but not so much for the hets. i'm going to have to redo all of it though. thanks again for your help, nils.
