  • paired end reads don't have the same length

    Dear all,

    My genomes were sequenced on an Illumina HiSeq in paired-end mode. I was running cortex, but it died at "SAM line1", saying that the list of paired-end reads don't have the same length, and quit. How should I process the data?

    Thank you!
    Last edited by fnn4; 08-08-2014, 05:36 AM.

  • #2
    You should re-obtain the raw read files that have the correct number of reads, and start with those. Then, you'll have to clarify what you are trying to do next.



    • #3
      Originally posted by Brian Bushnell
      You should re-obtain the raw read files that have the correct number of reads, and start with those. Then, you'll have to clarify what you are trying to do next.
      Thank you for replying!
      I am already using the raw read files, though. How can I make them the same length?
      Thank you!!



      • #4
        Can you post the head, tail, and word counts of the two files?

        For example, the output of these commands:

        head -n 8 read1.fastq
        tail -n 8 read1.fastq
        wc read1.fastq

        head -n 8 read2.fastq
        tail -n 8 read2.fastq
        wc read2.fastq



        • #5
          Brian: It sounds like some of the reads are not the same length, even if the total number of reads is the same. Perhaps "cortex" (not sure what that is) requires them to be.
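One way to test whether individual reads differ in length is to tally the sequence lengths in each file. The following is a minimal Python sketch, not part of the original thread. It assumes standard 4-line FASTQ records (no wrapped sequences), and the function names `open_fastq` and `read_length_counts` are made up for illustration.

```python
import gzip
from collections import Counter


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_length_counts(path):
    """Return a Counter mapping read length -> number of reads with that length.

    Assumes each record occupies exactly four lines:
    header, sequence, '+', quality.
    """
    counts = Counter()
    with open_fastq(path) as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line (index 1) of every 4-line record.
            if line_number % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    return counts
```

Calling `read_length_counts("E_F.fastq")` and `read_length_counts("E_R.fastq")` and comparing the results would show whether either file contains reads of more than one length. A result with a single key means every read in that file has the same length.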



          • #6
            The file sizes are the same when the files are uncompressed FASTQ, but not when they are gzipped. Even when I use the uncompressed FASTQ files, cortex still quits, saying the paired-end reads don't have the same length and dying at SAM line 2.
            Below are the head, tail, and wc output for one sample.
            head -n 8 E_F.fastq
            @HWI-ST1360:45:C1JK9ACXX:1:1101:3383:2427 1:N:0:TGACCA
            TCAATTCGACTGGGTACGACCACGTAAGACATGGTTAGATGCAAAATGTCCAGTGTATATTGATTTTGGTGACGAACACCTTGTAAAACTAGATACATATG
            +
            BBBFFFFFFFFFFIFFIIIIIIIIFIIIIIIIIIFIIIIIIIIIIIIIFIIFIFFFFIIIIIIIIIIFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFF
            @HWI-ST1360:45:C1JK9ACXX:1:1101:3303:2477 1:N:0:TGACCA
            AGAGATAACCAACAGTTCGCGTTTGAATATCAAGACAATGAAGGCCAAACACAATCTAAAGCATTAACACTCAAAGTAGGCGATAGCCTTGAAGAAGTGGC
            +
            BBBFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB

            tail -n 8 E_F.fastq
            @HWI-ST1360:45:C1JK9ACXX:1:2316:20760:100871 1:N:0:GGACCA
            TGTCATATCCGCAAAGCTCTGAAATGACATTATTTTCAATAGCCTCAACGGCTAATTGGTTGGCGCGGGCAGTATTGATACAAATACGATGCCCTTGCTCG
            +
            BB<FFFFFFFFFFBFFIIFFFFFFIIFIFFFFFFIB<FBFFFFIFFBBFFFF<FBFFFIIIIFBBBFFBB<<'0<BBFFFFBBBBBFB<BFBBBBBB<BBF
            @HWI-ST1360:45:C1JK9ACXX:1:2316:21212:100876 1:N:0:TGACCA
            TTTTCTAAAGCGATACAACCTTACAGGTAAACTGGAAGCCGGTTTGTATCACTTGTCCGTTGTATCTAAAGATAAACAAGTTTACAGAACTTGGGTGATAC
            +
            7<B<B<00<B0<<BFFFB00<BFBBB'0<0BF<07BFB0'<77BF<7BB'<BBBB70<7BBFBBFBB<BB<BBBBBB<<B<<<<7'0<<<''070077<<'

            wc E_F.fastq
            16531308 20664135 1084831013 E_F.fastq
            wc E_R.fastq
            16531308 20664135 1084831013 E_R.fastq

            Thank you!!



            • #7
              The fact that both files have exactly the same number of lines and characters suggests that the input is probably fine, and that the problem is instead some kind of bug in cortex (which I have also never heard of) or an incorrect command line.

              What are you trying to do with the data?
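The wc output above reports 16531308 lines per file, which at four lines per record is 16531308 / 4 = 4132827 reads in each. Matching counts do not prove that the reads are in the same order, so a stricter check is to compare the read names record by record. The following is a minimal Python sketch, not from the original thread. It assumes uncompressed 4-line FASTQ and headers in the style shown in post #6, where the read name is the text before the first space. The function names `read_names` and `check_pairing` are made up for illustration.

```python
from itertools import zip_longest


def read_names(path):
    """Yield the read name of each record: header text after '@', up to the first space."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            # The header is the first line (index 0) of every 4-line record.
            if line_number % 4 == 0:
                fields = line[1:].split()
                yield fields[0] if fields else ""


def check_pairing(forward_path, reverse_path):
    """Compare read names of two FASTQ files in order.

    Returns (pairs_checked, first_mismatch). first_mismatch is None when all
    names agree and both files hold the same number of records; otherwise it
    is the (forward_name, reverse_name) tuple at the first disagreement, with
    None standing in for a file that ran out of records.
    """
    pairs_checked = 0
    names = zip_longest(read_names(forward_path), read_names(reverse_path))
    for forward_name, reverse_name in names:
        if forward_name != reverse_name:
            return pairs_checked, (forward_name, reverse_name)
        pairs_checked += 1
    return pairs_checked, None
```

For example, `check_pairing("E_F.fastq", "E_R.fastq")` returning `None` as the second element would support the conclusion that the input files are properly paired. Older headers that end in `/1` and `/2` would need that suffix stripped before comparing.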
