
  • #16
    I noticed the same problem when running pileup under SAMtools-0.1.15. However, the problem does not seem to occur when running pileup under SAMtools-0.1.4 (using the same reference file, the same BAM file, and the same command-line options).

    samtools-0.1.4/samtools pileup -s -f reference.fa sorted.bam > pileup.out



    • #17
      Originally posted by SMHfrog
      I had this same problem, and after seeing no solution here did some more digging, and have a possible solution for you.

      I noticed that the ref.fa.fai file for my whole genome was 0 kb. The .fai is used by samtools when building the pileup. When I ran the command to re-build the .fai:

      samtools faidx reference.fa

      I got the following error message:

      [fai_build_core] different line length in sequence 'scaffold_14'.
      Segmentation fault

      No doubt this same message occurred the first time I ran the pileup command (which also builds the .fai if it doesn't exist), but I apparently didn't pay attention. After that first time, the .fai file EXISTED so no errors were subsequently reported when I ran pileup again.

      In my case, there was an extra line after scaffold_14. I removed this, and re-built the .fai using the samtools faidx command and then re-ran the pileup command. My pileup then contained the reference base as intended!

      Hope this helps y'all find the solution to your problem.
      Best,
      Shannon
      University of Texas at Austin
      Hi all,
      I have the same error.
      samtools faidx bwa.ref/ref.fasta ref.fa

      ERROR:
      different line length in sequence 'scaffold_67'.
      Segmentation fault
      NOTE: I see NNNN in that scaffold. Does anyone have a suggestion?
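
The `different line length` error quoted above means that faidx found a sequence line whose length differs from the other lines of the same record; only the last line of a record may be shorter. A run of N characters is not a problem by itself as long as the line lengths stay consistent. The following is a minimal sketch for locating the offending line, assuming an uncompressed FASTA file. `check_fasta_lines` is a hypothetical helper name, not part of samtools, and samtools' own check may treat edge cases such as blank lines differently.

```shell
# check_fasta_lines FILE
# Hypothetical helper (not a samtools command): for each FASTA record, print
# the sequence name and the file line number of the first line that breaks
# the "all lines equal length, last line may be shorter" rule.
check_fasta_lines() {
    awk '
        BEGIN { width = -1 }
        /^>/ {
            name = substr($1, 2)   # sequence name: header text up to first whitespace
            width = -1             # line width not yet known for this record
            short_seen = 0         # no shorter-than-width line seen yet
            reported = 0           # nothing reported for this record yet
            next
        }
        reported { next }          # report only the first problem per record
        {
            len = length($0)
            # A line is inconsistent if it follows a shorter line in the same
            # record, or if it is longer than the first line of the record.
            if (short_seen || (width >= 0 && len > width)) {
                printf "%s\tline %d\n", name, NR
                reported = 1
                next
            }
            if (width < 0) width = len
            else if (len < width) short_seen = 1
        }
    ' "$1"
}
```

For example, `check_fasta_lines bwa.ref/ref.fasta` prints one tab-separated line per problem record; no output means the line lengths look consistent. After correcting the FASTA file, delete the old .fai and re-run `samtools faidx` so that a stale or empty index is not reused.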



      • #18
        Hi everybody,

        I am having similar problems with samtools 0.1.18. I would like to have reference characters listed in my pileup files, but I have problems with headers.

        samtools faidx AGSbrut.fasta
        samtools view -q 20 -buh -t AGSbrut.fasta.fai A.sam | samtools sort - A
        samtools view -q 20 -buh -t AGSbrut.fasta.fai S.sam | samtools sort - S
        samtools mpileup -B -f AGSbrut_index.fai A.bam S.bam > AS.mpileup

        [fai_build_core] different line length in sequence 'null'.
        Segmentation fault

        I hypothesized that this 'null' sequence may be a blank line; so I looked for it manually and with sed, with no luck. I also looked for other potential problems based on what was previously reported (no extra spaces, characters, etc in reference sequence names in fai and sam files). I also tried to re-head the file, with no success:

        samtools view -HS -t AGSbrut.fasta.fai A.sam > Aheader.sam
        samtools reheader Aheader.sam A.bam > Aheaded.bam

        [bam_header_read] EOF marker is absent. The input is probably truncated.

        All insights are welcome!
        thank you, eric
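
One thing worth checking, offered as a guess rather than a confirmed diagnosis: the mpileup command above passes `AGSbrut_index.fai` to `-f`, whereas faidx was run on `AGSbrut.fasta`. The `-f` option expects the FASTA file itself, and samtools then looks for the matching .fai next to it. If an index file is supplied instead, samtools may try to index it as if it were FASTA, which could explain a sequence named 'null'. A minimal sketch of a sanity check, where `looks_like_fasta` is a hypothetical helper name:

```shell
# looks_like_fasta FILE
# Hypothetical helper (not a samtools command): succeeds only if the first
# non-blank line of FILE starts with ">", as a FASTA header line should.
looks_like_fasta() {
    awk '
        NF { ok = (substr($0, 1, 1) == ">"); exit }   # inspect first non-blank line
        END { exit !ok }                              # status 0 if it looked like FASTA
    ' "$1"
}
```

For example, `looks_like_fasta AGSbrut.fasta && echo "ok to pass to -f"` would print the message for a FASTA file and nothing for a .fai index.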



        • #19
          Originally posted by colindaven
          Here's another possible solution - the headers are not consistent between SAM/BAM and the original fasta:

          Even though the reference file was the same one in both cases, sometimes aligners just write a substring out into the SAM file. Samtools seems to take the full header.

          For example the first contiguous part of my genome header is
          gi|110645304|ref|NC_002516.2|

          However in my SAM file the aligner has only written
          NC_002516.2

          Samtools has written the full header to the .fa.fai index
          gi|110645304|ref|NC_002516.2|

          .. and this does not match.

          Solution:

          Try correcting the original header on the reference fasta to just the substring which the aligner uses.
          eg
          gi|110645304|ref|NC_002516.2|
          to
          NC_002516.2
          The above suggestion fixed the problem when I got this error
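
The header mismatch described in the quoted post can be checked before editing the reference. The following is a minimal sketch, assuming the SAM header has been saved as text (for example with `samtools view -H sorted.bam > header.sam`) and that the .fai index sits next to the reference. `compare_ref_names` is a hypothetical helper name, not part of samtools.

```shell
# compare_ref_names SAM_HEADER FAI
# Hypothetical helper (not a samtools command): print every @SQ SN: name from
# the SAM header that does not appear in column 1 of the .fai index.
compare_ref_names() {
    awk -F '\t' '
        FILENAME == ARGV[1] { fai[$1] = 1; next }   # first file: names in the .fai
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if (substr($i, 1, 3) == "SN:") {
                    sn = substr($i, 4)              # name as written by the aligner
                    if (!(sn in fai)) print sn
                }
        }
    ' "$2" "$1"
}
```

For example, `compare_ref_names header.sam reference.fa.fai` would print `NC_002516.2` in the situation described above, where the index holds `gi|110645304|ref|NC_002516.2|`. No output means every reference name in the header was found in the index.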



          • #20
            Hey folks,

            I have been struggling to figure out why I am getting N's for my pileup reference sequence. I found hope when I discovered this thread, but I have followed all the suggestions to no avail. I've tried this with different versions of samtools, different data sets, and different reference files, and I have simplified the ID names, rebuilt the faidx index, etc.

            Still can't figure out what's going on here. Has anyone found any other solutions?

            Thanks



            • #21
              Using pileup with the -f argument allows you to supply the faidx indexed reference sequence file. I used this option and it fixed my problem.



              • #22
                Same problem

                Did you find a solution to the null problem please?

                Originally posted by ericpante
                Hi everybody,

                I am having similar problems with samtools 0.1.18. I would like to have reference characters listed in my pileup files, but I have problems with headers.

                samtools faidx AGSbrut.fasta
                samtools view -q 20 -buh -t AGSbrut.fasta.fai A.sam | samtools sort - A
                samtools view -q 20 -buh -t AGSbrut.fasta.fai S.sam | samtools sort - S
                samtools mpileup -B -f AGSbrut_index.fai A.bam S.bam > AS.mpileup

                [fai_build_core] different line length in sequence 'null'.
                Segmentation fault

                I hypothesized that this 'null' sequence may be a blank line; so I looked for it manually and with sed, with no luck. I also looked for other potential problems based on what was previously reported (no extra spaces, characters, etc in reference sequence names in fai and sam files). I also tried to re-head the file, with no success:

                samtools view -HS -t AGSbrut.fasta.fai A.sam > Aheader.sam
                samtools reheader Aheader.sam A.bam > Aheaded.bam

                [bam_header_read] EOF marker is absent. The input is probably truncated.

                All insights are welcome!
                thank you, eric
