  • bowtie2 output, sam to bam conversion error.

    Hi all,

    I am trying to convert a SAM file (paired-end bowtie2 output, default fr orientation) to a BAM file using the samtools view command, but it throws the following error.

    Code:
    $ samtools view -bS -o aln3.bam aln3.sam
    [samopen] SAM header is present: 17 sequences
    [sam_read1] reference '451450' is recognized as '*'
    Parse error at line 2026299: sequence and quality are inconsistent
    When I looked at that particular line,
    Code:
    sed -n 2026299p aln3.sam
            133     chr12   451450  0       *       =       451450  0       *       *       YT:Z:UP YF:Z:LN
    It is missing column 1 (QNAME), and the CIGAR string, segment sequence and quality scores are unavailable.

    When I checked for such lines
    Code:
    $ sed -n '/^[^H]/=' aln3.sam | wc -l
    5422
    There were 5422 lines.
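One way to double-check that count is sketched below. It assumes the malformed records are exactly those whose first tab-separated field (QNAME) is empty, and it skips header lines starting with `@`, which a pattern keyed on the first character of the read name could also match. The function name `find_empty_qname_lines` and the sample records are made up for illustration.

```python
import io


def find_empty_qname_lines(handle):
    """Return 1-based line numbers of alignment records with an empty QNAME."""
    bad = []
    for lineno, line in enumerate(handle, start=1):
        if line.startswith("@"):
            continue  # SAM header line, not an alignment record
        fields = line.rstrip("\n").split("\t")
        if fields[0].strip() == "":
            bad.append(lineno)
    return bad


# Tiny in-memory example (hypothetical records, not the poster's data).
sample = (
    "@HD\tVN:1.0\n"
    "@SQ\tSN:chr12\tLN:1000000\n"
    "read1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "\t133\tchr12\t451450\t0\t*\t=\t451450\t0\t*\t*\tYT:Z:UP\n"
)
bad_lines = find_empty_qname_lines(io.StringIO(sample))
print(len(bad_lines), "records with an empty QNAME")
```

For a real file, the same function can be called on `open("aln3.sam")` instead of the in-memory sample.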

    How do I proceed now? Should I delete these 5422 lines from the SAM file? This thread here had the same problem, but it was with the bwa aligner.


    Also, bowtie2 prints many warnings like the following before showing the final alignment stats:

    Code:
    Warning: skipping mate #2 of read '' because length (0) <= # seed mismatches (0)
    Warning: skipping mate #2 of read '' because it was < 2 characters long
    I am not able to interpret this. Also, my alignment stats say that 99.97% of pairs aligned concordantly 0 times. I am a bit puzzled.
    Code:
    1018542 reads; of these:
      1018542 (100.00%) were paired; of these:
        1018249 (99.97%) aligned concordantly 0 times
        188 (0.02%) aligned concordantly exactly 1 time
        105 (0.01%) aligned concordantly >1 times
    How do I proceed now?

    Thank you.

    By the way, I am new to this kind of analysis, and I hope you don't mind if my questions are too simple.
    Last edited by a_mt; 11-16-2012, 11:41 PM.

  • #2
    Skipping Mate errors

    Hello. I have the same issue:

    Warning: skipping mate #1 of read 'SN860:381:H80WNADXX:1:1101:18171:2118 1:N:0:1' because length (0) <= # seed mismatches (0)
    Warning: skipping mate #1 of read 'SN860:381:H80WNADXX:1:1101:18171:2118 1:N:0:1' because it was < 2 characters long


    • #3
      It sounds like you have a 0bp read, possibly due to quality trimming. You can try removing such reads (and their mates) with a tool that allows you to specify a minimum read length.
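As a rough illustration of that suggestion, the sketch below drops a pair whenever either mate is shorter than a minimum length. It assumes plain 4-line FASTQ records and that both files list mates in the same order; the function names and the `min_len` default are illustrative and do not come from any particular tool.

```python
import io


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield tuple(s.rstrip("\n") for s in (header, seq, plus, qual))


def filter_pairs(r1_handle, r2_handle, out1, out2, min_len=20):
    """Write only pairs where both mates are at least min_len bases long."""
    kept = dropped = 0
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        if len(rec1[1]) >= min_len and len(rec2[1]) >= min_len:
            out1.write("\n".join(rec1) + "\n")
            out2.write("\n".join(rec2) + "\n")
            kept += 1
        else:
            dropped += 1  # one short mate removes the whole pair
    return kept, dropped


# Hypothetical example: the second pair has a 0 bp mate #2.
r1 = io.StringIO("@a/1\nACGTACGT\n+\nIIIIIIII\n@b/1\nACGTACGT\n+\nIIIIIIII\n")
r2 = io.StringIO("@a/2\nACGTACGT\n+\nIIIIIIII\n@b/2\n\n+\n\n")
out1, out2 = io.StringIO(), io.StringIO()
kept, dropped = filter_pairs(r1, r2, out1, out2, min_len=2)
```

Dropping both mates together keeps the two output files in sync, which paired-end alignment relies on.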


      • #4
        I solved this problem after trimming FASTQ files with fastx_trimmer -f and fastx_trimmer -t, with no 0 bp reads in the output files. The solution for me was to remove artifacts with fastx_artifacts_filter and then synchronize the pairs with fastqCombinePairedEnd.py.
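The pair synchronization step can be pictured with the sketch below. It is not the fastqCombinePairedEnd.py script itself, only a minimal illustration that assumes mates share the same read ID once the comment after the first space and any trailing /1 or /2 are removed. Records are (header, seq, plus, qual) tuples, and all names here are hypothetical.

```python
def read_id(header):
    """Reduce a FASTQ header to a mate-independent read ID."""
    parts = header[1:].split()  # drop the leading '@' and any comment
    name = parts[0] if parts else ""
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def sync_pairs(r1_records, r2_records):
    """Return (paired1, paired2, singles), keeping the order of R1."""
    # Holds all of R2 in memory; fine for a sketch, costly for large files.
    r2_by_id = {read_id(rec[0]): rec for rec in r2_records}
    paired1, paired2, singles = [], [], []
    for rec in r1_records:
        mate = r2_by_id.pop(read_id(rec[0]), None)
        if mate is None:
            singles.append(rec)  # mate was removed by an earlier filter
        else:
            paired1.append(rec)
            paired2.append(mate)
    singles.extend(r2_by_id.values())  # R2 reads with no mate in R1
    return paired1, paired2, singles


# Hypothetical records after filtering each file independently.
r1 = [("@a 1:N:0:1", "ACGT", "+", "IIII"), ("@b 1:N:0:1", "ACGT", "+", "IIII")]
r2 = [("@b 2:N:0:1", "TTTT", "+", "IIII"), ("@c 2:N:0:1", "GGGG", "+", "IIII")]
paired1, paired2, singles = sync_pairs(r1, r2)
```

Only `paired1` and `paired2` would go back into paired-end alignment; the singles could be aligned separately as unpaired reads or discarded.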
