  • sam 2 bam conversion error

    Hi all, I just aligned some exome sequencing data for an individual, and when converting from SAM to BAM I get this error:

    samtools view -h -S -b -o 1001-1.bam 1001-1.sam
    [samopen] SAM header is present: 84 sequences.
    Parse error at line 90: sequence and quality are inconsistent
    Abort trap: 6

    I looked at line 90 in the sam file:

    B02V0ACXX110721:1:1207:10678:69422 141 * 0 0 * * 0 0 GGTTAGGGTTAGAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTAGGGGGTTAGGGTTAG !!#'%'%''%%)(*(
    $$'&)"$()**&''#")&% %(&

    I noticed there were no alignment coordinates. I checked a number of entries within the first 1000 lines of the SAM file, and none of them had alignment coordinates. Could it be something to do with my reference? I know this data is good, as it has been aligned and analysed by others before me. I have been having trouble aligning other data as well. The only things in common are the software and the reference (downloaded from the GATK resource bundle and indexed with bwa index).

    Any suggestions are welcome.

    Cheers,
    Davy

  • #2
    In general, the sequence length should equal the length of the quality string. Your read has 76 nucleotides but only 38 characters in the quality string, so there is probably a problem with your input FASTQ.
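To find every record with this problem rather than only the first one samtools reports, a short script can compare the SEQ and QUAL columns (fields 10 and 11) of each alignment line. The following is a minimal sketch, not part of samtools: the function names `find_length_mismatches` and `scan_sam` are made up for illustration, and it assumes a plain-text, tab-delimited SAM file. Line numbers are counted from the top of the file, header lines included.

```python
def find_length_mismatches(lines):
    """Yield (line_number, read_name, seq_len, qual_len) for suspect records.

    Records with fewer than the 11 mandatory SAM fields are reported with
    seq_len and qual_len set to None.
    """
    for line_number, line in enumerate(lines, start=1):
        # Skip blank lines and header lines (header lines start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield (line_number, fields[0], None, None)
            continue
        seq, qual = fields[9], fields[10]
        # '*' means the sequence or quality string was not stored.
        if seq == "*" or qual == "*":
            continue
        if len(seq) != len(qual):
            yield (line_number, fields[0], len(seq), len(qual))


def scan_sam(path):
    """Print every suspect record found in the SAM file at `path`."""
    with open(path) as handle:
        for line_number, name, seq_len, qual_len in find_length_mismatches(handle):
            print(f"line {line_number}: {name} SEQ={seq_len} QUAL={qual_len}")
```

Calling `scan_sam("1001-1.sam")` would list the affected reads. If the same reads are also malformed in the input FASTQ, the problem lies upstream of the aligner.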

    • #3
      Hello Everyone,
      I am having trouble converting SAM files to BAM using samtools. I used this command to convert my SAM file to BAM:

      samtools view -bT hg19.fa s_chip2.sam > s_chip2.bam
      I got this error:
      Parse error at line 7707082: sequence and quality are inconsistent
      Aborted


      I then ran ValidateSamFile.jar, a Picard tool, and got hundreds of messages like the following:

      WARNING: Read name HWI-ST798R:82: D18MUACXX:2:1101:2025:1987, A record is missing a read group
      ERROR: Record 5, Read name HWI-ST798R:82: D18MUACXX:2:1101:8625:1992, Empty sequence dictionary.

      I am not sure how to fix this. Any help is appreciated.
