  • emilyjia2000
    Member
    • May 2011
    • 59

    .SAM to .BAM with SAM file header @PG

    Hi
    I used export2sam.pl to convert export.txt to .sam, and the newly generated SAM file has an @PG header line. When I tried to generate a BAM file with the command
    " samtools view -b in.sam -o out.bam "
    I got the following errors:

    [bam_header_read] EOF marker is absent. The input is probably truncated.
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    [main_samview] fail to read the header from "in.sam".

    Does anybody know what's wrong? What command line should I use to convert SAM to BAM?

    Thanks
  • Richard Finney
    Senior Member
    • Feb 2009
    • 701

    #2
    Use the -S parameter:

    Usage: samtools view [options] <in.bam>|<in.sam> [region1 [...]]

    Options: -b       output BAM
             -h       print header for the SAM output
             -H       print header only (no alignments)
             -S       input is SAM
             -u       uncompressed BAM output (force -b)
             -1       fast compression (force -b)
             -x       output FLAG in HEX (samtools-C specific)
             -X       output FLAG in string (samtools-C specific)
             -c       print only the count of matching records
             -L FILE  output alignments overlapping the input BED FILE [null]
             -t FILE  list of reference names and lengths (force -S) [null]
             -T FILE  reference sequence file (force -S) [null]
             -o FILE  output file name [stdout]
             -R FILE  list of read groups to be outputted [null]
             -f INT   required flag, 0 for unset [0]
             -F INT   filtering flag, 0 for unset [0]
             -q INT   minimum mapping quality [0]
             -l STR   only output reads in library STR [null]
             -r STR   only output reads in read group STR [null]
             -?       longer help


    • emilyjia2000
      Member
      • May 2011
      • 59

      #3
      Hi Richard,

      I do want to convert SAM to BAM, but I get an error when I use "samtools view -b in.sam -o out.bam". I checked the header of the SAM file and it contains an @PG line. How do I deal with it?

      Thanks


      • kmcarr
        Senior Member
        • May 2008
        • 1181

        #4
        Originally posted by emilyjia2000
        Hi Richard,

        I do want to convert SAM to BAM, but I get an error when I use "samtools view -b in.sam -o out.bam". I checked the header of the SAM file and it contains an @PG line. How do I deal with it?

        Thanks
        Emily,

        As Richard said, you need to use the -S option (in addition to your other options) to tell samtools view that the INPUT is in SAM format. By default samtools view expects a BAM file as input, but you are giving it a SAM file; that is what causes the error.
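
To make the advice above concrete, here is a minimal shell sketch. It assumes a samtools 0.1.x release, where `view` treats its input as BAM unless `-S` is given (samtools 1.0 and later detect the input format themselves and accept `-S` only for compatibility). The helper names `looks_like_bam` and `sam_to_bam` and the file names are hypothetical, not part of samtools.

```shell
#!/bin/sh
# looks_like_bam FILE: succeed if FILE starts with the gzip magic bytes 1f 8b.
# BAM is BGZF-compressed (a gzip variant), while SAM is plain text, so this is
# a rough check only: any gzip file passes it.
looks_like_bam() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# sam_to_bam IN OUT: convert IN to BAM, adding -S only when IN is text SAM.
sam_to_bam() {
    if looks_like_bam "$1"; then
        samtools view -b -o "$2" "$1"
    else
        samtools view -bS -o "$2" "$1"
    fi
}
```

With these definitions, `sam_to_bam in.sam out.bam` would run `samtools view -bS -o out.bam in.sam`, which is the command the replies above are pointing to.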


        • SDBP
          Member
          • Jan 2011
          • 12

          #5
          I am dealing with the same kind of SAM files (with an @PG header).
          I tried the -S option and it didn't work.
          At first I got a segmentation fault. After I fixed that and ran

          samtools view -bt my.fa.fai my.sam > my.bam

          it showed the following:

          [sam_read1] reference 'chr3.fa' is recognized as '*'.
          [sam_read1] reference 'chr1.fa' is recognized as '*'.
          [sam_read1] reference 'chr19.fa' is recognized as '*'.
          [sam_read1] reference 'chr3.fa' is recognized as '*'.

          Then I ran sed s/.fa// on the input file and re-ran export2sam.pl, which now throws the following error:

          ERROR: Unexpected number of fields in export record on line 285 of read1 export file. Found 21 fields but expected 22.
          ...erroneous export record:
          ABC-GA2 1 4 1 3 1347 0 1 TTTCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN_BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB QC

          Any insight will be helpful.
          Are there any other SAM-to-BAM tools known to work for SAM files with @PG?
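
The `[sam_read1] reference 'chr3.fa' is recognized as '*'` warnings above indicate that reference names used in the SAM records are not in the list passed with -t. A rough way to see the mismatch before converting is sketched below. It assumes a tab-separated SAM file with the reference name in column 3 and a .fai index with the sequence name in column 1; the function name `missing_refs` is hypothetical.

```shell
# missing_refs FAI SAM: print each reference name used in SAM (column 3)
# that is absent from the first column of FAI, once per name.
# SAM header lines (@...) and unmapped records ('*') are skipped.
missing_refs() {
    awk -F '\t' '
        NR == FNR { known[$1] = 1; next }      # first file: the .fai index
        /^@/ || $3 == "*" { next }             # skip SAM header and unmapped reads
        !($3 in known) && !seen[$3]++ { print $3 }
    ' "$1" "$2"
}
```

For example, `missing_refs my.fa.fai my.sam` would list names such as `chr3.fa` if the index only knows `chr3`.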


          • Richard Finney
            Senior Member
            • Feb 2009
            • 701

            #6
            What is the full command you are using for export2sam.pl?
            Beware that the input is supposed to be a "GERALD" type of file (also known as an "Illumina export file").


            • SDBP
              Member
              • Jan 2011
              • 12

              #7
              perl export2sam.pl --read1=my_export.txt > my_export.sam


              • Richard Finney
                Senior Member
                • Feb 2009
                • 701

                #8
                What version of samtools?


                • SDBP
                  Member
                  • Jan 2011
                  • 12

                  #9
                  samtools-0.1.16


                  • Richard Finney
                    Senior Member
                    • Feb 2009
                    • 701

                    #10
                    The Perl code is:

                    if (scalar(@t) < EXPORT_SIZE) {
                        # Abort when a record has fewer fields than expected.
                        my $msg = "\nERROR: Unexpected number of fields in export record on line $line_no of read$read_no export file. Found " . scalar(@t) . " fields but expected " . EXPORT_SIZE . ".\n";
                        $msg .= "\t...erroneous export record:\n" . $line . "\n\n";
                        die($msg);
                    }

                    EXPORT_SIZE is 22 ( EXPORT_SIZE => 22 ).

                    It's complaining that line 285 has only 21 fields.

                    What is on lines 284 and 285?
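
Rather than inspecting lines one at a time, the same field-count check can be run over the whole file up front. This is a sketch that assumes the export file is tab-separated; the function name `short_records` is hypothetical. Note that awk counts empty trailing fields while Perl's split drops them, so the counts can differ from the script's on lines that end in empty columns.

```shell
# short_records FILE [N]: print "line_number field_count" for every line of a
# tab-separated FILE that has fewer than N fields (default 22, the EXPORT_SIZE
# value quoted above).
short_records() {
    awk -F '\t' -v n="${2:-22}" 'NF < n { print NR, NF }' "$1"
}
```

Running `short_records my_export.txt` lists every suspect record at once, which shows whether line 285 is an isolated case or one of many.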


                    • SDBP
                      Member
                      • Jan 2011
                      • 12

                      #11
                      Line 284:

                      ABC-DE2 1 4 1 3 119 0 1 GAGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB QC N


                      Line 285:
                      ABC-DE2 1 4 1 3 1347 0 1 TTTCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN fa_BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB QC N


                      I see that there is an extra 'fa' in line 285.
                      Can I just delete it?
                      Will that work?


                      • SDBP
                        Member
                        • Jan 2011
                        • 12

                        #12
                        Sorry, the above was from the file where I did not remove the .fa

                        Below is from the file which I am working on:

                        ABC-DE2 1 4 1 3 1347 0 1 TTTCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN_BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB QC


                        • SDBP
                          Member
                          • Jan 2011
                          • 12

                          #13
                          On another line I see :

                          ABC-DE2 1 4 1 3 1978 0 1 CAATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN _]_BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB QC

                          Will I have to go through the whole file this way?


                          • Richard Finney
                            Senior Member
                            • Feb 2009
                            • 701

                            #14
                            Hmmmm.

                            You might be messing things up with the sed command:

                            sed s/.fa//

                            That says: replace any character followed by "fa" with nothing.

                            "f" and "a" appear to be legitimate GERALD (or whatever, "export") quality values, so they get unintentionally deleted together with the character before them, in addition to the intended strings like "chr1.fa" --> "chr1".

                            Glance at the input file for legitimate quality values (the field after the sequence field).

                            In sed, putting a backslash before the dot (i.e. \. ) means a literal period, as opposed to a bare dot (i.e. . ), which means "any character".
                            Last edited by Richard Finney; 06-14-2011, 12:24 PM.
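
The point about the unescaped dot can be sketched as follows. Without the g flag, sed replaces only the first match on each line, so `s/.fa//` can delete a tab plus a leading "fa" in the quality string and never reach "chr1.fa". The version below strips ".fa" only when it ends a tab-separated field, which leaves quality strings alone. It assumes a tab-separated export file; the function name `strip_fa` is hypothetical.

```shell
# strip_fa [FILE...]: remove a literal ".fa" suffix from the end of any
# tab-separated field, e.g. "chr1.fa" -> "chr1". Reads stdin when no FILE
# is given. The dot is escaped, so "fa" inside a quality string is untouched.
strip_fa() {
    awk 'BEGIN { FS = OFS = "\t" }
         { for (i = 1; i <= NF; i++) sub(/\.fa$/, "", $i); print }' "$@"
}
```

A possible use is `strip_fa my_export.txt > my_export.clean.txt`, followed by the same export2sam.pl command on the cleaned file. If only a quick fix to the original one-liner is wanted, `sed 's/\.fa//'` at least matches a literal period.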
