  • Problems with displaying reads in SAM format in IGV - Integrative Genome Viewer

    Hi!
    I doubt that IGV is displaying my reads in SAM format correctly.

    The reads in the SAM file all have the same length (90), but they appear to have different lengths when I load them in IGV. The same happened with the same file in BAM format.
    I converted the SAM file to a BED file using Pyicos. Loading this BED file results in a correct display of the reads. See the loaded SAM file (upper track) and the loaded BED file (lower track) in the attachment.
    But when the BED file is displayed, I obviously cannot see the actual sequences of the reads (to look for polymorphisms).

    I am using the latest version of IGV: IGV_2.2.5

    Has anyone else seen this issue and found a way around it?

    Below I give the SAM-file and the BED-file, as well as the command for the conversion.
    Any help is highly appreciated!
    Thanks!
    Sonja






    # here the command to convert sam to bed:
    pyicos convert input.sam output.bed -f sam -F bed

    # here the SAM-file
    ERR045703.5732207 99 19 16149684 60 75M15S = 16150110 508 AACGGGCACCCAGTGAGCACTCGAGGATGACCCTCCTCGGGCAGCTGCCGCCCACCCAGTAGCGACTGTCCCCAAGTCAGCAGGGAGGGA B5@9FGGIFHHIIEFAFFHHKI@IKJIHG@EJJKIJKI?GGB>CGBHGJ@HEABAD?B@DCDC898?>GCGGBH################ X0:i:1 X1:i:0 XC:i:75 MD:Z:75 RG:Z:ERR045703 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ERR045704.58587335 163 19 16149690 60 37M53S = 16150123 494 CACCCAGGGAGCACTCGAGGATGACCCTCCTCGGGCAGCTGCCGCCCACCCTGTAGCGACTGTCCCCAAGTGAGCAGGGAGGGAAGCGAG @EC3<B=',EF/GBFE:7E<C->*416@A@8B;+$@###################################################### X0:i:1 X1:i:0 XC:i:37 MD:Z:7T29 RG:Z:ERR045704 AM:i:37 NM:i:1 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ERR045703.17024081 147 19 16149699 60 8S82M = 16149269 -511 ACCCAGTGAGCACTCGAGGATGACCCTCCTCGGGCAGCTGCCGCCCACCCAGTAGCGACTGTCCCCAAGTCAGCAGGGAGGGAAGAGAGC #########A=??=AFGD>?@37E=?9=:=4@@BBC=:=DA?CGADADGJKHJKJAFJJJFBGJGCGHEKIJIIGHHAHGGFGHDGCFC@ X0:i:1 X1:i:0 XC:i:82 MD:Z:82 RG:Z:ERR045703 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ERR045704.7021863 147 19 16149705 60 7S83M = 16149294 -493 GACCACTCGAGGATGCCCCTCCTCGGGCAGCTGCCGCCCACCCAGTAGCGACTGTCCCCAAGTCAGCAGGGAGGGAAGAGAGCAGGTCAC ########EEHGE?E?D<EFD>EEFDGEDFE?<BF?BFGIJKEIGG>IJJIGFHKHHFKHLJKJJIIIIJGIIJJHGGHGGDGCFEH@ X0:i:1 X1:i:0 XC:i:83 MD:Z:8A74 RG:Z:ERR045704 AM:i:37 NM:i:1 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ERR045704.32772245 163 19 16149726 60 90M = 16150187 499 AGCTGCCGCCCACCCAGTAGCGACTGTCCCCAAGTCAGCAGGGAGGGAAGAGAGCAGGTCACGCTCTCCTAAGTCTGATCAAGCAGCCGT @EDFFFG?GEHIFGHIJHHIH@FFKJIHJJJJLIEFIKJKJKI?FHHJIFJEGEEGFCECHB=EEDFEFHFHHGFIH,:CGIHH>@EC<F X0:i:1 X1:i:0 MD:Z:90 RG:Z:ERR045704 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@A
    ERR045704.57426661 163 19 16149741 60 90M = 16150165 492 AGTAGCGACTGTCCCCAAGTCAGCAGGGAGGGAAGAGAGCAGGTCACGCTCTCCTAAGTCTGATCAAGCAGCCGTGCAGAGATGTGCCTC @EFEFF>HFHHGHGHIJ@GGGIIJJKKJGIBB??CHEJFFJHG@@DC>HIEGIJFC@FDCGFFHEIBFFGFGG=BBAEDJHHGIH;C?HD X0:i:1 X1:i:0 MD:Z:90 RG:Z:ERR045704 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@BB
    ERR045703.16563758 83 19 16149748 60 90M = 16149339 -498 ACTGTCCCCAAGTCAGCAGGGAGGGAAGAGAGCAGGTCACGCTCTCCTAAGTCTGATCAAGCAGCCGTGCAGAGATGTGCCTCTCACCTA EAGDDFGEHBIFIGHHGIGGEJDIEHIHIHIDHKJHHHHAKHJKJKKJKKELKJIHJJKKGJIJJAHHAJKIJIJIGIH@HIHGCCACB6 X0:i:1 X1:i:0 MD:Z:90 RG:Z:ERR045703 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ERR045704.56912904 147 19 16149756 60 13S77M = 16149309 -523 TAGCGACCGTCCCCAAGTCAGCAGCGAGGGAAGAGAGCAGGTCACGCTCTCCTAAGTCTGATCAAGCAGCCGTGCAGAGATGTGCCTCTC ##############AIEEFI>>43)<H@8FA9-GA02BD:46?E7ED<.6,.>1.=ECC,97E4C8ECA2'2-'EG<9,,;BHG=?GGD; X0:i:1 X1:i:0 XC:i:77 MD:Z:11G65 RG:Z:ERR045704 AM:i:37 NM:i:1 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ERR045704.58477016 147 19 16149772 60 37S53M = 16149322 -502 CCACCCCGCAGCGACGGTCCCCAAGTCAGCAGGGAGGGAAGAGAGCAGGTCACGCTCTCCTAAGTCTGATCAAGCGGCCGTGCAGAGATG ######################################?F@?4DD9@IDG=5';GEC=(7C4HB8GHCIF<(C=&%%$4*[email protected]/ X0:i:1 X1:i:0 XC:i:53 MD:Z:38A14 RG:Z:ERR045704 AM:i:37 NM:i:1 SM:i:37 MQ:i:60 XT:A:U BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@




    # here is the BED-file
    19 16149684 16149773 ERR045703.5732207 0 +
    19 16149690 16149779 ERR045704.58587335 0 +
    19 16149699 16149788 ERR045703.17024081 0 -
    19 16149705 16149794 ERR045704.7021863 0 -
    19 16149726 16149815 ERR045704.32772245 0 +
    19 16149741 16149830 ERR045704.57426661 0 +
    19 16149748 16149837 ERR045703.16563758 0 -
    19 16149756 16149845 ERR045704.56912904 0 -
    19 16149772 16149861 ERR045704.58477016 0 -

  • #2
    It looks like IGV is just not showing the soft-clipped regions. That would be the appropriate behaviour.
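
    A minimal sketch of why the tracks differ, assuming the viewer draws only the reference-aligned part of each read: in a CIGAR string the operations M, D, N, = and X consume reference positions, while S (soft clip) does not. The function names here are illustrative and are not part of IGV or samtools.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar: str) -> int:
    """Number of reference bases covered by the alignment."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def soft_clipped(cigar: str) -> int:
    """Number of read bases that are soft-clipped (kept in SEQ, not aligned)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op == "S")


if __name__ == "__main__":
    # CIGAR strings taken from the SAM records posted above.
    for cigar in ("75M15S", "37M53S", "8S82M", "90M"):
        print(cigar, reference_span(cigar), soft_clipped(cigar))
```

    For the four CIGAR strings above the aligned spans are 75, 37, 82 and 90 bases, even though every read is 90 bases long.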



    • #3
      soft-clip

      Thanks for your answer!
      What are soft-clipped regions, please?



      • #4
        Can't find your 'S'? The 'S' operator documentation is here
        ...
        See the CIGAR 'S' (soft clip) operation in the SAM/BAM format documentation at SourceForge: http://samtools.sourceforge.net/SAM1.pdf

        As far as that recently added operator, don't 'X' me.



        • #5
          Does it basically mean that the base quality is very low in these regions?



          • #6
            Originally posted by sonja:
            does it basically mean that the base quality is very low in these regions?
            In the case of your reads, at least, yes. You can see that from the quality scores (often #).
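
            One way to check this for any read is to decode the qualities of just the soft-clipped bases. This is a rough sketch that assumes the QUAL field uses the Phred+33 encoding from the SAM specification, where '#' decodes to Phred 2; the helper name is made up for illustration.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that consume bases of the read (SEQ/QUAL).
QUERY_CONSUMING = set("MIS=X")


def soft_clip_qualities(cigar: str, qual: str) -> list[int]:
    """Return the Phred scores of the soft-clipped bases of one read."""
    scores = [ord(c) - 33 for c in qual]  # Phred+33 decoding
    clipped = []
    pos = 0  # current offset into the read
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op == "S":
            clipped.extend(scores[pos:pos + n])
        if op in QUERY_CONSUMING:
            pos += n
    return clipped


if __name__ == "__main__":
    # Toy record (not from the thread): 3 aligned bases, then 2 soft-clipped.
    print(soft_clip_qualities("3M2S", "IIH##"))
```

            A read whose clipped scores are all 2 carries a run of '#' in its quality string, as in the records posted at the top of the thread.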



            • #7
              Yes, that's exactly what I see!
              Thanks a lot!!!



              • #8
                You can optionally display the soft-clipped bases; it's a user preference (View > Preferences > Alignments tab).



                • #9
                  Great! I was looking for that... Thanks!!!
