
  • SAM to CUFFLINKS SAM format

    Does anyone know how to convert SAM to Cufflinks SAM format?
    Even a manual method is fine; I can write a simple script to automate it on the command line.
    All I want to know is whether the first snippet below is really SAM (it should be). If it is, which columns actually represent the original SAM format?

    Thanks

    Code:
    IL26_1184:1:109:734:594	67	clone::AL662826.11:1:145431:1	27827	0	36M	*	0	80	GGCCGCTGTGCGCGCCCCGCCTGCTGGACCACTTCA	>>>>>>><<>>>>>>>>>>>>8<<>8,,<<3<<8<3	MF:i:18	Aq:i:0	NM:i:0	UQ:i:0	H0:i:3	H1:i:0
    IL26_1184:1:109:734:594	147	clone::AL662826.11:1:145431:1	27871	0	36M	*	0	-80	CTGCCGGCGTTGCTCAAGCTGGCCTGCGGAGGCGAC	7.6<4667<64<<47<<<<.<<<<2<<<<<<<<<<<	MF:i:18	Aq:i:0	NM:i:0	UQ:i:0	H0:i:3	H1:i:0
    Code:
    s6.25mer.txt-913508	16	chr1 4482736 255 14M431N11M * 0 0 CAAGATGCTAGGCAAGTCTTGGAAG IIIIIIIIIIIIIIIIIIIIIIIII NM:i:0 XS:A:-

  • #2
    It looks as if your first snippet could be SAM. Usually the UNIX command

    sort -k3,3 -k4,4n in.sam > out.sam

    will be sufficient for Cufflinks to accept a SAM file.
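
One caveat with a plain sort: if the file carries header lines (those beginning with `@`), they get sorted in among the alignment records. A minimal sketch that keeps the header on top, assuming a tab-delimited SAM file; the function name `sort_sam` is made up for illustration and is not part of Cufflinks or samtools:

```shell
# sort_sam: hypothetical helper for illustration only.
# Prints header lines (starting with '@') first, then the alignment
# records sorted by reference name (field 3) and position (field 4).
sort_sam() {
    tab=$(printf '\t')
    grep '^@' "$1"
    grep -v '^@' "$1" | LC_ALL=C sort -t "$tab" -k3,3 -k4,4n
}

# Usage: sort_sam in.sam > out.sam
```

Setting `LC_ALL=C` makes the ordering of reference names byte-wise and independent of the local locale.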



    • #3
      Both are SAM format; however, the second snippet contains the XS:A field. This field tells Cufflinks which strand the RNA that produced the read came from. Cufflinks will not accept SAM files that are unsorted or that lack this field. You can write a simple script that adds this information to your SAM file by reading the bitwise flag in field 2, where the strand information is stored, and translating it.

      Hope that helps.
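
The translation described above might look like the following sketch. It assumes a tab-delimited SAM file, treats bit 0x10 of the FLAG (field 2) as the reverse-strand bit, and leaves header lines and records that already carry an XS:A tag untouched. The function name `add_xs` is hypothetical. For a library that is not strand-specific, the tag written this way records only the orientation of the alignment, not the strand of the transcript.

```shell
# add_xs: hypothetical helper for illustration only.
# Appends an XS:A:+ or XS:A:- tag to each alignment record,
# derived from bit 0x10 (decimal 16) of the FLAG in field 2.
add_xs() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@/ { print; next }                 # header lines pass through
         /\tXS:A:[+-]/ { print; next }        # tag already present
         {
             # Portable bit test: integer-divide by 16, then check parity.
             strand = (int($2 / 16) % 2 == 1) ? "-" : "+"
             print $0, "XS:A:" strand
         }' "$1"
}

# Usage: add_xs in.sam > with_xs.sam
```

The arithmetic bit test avoids the `and()` function, which is a gawk extension and may be missing from other awk implementations.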



      • #4
        But the data I have is single-end Illumina. As far as I know, Illumina still doesn't produce strand-specific data?
        Anyway, I did what you said, but it didn't help. See below.

        After adding strand info using this command:
        head -6 3125_8.sam | awk '{if ($9 ~ /-/) {print $0"\t""XS:A:-"} else {print $0"\t""XS:A:+"}}'|sort -k 3,3 -k 4,4n
        IL6_3125:8:58:1625:1479 67 clone::AL662826.11:1:145431:1 1261 0 37M * 0 247 AAAAGGAGTAGGCAGGAAAACAGTCAATTATGGATTC ?BBCBBBB@<BBBBCB@A@?B?B>@A@B@BABB?B@? MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:4 H1:i:0 XS:A:+
        IL6_3125:8:37:57:1851 131 clone::AL662826.11:1:145431:1 1458 0 37M * 0 262 GTGAATTGGAGTCCTGNGTTTTATTTTCCTTTCCCAC AB?@BBBAB<@?AAB:!<<BBB@BBBB@;BBBB=ABA MF:i:18 Aq:i:0 NM:i:1 UQ:i:0 H0:i:0 H1:i:4 XS:A:+
        IL6_3125:8:58:1625:1479 147 clone::AL662826.11:1:145431:1 1471 0 37M * 0 -247 CTGAGTTTTATTTTCCTTTCCCACCTCAAACCCCACA @8???@<?:>@;<6@B:96BBB6BB>BAB;BBBB>BB MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:4 H1:i:0 XS:A:-
        IL6_3125:8:37:57:1851 83 clone::AL662826.11:1:145431:1 1683 0 37M * 0 -262 GAAGGACTTACTGAGATGGCTGCTCCCACTCTCCAGC BBACA?=BB=;BCABB9BC7AC9BAAA>AB5@/?CC@ MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:4 H1:i:0 XS:A:-
        IL6_3125:8:93:491:1573 67 clone::AL662826.11:1:145431:1 3983 0 37M * 0 221 CTGGAATACAGAGGTTTTCACGGAAGCCCAGGGGACC BCB?BBCCBBBBBBABCCBBC>>AB>ABA@;9>8>=B MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:3 H1:i:0 XS:A:+
        IL6_3125:8:93:491:1573 147 clone::AL662826.11:1:145431:1 4167 0 37M * 0 -221 CTCCCCCAGCCCAGGGGTCTGGCTTCCCCAGGAGGAC =;?>A?A>=9A@A?<5>?=@@AA?ABAAAA@BA@AAA MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:3 H1:i:1 XS:A:-
        I then ran Cufflinks and got this error:

        cufflinks 1006_tester.sam
        cufflinks: /usr/lib64/libz.so.1: no version information available (required by cufflinks)
        [bam_header_read] EOF marker is absent.
        File 1006_tester.sam doesn't appear to be a valid BAM file, trying SAM...
        [10:52:52] Inspecting reads and determining fragment length distribution.
        SAM error on line 57: CIGAR op has zero length
        SAM error on line 71: CIGAR op has zero length
        SAM error on line 109: CIGAR op has zero length
        SAM error on line 206: CIGAR op has zero length
        SAM error on line 249: CIGAR op has zero length
        SAM error on line 290: CIGAR op has zero length
        SAM error on line 312: CIGAR op has zero length
        SAM error on line 354: CIGAR op has zero length
        SAM error on line 356: CIGAR op has zero length
        SAM error on line 360: CIGAR op has zero length
        SAM error on line 416: CIGAR op has zero length
        SAM error on line 455: CIGAR op has zero length
        SAM error on line 496: CIGAR op has zero length
        SAM error on line 502: CIGAR op has zero length
        SAM error on line 546: CIGAR op has zero length
        SAM error on line 566: CIGAR op has zero length
        SAM error on line 594: CIGAR op has zero length
        SAM error on line 668: CIGAR op has zero length
        SAM error on line 708: CIGAR op has zero length
        SAM error on line 714: CIGAR op has zero length
        SAM error on line 717: CIGAR op has zero length
        SAM error on line 744: CIGAR op has zero length
        SAM error on line 814: CIGAR op has zero length
        SAM error on line 824: CIGAR op has zero length
        SAM error on line 834: CIGAR op has zero length
        SAM error on line 866: CIGAR op has zero length
        SAM error on line 872: CIGAR op has zero length
        SAM error on line 875: CIGAR op has zero length
        SAM error on line 877: CIGAR op has zero length
        SAM error on line 901: CIGAR op has zero length
        SAM error on line 912: CIGAR op has zero length
        SAM error on line 934: CIGAR op has zero length
        SAM error on line 940: CIGAR op has zero length
        SAM error on line 979: CIGAR op has zero length
        SAM error on line 994: CIGAR op has zero length
        SAM error on line 996: CIGAR op has zero length
        SAM error on line 999: CIGAR op has zero length
        > Processed 392 loci. [*************************] 100%
        > Map Properties:
        > Total Map Mass: 28.92
        > Read Type: 37bp single-end
        > Fragment Length Distribution: Gaussian (default)
        > Estimated Mean: 203.69
        > Estimated Std Dev: 75.10
        [10:52:53] Assembling transcripts and estimating abundances.
        > Processing Locus clone::AL662824.9:1:187964:?4 [* ] SAM error on line 1063: CIGAR op has zero length
        > Processing Locus clone::AL662824.9:1:187964:?4 [* ] SAM error on line 1077: CIGAR op has zero length
        > Processing Locus clone::AL662824.9:1:187964:?4 [* ] SAM error on line 1115: CIGAR op has zero length
        > Processing Locus clone::AL662824.9:1:187964:?4 [** ] 1SAM error on line 1212: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [*** ] 1SAM error on line 1255: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [***** ] 2SAM error on line 1296: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [****** ] 2SAM error on line 1318: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [******* ] 3SAM error on line 1360: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [******* ] 3SAM error on line 1362: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [******* ] 3SAM error on line 1366: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [********* ] 3SAM error on line 1422: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [********** ] 4SAM error on line 1461: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [*********** ] 4SAM error on line 1502: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [************ ] 4SAM error on line 1508: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************* ] 5SAM error on line 1552: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************** ] 5SAM error on line 1572: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [*************** ] 6SAM error on line 1600: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [**************** ] 6SAM error on line 1674: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [**************** ] 6SAM error on line 1714: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [***************** ] 6SAM error on line 1720: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [***************** ] 6SAM error on line 1723: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [****************** ] 7SAM error on line 1750: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [******************* ] 7SAM error on line 1820: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [******************* ] 7SAM error on line 1830: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [******************** ] 8SAM error on line 1840: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1872: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1878: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1881: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1883: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************** ] 8SAM error on line 1907: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************** ] 9SAM error on line 1918: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************** ] 9SAM error on line 1940: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [*********************** ] 9SAM error on line 1946: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [*********************** ] 9SAM error on line 1985: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************************ ] 9SAM error on line 2000: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************************ ] 9SAM error on line 2002: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************************ ] 9SAM error on line 2005: CIGAR op has zero length
        > Processed 392 loci. [*************************] 100%
        Last edited by repinementer; 11-09-2010, 06:58 PM.
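
The log above does not show what is on the reported lines. One way to look for candidates is to list records whose CIGAR string (field 6) contains an operation of length zero, such as `0M` or `36M0S`. This is only a diagnostic sketch under the assumption of a tab-delimited SAM file; `zero_len_cigar` is a hypothetical name, and the line numbers it prints count every line of the file, which may not match how Cufflinks counts.

```shell
# zero_len_cigar: hypothetical helper for illustration only.
# Prints "line number: CIGAR" for records whose CIGAR has a zero-length op.
zero_len_cigar() {
    awk -F'\t' '
        /^@/ { next }                                  # skip header lines
        $6 ~ /(^|[MIDNSHPX=])0+[MIDNSHPX=]/ {          # e.g. 0M or 36M0S
            print NR ": " $6
        }' "$1"
}

# Usage: zero_len_cigar 1006_tester.sam
```

The pattern only fires when the zero run starts the CIGAR or directly follows an operation letter, so an ordinary length like `30M` is not flagged.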



        • #5
          Hi all,
          I faced the same problem when I tried to run Cufflinks. When I provided the SAM file with its header, the error went away. It worked for my data; I am not sure about other datasets, but it may be worth trying.
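
If the SAM file was written without a header, one way to supply one is to prepend `@SQ` lines, which give each reference sequence name (`SN`) and length (`LN`). The sketch below assumes you have a tab-delimited file listing reference names and lengths in its first two columns; that file and the function name `add_sq_header` are assumptions for illustration, not something from this thread.

```shell
# add_sq_header: hypothetical helper for illustration only.
# $1: tab-delimited file with reference name and length in columns 1 and 2
# $2: SAM file without a header
add_sq_header() {
    awk -F'\t' '{ print "@SQ\tSN:" $1 "\tLN:" $2 }' "$1"
    cat "$2"
}

# Usage: add_sq_header ref_lengths.tsv in.sam > with_header.sam
```

The reference names in the `@SQ` lines need to match the names in field 3 of the alignment records exactly.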

