  • Cufflinks problem sorting

    I have mapped reads to a reference using TopHat/Bowtie. When I run Cufflinks on the TopHat output, I get the following error: "sort order of reads in BAMs must be the same."

    I have tried sorting my TopHat output with samtools, but that still doesn't work.

    I am confused because I am only using one BAM file, so I do not understand what my BAM file is being compared to such that the sort orders could differ.

    Does anyone know about this? I have seen the error posted in other threads, but those deal with cuffmerge or cuffdiff, which join or compare multiple BAM files.

  • #2
    Use Picard's ReorderSam.jar if you believe sort order is the issue.

    • #3
      Below is an example from the accepted_hits.bam file I am trying to run through Cufflinks:

      Code:
      HWI-1KL118:19:C099JACXX:8:2303:8200:82055	99	comp10000_c1_seq1	29	3	48M1D52M	=	88	159	ATCTTACCCTGTCCCCACTCTCTAAGAAGAAGAGCTATTTTTCTATGCTGCAAAATATTTCAATATTAATCAGGAAAAAAACAGGAACAGAACAAAAACA	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	NM:i:1	NH:i:2	CC:Z:comp9978_c0_seq1	CP:i:1751	HI:i:0
      HWI-1KL118:19:C099JACXX:8:1104:11427:111619	419	comp10000_c1_seq1	31	3	100M	=	64	133	CTTACCCTGTCCCCACTCTCTAAGAAGAAGAGCTATTTTTCTATGCTTGCAAAATATTTCAATATTAATCAGGAAAAAAACAGGAACAGAACAAAAACAA	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	NM:i:0	NH:i:2	CC:Z:comp9978_c0_seq1	CP:i:1750	HI:i:0
      HWI-1KL118:19:C099JACXX:8:2204:11486:188570	99	comp10000_c1_seq1	31	3	100M	=	64	133	CTTNCCCTGTCCCCACTCTCTAAGAAGAAGAGCTATTTTTCTATGCTTGCAAAATATTTCAATATTAATCAGGAAAAAAACAGGAACAGAACAAAAACAA	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	NM:i:1	NH:i:2	CC:Z:comp9978_c0_seq1	CP:i:1750	HI:i:0
      HWI-1KL118:19:C099JACXX:8:2308:14574:10360	419	comp10000_c1_seq1	36	3	100M	=	104	168	CCTGTCCCCACTCTCTAAGAAGAAGAGCTATTTTTCTATGCTTGCAAAATATTTCAATATTAATCAGGAAAAAAACAGGAACAGAACAAAAACAAAACAC	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	NM:i:0	NH:i:2	CC:Z:comp9978_c0_seq1	CP:i:1745	HI:i:0
      HWI-1KL118:19:C099JACXX:8:2105:2435:109977	355	comp10000_c1_seq1	43	3	100M	=	52	109	CCACTCTCTAAGAAGAAGAGCTATTTTTCTATGCTTGCAAAATATTTCAATATTAATCAGGAAAAAAACAGGAACAGAACAAAAACAAAACACAATAGAC	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	NM:i:0	NH:i:2	CC:Z:comp9978_c0_seq1	CP:i:1738	HI:i:0
      HWI-1KL118:19:C099JACXX:8:2302:5607:56272	137	comp10000_c1_seq1	43	3	100M	*	0	0	CCACTCTCTAAGAAGAAGAGCTATTTTTCTATGCTTGCAAAATATTTCAATATTAATCAGGAAAAAAACAGGAACAGAACAAAAACAAAACACAATAGAC	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	NM:i:0	NH:i:2	CC:Z:comp9978_c0_seq1	CP:i:1738	HI:i:0
      HWI-1KL118:19:C099JACXX:8:2307:6056:194919	99	comp10000_c1_seq1	43	3	100M	=	64	121	CCACTCTCTAAGAAGAAGAGCTATTTTTCTATGCTTGCAAAATATTTCAATATTAATCAGGAAAAAAACAGGAACAGAACAAAAACAAAACACAATAGAC	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	NM:i:0	NH:i:2	CC:Z:comp9978_c0_seq1	CP:i:1738	HI:i:0
      Does anyone know whether having multiple reference sequences with the exact same name would cause this error?

      EDIT: I found the problem: I had multiple reference sequences with the same name. Removing the duplicates got rid of the error.
      Last edited by RNAddict; 05-24-2012, 04:41 AM.
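
For anyone hitting the same error, here is a minimal sketch of how one might check a reference FASTA for repeated sequence names before building the index. It is not the original poster's method: the function name `duplicate_fasta_names` is hypothetical, the example sequences are made up (only the contig names are borrowed from the records above), and it assumes the sequence name is the first whitespace-delimited token on each header line.

```python
import io
from collections import Counter


def duplicate_fasta_names(handle):
    """Return {name: count} for sequence names that occur more than once.

    `handle` is any iterable of text lines, e.g. an open FASTA file.
    """
    counts = Counter()
    for line in handle:
        if line.startswith(">"):
            # Assumption: the name is the first token after '>'
            fields = line[1:].split()
            if fields:
                counts[fields[0]] += 1
    return {name: n for name, n in counts.items() if n > 1}


# Made-up FASTA content for illustration only
example = io.StringIO(
    ">comp10000_c1_seq1 first copy\nACGT\n"
    ">comp9978_c0_seq1\nGGCC\n"
    ">comp10000_c1_seq1 second copy\nACGT\n"
)
print(duplicate_fasta_names(example))  # {'comp10000_c1_seq1': 2}
```

To run it on a real file, pass an open handle instead, e.g. `with open("reference.fa") as fh: print(duplicate_fasta_names(fh))`, where `reference.fa` stands in for your own reference. An empty result means no name is repeated under the assumption above.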
