  • Q. Cufflinks: sort order of reads in BAMs must be the same

    After running TopHat, I sorted "accepted_hits.bam" to produce "accepted_hits_sorted.bam", and also converted "Mh.gff" to "Mh.gtf" with the gffread utility.

    However, when I run the commands below, an error message keeps appearing. Could you please give me some tips on fixing this error?

    ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gtf -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

    ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gff -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

    You are using Cufflinks v2.0.2, which is the most recent release.
    Error: sort order of reads in BAMs must be the same

  • #2
    See here



    Bottom line: Check you are using the most up-to-date version of Tophat (2)

    Comment


    • #3
      Originally posted by TonyBrooks View Post
      See here



      Bottom line: Check you are using the most up-to-date version of Tophat (2)
      I used tophat2.
      The command that I used is the following:

      tophat2 -p 6 -o IR_t1_Mh -G Mh.gff Mh read1.fq,read2.fq,read3.fq,read4.fq,read5.fq 2>> tophat.log &

      Comment


      • #4
        Maybe there's a problem with aligning multiple fastq's in the same bam file. I'm not sure how samtools would sort that file which then may cause problems for tophat.
        Maybe you could align one fastq at a time (assuming these are single, not paired reads). Then sort and run cufflinks separately too.

        Comment


        • #5
          Originally posted by TonyBrooks View Post
          Maybe there's a problem with aligning multiple fastq's in the same bam file. I'm not sure how samtools would sort that file which then may cause problems for tophat.
          Maybe you could align one fastq at a time (assuming these are single, not paired reads). Then sort and run cufflinks separately too.
          All of the reads in "read1.fq, read2.fq, ..., read5.fq" were taken from the same tissue. The only difference between them is the lane on which they were loaded, so I need to put them together in one command line. If I do this, will the subsequent Cufflinks step not work on this TopHat output?

          If the command above is wrong, in which cases should multiple read files be given together, separated by commas, in a TopHat command? And if Cufflinks does not work in this case, how could the data be analyzed with an alternative approach that serves the same function as Cufflinks?

          Comment


          • #6
            Originally posted by TonyBrooks View Post
            Maybe there's a problem with aligning multiple fastq's in the same bam file. I'm not sure how samtools would sort that file which then may cause problems for tophat.
            Maybe you could align one fastq at a time (assuming these are single, not paired reads). Then sort and run cufflinks separately too.
            When I ran the command below,
            samtools tview exercise/F/F/IR_t1_Mh/accepted_hits_sorted.bam exercise/F/F/Mh.fa | more

            I got the result shown in the attachment.

            Can you see what the problem is?
            Does the attached BAM file look the way it is generally expected to?
            Attached Files

            Comment


            • #7
              Originally posted by TonyBrooks View Post
              Maybe there's a problem with aligning multiple fastq's in the same bam file. I'm not sure how samtools would sort that file which then may cause problems for tophat.
              Maybe you could align one fastq at a time (assuming these are single, not paired reads). Then sort and run cufflinks separately too.


              When I ran the command below,
              samtools view -h accepted_hits_sorted.bam | more
              I got the result shown in the attachment.

              Can you see what the problem is?
              Thanks in advance.
              Attached Files

              Comment


              • #8
                I'm not really a bioinformatician, so I'm not 100% sure. From a quick check of the TopHat manual, I think using a comma-separated list treats each FASTQ file as being from a different source.
                Can you not just cat the files together before running them?
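
A minimal sketch of the concatenation suggested above, assuming all lane files are uncompressed, unwrapped FASTQ from the same sample. The function name `merge_lanes` and the output name `all_lanes.fq` are hypothetical, and nothing in this thread confirms that merging the files changes the Cufflinks error.

```shell
# merge_lanes OUT IN1 [IN2 ...]
# Hypothetical helper: concatenate per-lane FASTQ files into OUT and
# print the number of reads written.
merge_lanes() {
    out="$1"
    shift
    cat "$@" > "$out" || return 1
    # Assumption: four lines per read (no wrapped sequence lines).
    lines=$(wc -l < "$out" | tr -d ' ')
    if [ $((lines % 4)) -ne 0 ]; then
        echo "warning: $out has $lines lines, not a multiple of 4" >&2
        return 1
    fi
    echo "$((lines / 4))"
}

# Example, not run here (file names taken from the tophat2 command in post #3):
#   merge_lanes all_lanes.fq read1.fq read2.fq read3.fq read4.fq read5.fq
#   tophat2 -p 6 -o IR_t1_Mh -G Mh.gff Mh all_lanes.fq
```

The line-count check only catches truncated files; it does not validate the FASTQ records themselves.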

                Comment


                • #9
                  Originally posted by syintel87 View Post
                  After running TopHat, I sorted "accepted_hits.bam" to produce "accepted_hits_sorted.bam", and also converted "Mh.gff" to "Mh.gtf" with the gffread utility.

                  However, when I run the commands below, an error message keeps appearing. Could you please give me some tips on fixing this error?

                  ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gtf -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

                  ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gff -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

                  You are using Cufflinks v2.0.2, which is the most recent release.
                  Error: sort order of reads in BAMs must be the same
                  If you sort your GFF files in the same order as the BAM files you should be able to get it to work. Alternatively drop the GFF files.
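
One way to try the reordering suggested above is sketched here, assuming the BAM header has been dumped to a text file first (for example with `samtools view -H`). The function name `sort_gff_by_header` is hypothetical, comment lines in the annotation are dropped, and the thread does not confirm that this resolves the error.

```shell
# sort_gff_by_header ANNOTATION HEADER
# Hypothetical helper: print GFF/GTF feature lines ordered by the @SQ
# order of a SAM header file, then by start coordinate.
sort_gff_by_header() {
    gff="$1"
    header="$2"
    awk -F '\t' -v OFS='\t' '
        NR == FNR {
            # First file: record each reference name in header order.
            if ($1 == "@SQ")
                for (i = 2; i <= NF; i++)
                    if ($i ~ /^SN:/) rank[substr($i, 4)] = ++n
            next
        }
        /^#/ || NF < 5 { next }      # skip comments and malformed lines
        {
            # Contigs absent from the header are placed last.
            r = ($1 in rank) ? rank[$1] : n + 1
            print r, $0
        }
    ' "$header" "$gff" |
        sort -t "$(printf '\t')" -k1,1n -k5,5n |
        cut -f 2-
}

# Example, not run here:
#   samtools view -H IR_t1_Mh/accepted_hits_sorted.bam > header.sam
#   sort_gff_by_header Mh.gtf header.sam > Mh.sorted.gtf
```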

                  Comment


                  • #10
                    Sounds to me like Siva's probably got the right idea, but it's also worth pointing out that tophat will, by default, sort the reads in accepted_hits. You can turn this off, but if you're just going to sort them anyways, it doesn't seem to make sense.

                    Comment


                    • #11
                      Originally posted by rflrob View Post
                      Sounds to me like Siva's probably got the right idea, but it's also worth pointing out that tophat will, by default, sort the reads in accepted_hits. You can turn this off, but if you're just going to sort them anyways, it doesn't seem to make sense.
                      Actually I thought I had the right idea....but I just realized that the "sort order of reads in BAM" refers to the aligned reads vs the headers. The aligned reads and the headers in the BAM file must be sorted in the exact same way. I think that should solve the problem. It doesn't have anything to do with the GFF file.
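
The header-versus-reads idea above can be checked with a small sketch like the following, which reads SAM text (header plus alignments, as printed by `samtools view -h`). The function name `check_sam_order` is hypothetical, and this is only an approximation of whatever check Cufflinks performs internally.

```shell
# check_sam_order [SAM_FILE]
# Hypothetical helper: succeed if aligned reads appear in the same
# reference order as the @SQ header lines, with non-decreasing positions.
check_sam_order() {
    awk -F '\t' '
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) rank[substr($i, 4)] = ++n
            next
        }
        /^@/ { next }            # other header lines
        $3 == "*" { next }       # unmapped reads have no reference
        {
            if (!($3 in rank)) {
                printf "reference %s is not in the header\n", $3
                bad = 1
                exit
            }
            r = rank[$3]
            p = $4 + 0
            if (r < last_r || (r == last_r && p < last_p)) {
                printf "out of order at %s:%d\n", $3, p
                bad = 1
                exit
            }
            last_r = r
            last_p = p
        }
        END { exit (bad ? 1 : 0) }
    ' "$@"
}

# Example, not run here:
#   samtools view -h accepted_hits_sorted.bam | check_sam_order
```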

                      Comment


                      • #12
                        When I tried running cufflinks, it said,
                        "this SAM file doesn't appear to be correctly sorted!
                        current hit is at Mh:0004:MhA1_Contig4:380, last one was at Mh:0003:MhA1_Contig3:890....................."

                        So I checked the GFF and SAM files.
                        In the SAM file there are contigs starting from Contig3, while in the GFF file there is no Contig3 at all. That is, only Contig2 and Contig4 exist in the GFF file.

                        I guess the incomplete GFF file caused the error when running Cufflinks, even though TopHat previously ran without problems using the same GFF file.
                        Is my guess right, so that what I need to do is make a new GFF file based on the SAM/BAM files? Does Cufflinks require the contigs (annotation) to be the same in the GFF file and the SAM/BAM file?
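
The manual comparison described above can be automated roughly as follows. This is a sketch that assumes the annotation is tab-separated GFF/GTF and that the BAM header has been saved as text (for example with `samtools view -H`); the function name `contigs_missing_from_annotation` is hypothetical.

```shell
# contigs_missing_from_annotation ANNOTATION HEADER
# Hypothetical helper: print reference names that appear in the SAM
# header (@SQ SN:) but have no feature lines in the GFF/GTF file.
contigs_missing_from_annotation() {
    gff="$1"
    sam_header="$2"
    tmp_gff=$(mktemp)
    tmp_sam=$(mktemp)
    # Column 1 of each feature line is the contig name; skip comments.
    grep -v '^#' "$gff" | cut -f 1 | sort -u > "$tmp_gff"
    awk -F '\t' '
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) print substr($i, 4)
        }
    ' "$sam_header" | sort -u > "$tmp_sam"
    # comm -13 keeps lines found only in the second (header) list.
    comm -13 "$tmp_gff" "$tmp_sam"
    rm -f "$tmp_gff" "$tmp_sam"
}

# Example, not run here:
#   samtools view -H IR_t1_Mh/accepted_hits_sorted.bam > header.sam
#   contigs_missing_from_annotation Mh.gff header.sam
```

An empty result means every contig in the header has at least one annotated feature; it says nothing about whether the two files list contigs in the same order.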

                        Thanks.
                        Last edited by syintel87; 01-17-2013, 01:25 PM.

                        Comment


                        • #13
                          Originally posted by syintel87 View Post
                          I guess the incomplete GFF file caused the error when running Cufflinks, even though TopHat previously ran without problems using the same GFF file.
                          Is my guess right, so that what I need to do is make a new GFF file based on the SAM/BAM files? Does Cufflinks require the contigs (annotation) to be the same in the GFF file and the SAM/BAM file?

                          Thanks.
                          Do keep us updated. In my case, the BAM file appears to be correctly sorted if you look at the headers. I dropped the -G option for Cufflinks (did not supply a GFF file), and I still get the same error: "sort order of reads in BAMs must be the same".

                          Comment


                          • #14
                            I have run Cufflinks without the -G option and got four output files:
                            - genes.fpkm_tracking,
                            - isoforms.fpkm_tracking,
                            - skipped.gtf,
                            - and transcripts.gtf.

                            In my case, it seems that Cufflinks works only when the GFF file is dropped.
                            Last edited by syintel87; 01-17-2013, 02:55 PM.

                            Comment


                            • #15
                              By default, the accepted_hits.bam file is co-ordinate sorted and does not need any further sorting to run cufflinks on it. Have you tried the accepted_hits.bam 'as is' generated by Tophat?
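
To see what sort order a BAM file declares before re-sorting it, the `SO` tag of the `@HD` header line can be inspected. The sketch below works on header text (for example from `samtools view -H`); the function name `sam_sort_order` is hypothetical, and the tag only records what the producing tool declared, not a verification of the actual read order.

```shell
# sam_sort_order [HEADER_FILE]
# Hypothetical helper: print the SO value of the @HD line, or "unknown"
# when no SO tag is present (the default in the SAM specification).
sam_sort_order() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) {
                    print substr($i, 4)
                    found = 1
                }
        }
        END { if (!found) print "unknown" }
    ' "$@"
}

# Example, not run here:
#   samtools view -H IR_t1_Mh/accepted_hits.bam | sam_sort_order
```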

                              Also, I would suggest using the GTF from iGenomes if you are mapping to one of the 10 genomes available here:

                              http://tophat.cbcb.umd.edu/igenomes.html

                              Converting GFF to GTF is then not necessary, as you can use the 'genes.gtf' available within the iGenomes directory structure for your genome of interest.

                              I would also strongly advise running TopHat with a Bowtie index built on the same genome for which you provide the GTF file in the Cufflinks run.

                              Thanks
                              Parthav


                              Originally posted by syintel87 View Post
                              After running TopHat, I sorted "accepted_hits.bam" to produce "accepted_hits_sorted.bam", and also converted "Mh.gff" to "Mh.gtf" with the gffread utility.

                              However, when I run the commands below, an error message keeps appearing. Could you please give me some tips on fixing this error?

                              ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gtf -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

                              ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gff -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

                              You are using Cufflinks v2.0.2, which is the most recent release.
                              Error: sort order of reads in BAMs must be the same

                              Comment
