  • Q. Cufflinks: sort order of reads in BAMs must be the same

    After running tophat, I sorted "accepted_hits.bam", producing "accepted_hits_sorted.bam", and also converted "Mh.gff" to "Mh.gtf" with the gffread utility.

    However, when I run either of the commands below, the same error message keeps appearing. Could you please give me some tips on fixing this error?

    ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gtf -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

    ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gff -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

    You are using Cufflinks v2.0.2, which is the most recent release.
    Error: sort order of reads in BAMs must be the same

  • #2
    See here


    Bottom line: Check you are using the most up-to-date version of Tophat (2)



    • #3
      Originally posted by TonyBrooks View Post
      See here


      Bottom line: Check you are using the most up-to-date version of Tophat (2)
      I used tophat2.
      The command that I used is the following:

      tophat2 -p 6 -o IR_t1_Mh -G Mh.gff Mh read1.fq,read2.fq,read3.fq,read4.fq,read5.fq 2>> tophat.log &



      • #4
        Maybe there's a problem with aligning multiple fastq's in the same bam file. I'm not sure how samtools would sort that file which then may cause problems for tophat.
        Maybe you could align one fastq at a time (assuming these are single, not paired reads). Then sort and run cufflinks separately too.



        • #5
          Originally posted by TonyBrooks View Post
          Maybe there's a problem with aligning multiple fastq's in the same bam file. I'm not sure how samtools would sort that file which then may cause problems for tophat.
          Maybe you could align one fastq at a time (assuming these are single, not paired reads). Then sort and run cufflinks separately too.
          All of the reads in read1.fq, read2.fq, ..., read5.fq were taken from the same tissue. The only difference between them is the lane on which they were loaded, so I need to put them together in one command line. If I do this, shouldn't the following cufflinks step work on this tophat output?

          If the command above is wrong, in which cases should multiple read files be given as a comma-separated list in the tophat command? And if cufflinks does not work in this case, what alternative approach with the same function as cufflinks could be used to analyze them?


          • #6
            When I ran the command below,
            samtools tview exercise/F/F/IR_t1_Mh/accepted_hits_sorted.bam exercise/F/F/Mh.fa | more

            I got the result shown in the attachment.

            Could you see what the problem is?
            Does the BAM file look as expected?


            • #7
              When I ran the command below,
              samtools view -h accepted_hits_sorted.bam | more
              I got the result shown in the attachment.

              Could you see what the problem is?
              Thanks in advance.


              • #8
                I'm not really a bioinformatician, so I'm not 100% sure. I think using a comma separated list treats each fastq as being from a different source (from a quick check of the tophat manual).
                Can you not just cat the files together before running them?
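A minimal sketch of the concatenation suggested above, assuming the lane files are single-end, uncompressed FASTQ files from the same library. The function name `merge_lanes` and the output file name are made up for illustration.

```shell
#!/usr/bin/env bash
# merge_lanes OUT IN1 [IN2 ...]
# Concatenate per-lane FASTQ files into one file (hypothetical helper).
# Refuses to overwrite an existing output and checks inputs before writing.
merge_lanes() {
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_lanes OUT IN1 [IN2 ...]" >&2
        return 1
    fi
    local out="$1"
    shift
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    local f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done
    # FASTQ is plain text, so simple concatenation keeps every record intact.
    cat -- "$@" > "$out"
}
```

It could be called as `merge_lanes all_lanes.fq read1.fq read2.fq read3.fq read4.fq read5.fq`, with `all_lanes.fq` then given to tophat2 as the single read file. For paired-end data, the two mate files would need to be merged separately, listing the lanes in the same order for both.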



                • #9
                  Originally posted by syintel87 View Post
                  After running tophat, I sorted "accepted_hits.bam", producing "accepted_hits_sorted.bam", and also converted "Mh.gff" to "Mh.gtf" with the gffread utility.

                  However, when I run either of the commands below, the same error message keeps appearing. Could you please give me some tips on fixing this error?

                  ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gtf -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

                  ~/cufflinks-2.0.2.Linux_x86_64/cufflinks -g Mh.gff -o cufflinks_out_IR_t1_Mh IR_t1_Mh/accepted_hits_sorted.bam

                  You are using Cufflinks v2.0.2, which is the most recent release.
                  Error: sort order of reads in BAMs must be the same
                  If you sort your GFF files in the same order as the BAM files you should be able to get it to work. Alternatively drop the GFF files.



                  • #10
                    Sounds to me like Siva's probably got the right idea, but it's also worth pointing out that tophat will, by default, sort the reads in accepted_hits. You can turn this off, but if you're just going to sort them anyways, it doesn't seem to make sense.



                    • #11
                      Originally posted by rflrob View Post
                      Sounds to me like Siva's probably got the right idea, but it's also worth pointing out that tophat will, by default, sort the reads in accepted_hits. You can turn this off, but if you're just going to sort them anyways, it doesn't seem to make sense.
                      Actually I thought I had the right idea....but I just realized that the "sort order of reads in BAM" refers to the aligned reads vs the headers. The aligned reads and the headers in the BAM file must be sorted in the exact same way. I think that should solve the problem. It doesn't have anything to do with the GFF file.
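One way to test the idea above is to check whether the alignment records follow the order of the `@SQ` lines in the header. The sketch below assumes coordinate order means "references ranked by their position in the header, then by ascending start position", which is how `samtools sort` orders records by default. Whether cufflinks applies exactly this check is not confirmed in this thread, and `check_sam_order` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_sam_order: read SAM text (header + records) on stdin and report the
# first record that breaks coordinate order, where references are ranked by
# the order of their @SQ header lines. Hypothetical helper, not part of
# samtools or cufflinks.
check_sam_order() {
    awk -F'\t' '
        /^@SQ/ {
            # Remember the rank of each reference name (SN: tag).
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) rank[substr($i, 4)] = ++n
            next
        }
        /^@/ { next }          # skip the other header lines
        $3 == "*" { next }     # unmapped reads carry no coordinate
        {
            if (!($3 in rank)) {
                printf "record %s: reference %s is not in the header\n", $1, $3
                bad = 1; exit
            }
            r = rank[$3]; p = $4 + 0
            if (r < last_r || (r == last_r && p < last_p)) {
                printf "record %s at %s:%d follows %s:%d\n", $1, $3, p, last_ref, last_p
                bad = 1; exit
            }
            last_r = r; last_p = p; last_ref = $3
        }
        END { if (bad) exit 1; print "order matches header" }
    '
}
```

With samtools installed, it could be fed with `samtools view -h accepted_hits_sorted.bam | check_sam_order`. A non-zero exit status means a record was found out of order relative to the header.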



                      • #12
                        When I tried running cufflinks, it said,
                        "this SAM file doesn't appear to be correctly sorted!
                        current hit is at Mh:0004:MhA1_Contig4:380, last one was at Mh:0003:MhA1_Contig3:890....................."

                        So I checked the GFF and SAM files.
                        In the SAM file the contigs start from Contig3, while the GFF file has no Contig3 at all; only Contig2 and Contig4 exist in the GFF file.

                        I guess the incomplete GFF file caused the error when running cufflinks, even though tophat previously ran without problems using the same GFF file.
                        Is my guess right, so that I need to make a new GFF file based on the SAM/BAM files? Does cufflinks require the contigs (annotation) to be the same in the GFF file and the SAM/BAM file?
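The comparison described above can be sketched as a small helper that lists sequence names present in only one of the two files. It assumes a tab-separated GFF or GTF file with the sequence name in column 1, and a SAM header saved as text (for example from `samtools view -H`). The name `compare_contigs` is hypothetical.

```shell
#!/usr/bin/env bash
# compare_contigs GFF HEADER
# GFF:    annotation file (GFF or GTF; column 1 is the sequence name)
# HEADER: SAM header text, e.g. saved from `samtools view -H`
# Prints sequence names found in only one of the two files.
# Hypothetical helper for illustration.
compare_contigs() {
    local gff="$1" header="$2"
    local ann ref
    ann=$(mktemp) || return 1
    ref=$(mktemp) || { rm -f "$ann"; return 1; }
    # Sequence names used by annotation features (skip comments and short lines).
    awk -F'\t' '!/^#/ && NF >= 9 { print $1 }' "$gff" | LC_ALL=C sort -u > "$ann"
    # Sequence names declared in the @SQ header lines (SN: tag).
    awk -F'\t' '/^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$header" \
        | LC_ALL=C sort -u > "$ref"
    LC_ALL=C comm -23 "$ann" "$ref" | sed 's/^/only in annotation: /'
    LC_ALL=C comm -13 "$ann" "$ref" | sed 's/^/only in alignments: /'
    rm -f "$ann" "$ref"
}
```

A possible use is `samtools view -H accepted_hits.bam > header.sam` followed by `compare_contigs Mh.gff header.sam`. Names reported as "only in alignments" usually just mean that a contig has no annotated features, whereas names reported as "only in annotation" would point to a naming mismatch between the annotation and the reference used for alignment.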

                        Thanks.
                        Last edited by syintel87; 01-17-2013, 01:25 PM.



                        • #13
                          Originally posted by syintel87 View Post
                          I guess the incomplete GFF file caused the error when running cufflinks, even though tophat previously ran without problems using the same GFF file.
                          Is my guess right, so that I need to make a new GFF file based on the SAM/BAM files? Does cufflinks require the contigs (annotation) to be the same in the GFF file and the SAM/BAM file?

                          Thanks.
                          Do keep us updated. In my case the BAM file appears to be correctly sorted if you look at the headers. I dropped the -G option for cufflinks (did not supply a GFF file) and still get the same error: "sort order of reads in BAMs must be the same".


                          • #14
                            I have run cufflinks without the -G option and got four output files:
                            - genes.fpkm_tracking
                            - isoforms.fpkm_tracking
                            - skipped.gtf
                            - transcripts.gtf

                            In my case, it seems that cufflinks works only when the GFF file is dropped.
                            Last edited by syintel87; 01-17-2013, 02:55 PM.



                            • #15
                              By default, the accepted_hits.bam file is co-ordinate sorted and does not need any further sorting to run cufflinks on it. Have you tried the accepted_hits.bam 'as is' generated by Tophat?
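To see what sort order a BAM file declares, one can read the `SO:` tag on the `@HD` line of its header. The sketch below works on header text from stdin; `sam_sort_order` is a hypothetical helper name, and it only reports what the header claims without verifying the records themselves.

```shell
#!/usr/bin/env bash
# sam_sort_order: read SAM header text on stdin and print the sort order
# declared in the @HD line (the SO: tag), or "undeclared" if there is none.
# Hypothetical helper for illustration.
sam_sort_order() {
    awk -F'\t' '
        /^@HD/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
        }
        END { if (!found) print "undeclared" }
    '
}
```

With samtools installed, `samtools view -H IR_t1_Mh/accepted_hits.bam | sam_sort_order` would be expected to print `coordinate` for an unmodified tophat output, if the default described above applies.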

                              I would also suggest using the GTF from iGenomes if you are mapping to one of the 10 genomes available here:

                              http://tophat.cbcb.umd.edu/igenomes.html

                              Converting GFF to GTF is then unnecessary, as you can use the 'genes.gtf' available within the iGenomes directory structure of your genome of interest.

                              It is also highly advisable to run tophat with a bowtie index built on the same genome as the GTF file you provide in the cufflinks run.

                              Thanks
                              Parthav


