  • delasa
    Member
    • Sep 2012
    • 12

    #31
    dexseq_count.py keeps complaining that a read "claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)". I sorted the BAM files using "samtools sort" and then converted them to SAM files using "samtools view", but I still get these messages!

    • Simon Anders
      Senior Member
      • Feb 2010
      • 995

      #32
      Have you checked your free space? Out of quota, maybe? Or maybe 'sort' puts its temporary files in /tmp (see option -T), and at least on our big server, that one is on a small partition which is always full. Also, use '-S 100G' to tell 'sort' that you have lots of memory. BTW, why '-s -c'? We don't need a stable sort.

      • Simon Anders
        Senior Member
        • Feb 2010
        • 995

        #33
        'samtools sort' sorts by position. Use 'samtools sort -n' to sort by name.
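To illustrate why the sort order matters for paired-end data, here is a minimal Python sketch with made-up records. It is not samtools code: plain string comparison stands in for samtools' own name ordering, and the read names and positions are invented.

```python
# Toy alignment records: (read_name, position). All values are invented.
records = [
    ("readB", 100), ("readA", 150), ("readB", 300), ("readA", 900),
]

def mates_adjacent(recs):
    """Return True if every read name forms one contiguous run of records."""
    seen = set()
    previous = None
    for name, _pos in recs:
        if name != previous:
            if name in seen:
                # This name appeared earlier, with other reads in between.
                return False
            seen.add(name)
        previous = name
    return True

by_position = sorted(records, key=lambda r: r[1])  # analogous to 'samtools sort'
by_name = sorted(records, key=lambda r: r[0])      # analogous to 'samtools sort -n'

print(mates_adjacent(by_position))
print(mates_adjacent(by_name))
```

In the position-sorted list the two readB records are separated by a readA record, which is the situation a tool expecting name-sorted input would complain about.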

        • wangxj
          Junior Member
          • Sep 2012
          • 7

          #34
          I have the same questions as in the post quoted below.

          Can anyone answer them?

          Much appreciated.

          Originally posted by glados:
          Dear all.

          I'm trying to find information about how HTSeq counts reads. I understood that one pair (properly paired) is counted as 1 count.
          What about pairs that are not flagged as 'properly paired'?
          What about the reads that lost their mate and became single reads?
          Are they counted as 1 count as well? Or not counted at all?

          Additionally, I'm losing quite a lot of reads that have multiple mappings. Has anyone figured out a way to deal with these in HTSeq, instead of just throwing them all out?

          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #35
            A read that lost its mate will be counted once and a warning will be produced if the unmapped mate isn't actually in the file (tophat does this). I don't recall htseq-count caring about the properly paired flag.

            Regarding multimappers, you don't know with certainty where they align, so for the downstream analyses that htseq-count is oriented toward, the proper solution is to discard them.
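As a rough illustration of the counting rule described above (one count per fragment, whether or not the mate is present, with multimappers dropped), here is a toy Python sketch. It is not htseq-count's code. It assumes paired-end data, invented read names, and that each record carries an NH-style value giving the number of reported alignments; overlap with features is ignored entirely.

```python
from collections import Counter

def count_fragments(alignments):
    """Count fragments from (read_name, nh) records.

    Each kept read name counts once, whether one or both mates are present.
    Names with any record where nh > 1 are treated as multimappers and dropped.
    Returns (number_counted, names_with_missing_mate, number_discarded).
    """
    records_per_name = Counter()
    multimapped = set()
    for name, nh in alignments:
        records_per_name[name] += 1
        if nh > 1:
            multimapped.add(name)
    kept = [n for n in records_per_name if n not in multimapped]
    # A kept name with a single record has no mate in the file.
    orphans = [n for n in kept if records_per_name[n] == 1]
    return len(kept), orphans, len(multimapped)

toy = [
    ("pair1", 1), ("pair1", 1),    # both mates present
    ("orphan1", 1),                # mate missing from the file
    ("multi1", 3), ("multi1", 3),  # multimapper, discarded
]
counted, orphans, discarded = count_fragments(toy)
print(counted, orphans, discarded)
```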
            Last edited by dpryan; 09-09-2013, 08:11 AM.

            • jkbonfield
              Senior Member
              • Jul 2008
              • 146

              #36
              This task is related to bamtofastq conversion in that collation by name is necessary, but not necessarily a full sort.

              Collation is often fast, while sorting is very slow. I don't know if there are dedicated collation tools out there (but I'd be surprised if there aren't).
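To make the distinction concrete, here is a minimal Python sketch of collation: mates are matched through a dictionary as they stream past, so nothing is ever globally sorted. This is only an illustration of the idea, not the algorithm of any particular tool, and the record layout (name, payload) is an assumption.

```python
def collate_pairs(records):
    """Yield (name, first_payload, second_payload) once both mates are seen.

    records: iterable of (read_name, payload) tuples in any order.
    Only reads still waiting for their mate are held in memory.
    Reads whose mate never shows up are yielded last with None as the mate.
    """
    pending = {}
    for name, payload in records:
        if name in pending:
            # Second mate arrived: emit the pair and forget it.
            yield (name, pending.pop(name), payload)
        else:
            pending[name] = payload
    for name, payload in pending.items():
        yield (name, payload, None)

stream = [("a", "a/1"), ("b", "b/1"), ("a", "a/2"), ("c", "c/1"), ("b", "b/2")]
collated = list(collate_pairs(stream))
print(collated)
```

Each record is touched once, whereas a full sort has to order every record relative to every other. For coordinate-sorted input the mates are usually close together, so the pending dictionary tends to stay small.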

              • gt1
                Junior Member
                • Jul 2013
                • 9

                #37
                Originally posted by jkbonfield:
                This task is related to bamtofastq conversion in that collation by name is necessary, but not necessarily a full sort.

                Collation is often fast, while sorting is very slow. I don't know if there are dedicated collation tools out there (but I'd be surprised if there aren't).

                Try bamcollate2 in biobambam (https://github.com/gt1/biobambam).

                • Gonza
                  Member
                  • Mar 2013
                  • 78

                  #38
                  Dear All,

                  Why not sort the BAM files instead of the SAM files? SAM takes too much space. After running tophat2, I am thinking of something like this:

                  $ samtools sort -n accepted_hits.bam output.bam

                  then count using htseq-count.

                  Thoughts?

                  • dpryan
                    Devon Ryan
                    • Jul 2011
                    • 3478

                    #39
                    There's no need to name sort anymore; htseq-count can handle coordinate-sorted BAM files.

                    • jkbonfield
                      Senior Member
                      • Jul 2008
                      • 146

                      #40
                      Originally posted by Gonza:
                      Dear All,

                      Why not sort the BAM files instead of the SAM files? SAM takes too much space. After running tophat2, I am thinking of something like this:

                      $ samtools sort -n accepted_hits.bam output.bam

                      then count using htseq-count.

                      Thoughts?

                      You will of course get output.bam.bam. Oh how I hate thee, Samtools!
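For anyone puzzled by the double extension: older samtools releases take an output prefix as the second argument to 'samtools sort' and append '.bam' themselves. A small helper that derives the prefix could look like the sketch below; the function name is made up for this illustration.

```python
def legacy_sort_prefix(desired_output):
    """Strip a trailing '.bam' so a prefix-style sort does not yield '.bam.bam'."""
    suffix = ".bam"
    if desired_output.endswith(suffix):
        return desired_output[:-len(suffix)]
    return desired_output

prefix = legacy_sort_prefix("output.bam")
# Argument list for the prefix-style invocation discussed in this thread.
command = ["samtools", "sort", "-n", "accepted_hits.bam", prefix]
print(" ".join(command))
```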

                      • zaki
                        Member
                        • Dec 2012
                        • 15

                        #41
                        Dear all,

                        Originally posted by dpryan:
                        There's no need to name sort anymore; htseq-count can handle coordinate-sorted BAM files.

                        Looking at htseq version 0.6.0, the help text says:

                        Code:
                         -r ORDER, --order=ORDER
                                                'pos' or 'name'. Sorting order of <alignment_file>
                                                (default: name). Paired-end sequencing data must be
                                                sorted either by position or by read name, and the
                                                sorting order must be specified. Ignored for single-
                                                end data.
                        Please forgive this very naive question: is a position-sorted BAM the same as a coordinate-sorted BAM? I'm guessing yes, but a confirmation would be reassuring.

                        Many thanks

                        • dpryan
                          Devon Ryan
                          • Jul 2011
                          • 3478

                          #42
                          Yes, position sorted is another name for coordinate sorted.
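Putting the thread together, a hedged sketch of how one might assemble the htseq-count call for a coordinate-sorted BAM is shown below. The '-r' option is the one quoted from the help text in this thread; '-f bam' is htseq-count's input-format option as I understand it, and 'genes.gtf' is a placeholder annotation file name, so check both against your installed version.

```python
import shlex

def htseq_count_command(alignment_file, gtf_file, order="pos"):
    """Build an htseq-count argument list; order is 'pos' or 'name'."""
    if order not in ("pos", "name"):
        raise ValueError("order must be 'pos' or 'name'")
    # Guess the input format from the file extension (an assumption of this sketch).
    fmt = "bam" if alignment_file.endswith(".bam") else "sam"
    return ["htseq-count", "-f", fmt, "-r", order, alignment_file, gtf_file]

# 'genes.gtf' is a hypothetical annotation file.
cmd = htseq_count_command("accepted_hits.bam", "genes.gtf")
print(" ".join(shlex.quote(part) for part in cmd))
```

The list can be passed to subprocess.run with stdout redirected to a file, since htseq-count writes its counts to standard output.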
