  • sbdk82
    Member
    • Jul 2014
    • 26

    HTSeq warning message

    I am using htseq-count to count paired-end reads (2 samples with 3 replicates each), but I am getting a warning that says "some reads with missing mates encountered". Am I doing something wrong? Below are the warnings for all 6 replicates.

    HTML Code:
    Warning: 29572652 reads with missing mate encountered.
    79486404 SAM alignment pairs processed.
    
    Warning: 29467379 reads with missing mate encountered.
    74028848 SAM alignment pairs processed.
    
    Warning: 41994492 reads with missing mate encountered.
    108334368 SAM alignment pairs processed.
    
    Warning: 31994985 reads with missing mate encountered.
    81980266 SAM alignment pairs processed.
    
    Warning: 145791964 reads with missing mate encountered.
    150324577 SAM alignment pairs processed.
    
    Warning: 128675855 reads with missing mate encountered.
    132292695 SAM alignment pairs processed.
    I am using the following command:
    HTML Code:
    htseq-count -r name -m intersection-strict -s no -i gene_id  <sorted_by_name.sam> <GTF File>   >  <Count File>
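To compare the six replicates at a glance, the two counts from each warning above can be put side by side. This is only a rough sketch: it assumes the two numbers in each warning can be compared directly, which the htseq-count output itself does not spell out, and `missing_mate_ratio` is a hypothetical helper name.

```python
# Counts copied from the htseq-count warnings above:
# (reads with missing mate, SAM alignment pairs processed)
WARNINGS = [
    (29572652, 79486404),
    (29467379, 74028848),
    (41994492, 108334368),
    (31994985, 81980266),
    (145791964, 150324577),
    (128675855, 132292695),
]


def missing_mate_ratio(missing, pairs):
    """Return missing-mate reads per processed alignment pair (assumed comparable)."""
    return missing / pairs


for i, (missing, pairs) in enumerate(WARNINGS, start=1):
    print(f"replicate {i}: {missing_mate_ratio(missing, pairs):.3f}")
```

On these numbers the first four replicates come out at roughly 0.37 to 0.40, while the last two are above 0.95, so those two may be worth inspecting first.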
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    Which aligner did you use? Odds are good that these are singletons.

    • sbdk82
      Member
      • Jul 2014
      • 26

      #3
      I used BWA-MEM for alignment, then used samtools to name-sort the SAM file as follows:

      HTML Code:
      samtools view -b -S File.SAM > File.BAM
      samtools sort -n File.BAM   File_sorted
      samtools view -h  File_sorted.bam > File_sorted.sam

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        What happens if you:
        Code:
        samtools view -SF 8 File_sorted.sam | htseq-count -r name -m intersection-strict -s no -i gene_id  - <GTF File>   >  <Count File>
        If the warnings go away, then you know that this is due to singletons. With bwa mem, this could also be due to chimeric/fusion/non-linear alignments. I don't use bwa mem with RNAseq datasets, so I've not thought much about how such alignments might get treated by htseq-count.
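For reference, `-F 8` asks samtools view to drop every record whose FLAG has bit 0x8 (mate unmapped) set. A minimal Python sketch of that test follows; the bit values are those of the SAM specification, while `decode_flag` and `passes_F` are hypothetical helper names used only for illustration.

```python
# FLAG bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def passes_F(flag, exclude):
    """Mimic `samtools view -F`: keep a record only if no bit of `exclude` is set."""
    return (flag & exclude) == 0


for flag in (99, 73, 133):
    print(flag, decode_flag(flag), "kept by -F 8:", passes_F(flag, 0x8))
```

FLAG 73 is a mapped read whose mate is unmapped (a singleton), so it is removed. FLAG 133, the unmapped mate itself, does not carry 0x8 and is therefore not removed by this particular filter.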

        BTW, you can skip the sorting and conversion to/from BAM and just use the initial SAM file (or BAM, if you pipe bwa mem to samtools). You don't actually have to name-sort things; the aligner will output pairs together anyway, and that's all that htseq-count wants.
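Whether a given SAM file really has each pair on adjacent lines can be checked directly. The sketch below is an illustration under stated assumptions, not htseq-count's own logic: it looks only at primary records (skipping secondary 0x100 and supplementary 0x800 alignments) and counts runs of identical read names whose length is not exactly 2. The function names are hypothetical.

```python
from itertools import groupby


def primary_qnames(sam_lines):
    """Yield the read name of every primary alignment record, in file order."""
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & (0x100 | 0x800):  # secondary or supplementary record
            continue
        yield qname


def count_unpaired_runs(sam_lines):
    """Count runs of consecutive identical read names that are not of length 2."""
    return sum(
        1 for _, run in groupby(primary_qnames(sam_lines)) if len(list(run)) != 2
    )


# Toy records with only the first columns filled in; real SAM lines have more fields.
toy = [
    "@HD\tVN:1.0",
    "r1\t99\tchr1\t100",
    "r1\t147\tchr1\t300",
    "r2\t73\tchr1\t500",     # singleton: its mate is absent from the file
    "r3\t65\tchr1\t700",
    "r3\t2113\tchr2\t50",    # supplementary record, ignored
    "r3\t129\tchr1\t900",
]
print(count_unpaired_runs(toy))
```

On a real file the same function can be called with an open file handle in place of the list. A non-zero count means some reads lack an adjacent mate among the primary records.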

        • sbdk82
          Member
          • Jul 2014
          • 26

          #5
          Thanks !! I will try that.

          I used the original SAM as you suggested in an earlier post, but that gave me an error, so I tried a name-sorted SAM instead. I am using BWA-MEM because TopHat2 cannot handle the large files. Can you suggest another aligner for RNA-Seq reads?

          I am trying the STAR aligner now, but I get a segmentation fault while mapping the reads:

          HTML Code:
          ./STAR --genomeDir /PATH/index/ --readFilesIn /Read_PATH/_L001_R1.fastq, /Read_PATH/_L002_R1.fastq /Read_PATH/_L001_R2.fastq, /ReadPATH/_L002_R2_001.fastq --runThreadN 10
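One detail that may be worth checking in the command above is the space after each comma. The STAR manual describes `--readFilesIn` as a comma-separated list of read 1 files, then a space, then a comma-separated list of read 2 files, with no spaces inside either list. Whether this has anything to do with the segmentation fault is not known. A small Python sketch that assembles the arguments in that form follows; the file names and the `star_args` helper are hypothetical.

```python
def star_args(genome_dir, read1_files, read2_files, threads):
    """Build a STAR argument list with comma-joined (no spaces) read file lists."""
    return [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", ",".join(read1_files), ",".join(read2_files),
        "--runThreadN", str(threads),
    ]


# Hypothetical lane files; substitute the real paths.
args = star_args(
    "index",
    ["sample_L001_R1.fastq", "sample_L002_R1.fastq"],
    ["sample_L001_R2.fastq", "sample_L002_R2.fastq"],
    10,
)
print(" ".join(args))
```

Passing the list to `subprocess.run(args)` rather than a single shell string also avoids the shell splitting a file list at a stray space.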

          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #6
            I was about to suggest STAR; it's what I use. Which version of STAR are you using? I know that some people have reported segfaults with some of the recent versions (btw, you should post that to the user forum; Alex Dobin is one of the few aligner authors who actually provides timely feedback), though I've not had any issues myself.

            • sbdk82
              Member
              • Jul 2014
              • 26

              #7
              I am using the latest version (2.3.0.1). Yes, I saw that some other people are also getting segfaults. I will post this to the Google group. Anyway, thanks a lot for your help.

              • dpryan
                Devon Ryan
                • Jul 2011
                • 3478

                #8
                It's actually up to 2.4.0c, but you'll have to get it from GitHub now.

                • sbdk82
                  Member
                  • Jul 2014
                  • 26

                  #9
                  I see. Thanks !!
