  • HTSeq version inconsistency and the STRANDED option

    Hi all,

    I have used tophat2 to map human RNA-Seq reads to UCSC hg19.
    The result is a BAM file. I now want to count the reads mapped to each gene, and I have chosen HTSeq for this.
    I have downloaded and installed HTSeq-0.6.1, but when I type
    Code:
    htseq-count -h
    the help text ends with
    Code:
    Public License v3. Part of the 'HTSeq' framework, version 0.5.4p5.
    Is something wrong with the installation?

    And also, what exactly does the '-s' option do?
    Code:
      -s STRANDED, --stranded=STRANDED
                            whether the data is from a strand-specific assay.
                            Specify 'yes', 'no', or 'reverse' (default: yes).
                            'reverse' means 'yes' with reversed strand
                            interpretation
    What does 'strand-specific assay' mean, and should I use the '-s' option when counting human RNA-Seq reads mapped to hg19?

    Cheers,
    Leon

  • #2
    Presumably it's finding the older version before the newer one. Just remove the old version.

    Regarding strandedness, see here. For context, strand-specific is the same concept as directional. If you don't know what a strand is then ask a biologist for a primer on the central dogma of molecular biology.


    • #3
      Originally posted by dpryan:
      Presumably it's finding the older version before the newer one. Just remove the old version.

      Regarding strandedness, see here. For context, strand-specific is the same concept as directional. If you don't know what a strand is then ask a biologist for a primer on the central dogma of molecular biology.
      Hi dpryan,

      I am sure that I am calling the htseq-count I actually installed...

      As for strand-specific assays: I am a newbie to NGS, hence my profile and what may seem like "simple" questions. In the NGS context, I am not sure I understand exactly what it means for a sequencing protocol to be strand-specific.

      Cheers,
      Leon


      • #4
        Yes, RNA-Seq can be strand-specific.
        In fact, at our molecular biology platform, we now only do strand-specific RNA-Seq, since it eliminates the ambiguity when there are different genes on opposite strands.
        There are several protocols to prepare strand-specific RNA-Seq libraries. We use the dUTP protocol.
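As a rough illustration of what the -s/--stranded setting changes, the toy function below applies the rule that the htseq-count documentation gives for single-end reads: with 'yes' a read counts for a feature only if it maps to the feature's strand, with 'reverse' only if it maps to the opposite strand, and with 'no' the strand is ignored. The function name and its arguments are made up for this sketch and are not part of HTSeq; paired-end handling (where the second mate is expected on the opposite strand) is left out.

```python
def strand_compatible(read_strand, feature_strand, stranded="yes"):
    """Toy model of the -s/--stranded rule for single-end reads.

    read_strand, feature_strand: '+' or '-'.
    stranded: 'yes', 'no' or 'reverse', as accepted by htseq-count -s.
    Hypothetical helper for illustration only; not an HTSeq function.
    """
    if stranded == "no":
        # Unstranded library: the read's strand carries no information.
        return True
    if stranded == "yes":
        # Read must lie on the same strand as the annotated feature.
        return read_strand == feature_strand
    if stranded == "reverse":
        # Read must lie on the strand opposite to the annotated feature.
        return read_strand != feature_strand
    raise ValueError("stranded must be 'yes', 'no' or 'reverse'")


# A read on the minus strand overlapping a gene annotated on the plus strand:
for mode in ("yes", "no", "reverse"):
    print(mode, strand_compatible("-", "+", mode))
```

This is also why the default matters: counting an unstranded library with 'yes' discards roughly half of the reads as being on the wrong strand. dUTP libraries are usually counted with 'reverse', but that is worth confirming against the library prep documentation.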

        Regarding your problem with the htseq-count version, here are a few suggestions.

        First, the correct htseq-count version will appear if it is properly installed.
        Code:
        [blancha@lg-1r17-n03 deduplicatebismark]$ htseq-count -h
        Usage: htseq-count [options] alignment_file gff_file
        
        This script takes an alignment file in SAM/BAM format and a feature file in
        GFF format and calculates for each feature the number of reads mapping to it.
        See http://www-huber.embl.de/users/anders/HTSeq/doc/count.html for details.
        
        Options:
          -h, --help            show this help message and exit
          -f SAMTYPE, --format=SAMTYPE
                                type of <alignment_file> data, either 'sam' or 'bam'
                                (default: sam)
          -r ORDER, --order=ORDER
                                'pos' or 'name'. Sorting order of <alignment_file>
                                (default: name). Paired-end sequencing data must be
                                sorted either by position or by read name, and the
                                sorting order must be specified. Ignored for single-
                                end data.
          -s STRANDED, --stranded=STRANDED
                                whether the data is from a strand-specific assay.
                                Specify 'yes', 'no', or 'reverse' (default: yes).
                                'reverse' means 'yes' with reversed strand
                                interpretation
          -a MINAQUAL, --minaqual=MINAQUAL
                                skip all reads with alignment quality lower than the
                                given minimum value (default: 10)
          -t FEATURETYPE, --type=FEATURETYPE
                                feature type (3rd column in GFF file) to be used, all
                                features of other type are ignored (default, suitable
                                for Ensembl GTF files: exon)
          -i IDATTR, --idattr=IDATTR
                                GFF attribute to be used as feature ID (default,
                                suitable for Ensembl GTF files: gene_id)
          -m MODE, --mode=MODE  mode to handle reads overlapping more than one feature
                                (choices: union, intersection-strict, intersection-
                                nonempty; default: union)
          -o SAMOUT, --samout=SAMOUT
                                write out all SAM alignment records into an output SAM
                                file called SAMOUT, annotating each line with its
                                feature assignment (as an optional field with tag
                                'XF')
          -q, --quiet           suppress progress report
        
        Written by Simon Anders ([email protected]), European Molecular Biology
        Laboratory (EMBL). (c) 2010. Released under the terms of the GNU General
        Public License v3. Part of the 'HTSeq' framework, version 0.6.1p1.
        Second, verify that you are calling the version you just installed:

        Code:
        which htseq-count
        Finally, and this is a common problem, verify that you have changed the PYTHONPATH variable to point to the updated HTSeq library.
        Code:
        echo $PYTHONPATH
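The same checks can be done from inside Python, which also shows which copy of the library actually gets imported. The sketch below is only one way to do it: `locate_tool` and `copies_on_path` are hypothetical helper names, not part of HTSeq, and the version lookup falls back to "unknown" in case the installed module does not expose `__version__`.

```python
import importlib
import os
import shutil
import sys


def locate_tool(executable="htseq-count", module="HTSeq"):
    """Report which script and which library this environment resolves to."""
    # shutil.which searches PATH much like the shell's `which`.
    report = {"executable": shutil.which(executable)}
    try:
        mod = importlib.import_module(module)
        report["module_file"] = getattr(mod, "__file__", None)
        report["module_version"] = getattr(mod, "__version__", "unknown")
    except ImportError:
        report["module_file"] = None
        report["module_version"] = None
    return report


def copies_on_path(module="HTSeq", search_path=None):
    """List directories on the import path holding a copy of the package.

    The list is in lookup order, so for an ordinary package the first
    entry is normally the copy that `import` picks up.
    """
    search_path = sys.path if search_path is None else search_path
    hits = []
    for entry in search_path:
        candidate = os.path.join(entry or ".", module)
        if os.path.isdir(candidate) and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    print(locate_tool())
    for path in copies_on_path():
        print(path)
```

If `copies_on_path()` lists more than one directory, an older copy placed earlier on the path (for example through PYTHONPATH) would explain a new script reporting an old version number.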


        • #5
          OK, so I found out that the RNA-Seq data I have is stranded (strand-specific), so that's all good...
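For a stranded library mapped with tophat2, the counting call might then be put together as in the sketch below. It uses only options that appear in the 0.6.1 help text quoted earlier in the thread (-f, -r, -s). The helper name and the file names are placeholders, and whether 'yes' or 'reverse' is right depends on the library protocol.

```python
import shlex


def htseq_count_command(bam, gtf, stranded="yes", order="pos"):
    """Assemble an htseq-count argument list (hypothetical helper).

    stranded: 'yes', 'no' or 'reverse' (the -s option).
    order: 'pos' or 'name' (the -r option; only matters for paired-end data).
    """
    if stranded not in ("yes", "no", "reverse"):
        raise ValueError("stranded must be 'yes', 'no' or 'reverse'")
    if order not in ("pos", "name"):
        raise ValueError("order must be 'pos' or 'name'")
    return ["htseq-count", "-f", "bam", "-r", order, "-s", stranded, bam, gtf]


# Placeholder file names; substitute your own alignment and annotation files.
cmd = htseq_count_command("accepted_hits.bam", "genes.gtf", stranded="yes")
print(" ".join(shlex.quote(part) for part in cmd))
```

htseq-count writes the count table to standard output, so in practice the printed command would be run with its output redirected to a file.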

          Regarding HTSeq, it turned out that $PYTHONPATH needed updating.

          Cheers,
          Leon
