  • Tophat mapping problem on reads with length=50nt (or 51nt)

    Hi,

    I used TopHat version 2.0.10 to map a FASTQ file of 51 nt reads to the human reference genome with the following command:
    tophat -p 4 --library-type fr-firststrand --GTF genes.gtf -o output_tophat --no-novel-juncs ./genome_references/bowtie2_hg19/hg19 h11648.fq[fastq]

    It failed after building the Bowtie index. The log is as follows:
    [2015-10-14 13:36:10] Beginning TopHat run (v2.0.10)
    -----------------------------------------------
    [2015-10-14 13:36:10] Checking for Bowtie
    Bowtie version: 2.1.0.0
    [2015-10-14 13:36:10] Checking for Samtools
    Samtools version: 0.1.18.0
    [2015-10-14 13:36:10] Checking for Bowtie index files (genome)..
    [2015-10-14 13:36:10] Checking for reference FASTA file
    [2015-10-14 13:36:10] Generating SAM header for /data/Mullen_1/Data/genome_references/bowtie2_hg19/hg19
    [2015-10-14 13:36:23] Reading known junctions from GTF file
    [2015-10-14 13:36:27] Preparing reads
    left reads: min. length=51, max. length=51, 2912 kept reads (0 discarded)
    [2015-10-14 13:36:27] Building transcriptome data files h11648_tophat/tmp/H148hrlncRNA_UCSCgenes
    [2015-10-14 13:36:47] Building Bowtie index from H148hrlncRNA_UCSCgenes.fa
    [2015-10-14 13:48:08] Mapping left_kept_reads to transcriptome genes with Bowtie2
    /apps/source/tophat/2.0.10/bam2fastx: /lib64/libz.so.1: no version information available (required by /apps/source/tophat/2.0.10/bam2fastx)
    /apps/source/tophat/2.0.10/fix_map_ordering: /lib64/libz.so.1: no version information available (required by /apps/source/tophat/2.0.10/fix_map_ordering)
    [FAILED]
    Error running:
    /apps/source/tophat/2.0.10/bam2fastx --all h11648_tophat/tmp/left_kept_reads.bam|/apps/source/bowtie2-2.1.0/bowtie2-2.1.0/bowtie2-align -k 60 -D 15 -R 2 -N 0 -L 20 -i S,1,1.25 --gbar 4 --mp 6,2 --np 1 --rdg 5,3 --rfg 5,3 --score-min C,-14,0 -p 4 --sam-no-hd -x h11648_tophat/tmp/genes -|/apps/source/tophat/2.0.10/fix_map_ordering --bowtie2-min-score 15 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --sam-header h11648_tophat/tmp/H148hrlncRNA_UCSCgenes.bwt.samheader.sam - - h11648_tophat/tmp/left_kept_reads.m2g_um.bam | /apps/source/tophat/2.0.10/map2gtf --sam-header h11648_tophat/tmp/hg19_genome.bwt.samheader.sam h11648_tophat/tmp/genes.fa.tlst - h11648_tophat/tmp/left_kept_reads.m2g.bam > h11648_tophat/logs/m2g_left_kept_reads.out



    The same TopHat parameters ran through single-end reads of 42 nt without problems, but I cannot work out why the run fails on single-end reads of 51 nt. Trimming the reads to 50 nt did not help, and the run also fails on a small dataset.

    Does anyone have any suggestions on solving this problem?

    Many thanks!

  • #2
    Two things. You are using an older version of tophat. Where possible you should always use the latest.

    You also appear to have a library version mismatch (/lib64/libz.so.1). What OS are you using?
    Last edited by GenoMax; 11-06-2015, 07:17 PM. Reason: correction



    • #3
      Hi GenoMax,

      Thanks for your reply!
      I ran TopHat on Red Hat Enterprise Linux Server release 6.5 (Santiago). Regarding the library: if it were the problem, I would expect other runs to fail as well, but TopHat ran through other data smoothly.

      Thanks again!



      • #4
        @tinkering: I have amended my post #2. It appears that the version of lib64/libz available on your system is different from the one tophat2 is expecting (I think you are using the pre-compiled binary? You can check by running "ldd -v tophat2" to see what libraries are linked). If you were to compile from source, that message should go away, since the dynamic linker will use the library available on your system.

        There are multiple commands separated by pipes in the error you include above. Have you tried to run them independently to see where the error is coming from? What is the exact error BTW?
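
One way to run the piped commands independently is sketched below. It is only a sketch under assumptions: `run_stage` is a hypothetical helper, the intermediate file names (`reads.fq`, `hits.sam`) are made up for illustration, and the commented usage lines assume each tool reads stdin and writes stdout the way it does in the original pipe. The binary paths and options are copied from the failing command in post #1.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run ONE stage of the pipeline on its own.
# The stage's stderr is kept in <name>.stderr.log, and its exit status
# is reported on stderr so that stdout can still be redirected to a file.
run_stage() {
    local name=$1
    shift
    "$@" 2> "${name}.stderr.log"
    local status=$?
    echo "${name}: exit status ${status}" >&2
    return "${status}"
}

# Usage sketch (commented out; file names are assumptions):
#
# run_stage bam2fastx /apps/source/tophat/2.0.10/bam2fastx --all \
#     h11648_tophat/tmp/left_kept_reads.bam > reads.fq
#
# run_stage bowtie2 /apps/source/bowtie2-2.1.0/bowtie2-2.1.0/bowtie2-align \
#     -k 60 -D 15 -R 2 -N 0 -L 20 -i S,1,1.25 --gbar 4 --mp 6,2 --np 1 \
#     --rdg 5,3 --rfg 5,3 --score-min C,-14,0 -p 4 --sam-no-hd \
#     -x h11648_tophat/tmp/genes - < reads.fq > hits.sam
```

The first stage that reports a non-zero exit status is the one to look at, and its `.stderr.log` file should hold the exact error message that the combined command hides.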
