  • Problems using TopHat output in SAMtools

    hi there,

    I have run some RNA-seq reads through TopHat and generated the three standard output files:
    accepted_hits.sam
    coverage.wig
    junctions.bed

    Both the .wig and .bed files produced error messages when I tried to import them into the UCSC browser. Have other people had problems with TopHat output in UCSC?

    I also tried to view accepted_hits.sam using SAMtools, but had a few problems here.

    I followed these steps:
    samtools faidx ex1.fa # index the reference FASTA (no problems)

    samtools import ex1.fa.fai accepted_hits.sam accepted_hits.bam # SAM->BAM

    Here I got the following error message (repeated multiple times):

    [sam_read1] reference 'gi|17981852|ref|NC_001807.4|' is recognized as '*'.

    [sam_read1] reference 'gi|17981852|ref|NC_001807.4|' is recognized as '*'.


    A BAM file was still produced, so I tried using it anyway:

    samtools index accepted_hits.bam # index BAM

    samtools tview ex1.bam ex1.fa # view alignment

    This just gives me a viewer with a blank screen.

    samtools pileup -t ref.fa.fai file.sam (none of the other options I tried with pileup gave any output)

    This gives the same error message I had when I ran 'import':
    [sam_read1] reference 'gi|17981852|ref|NC_001807.4|' is recognized as '*'.



    Here is the top of my .sam file (it seems to have no header). It all seems to be in order according to the SAM file specification.

    GAPC:1:38:1386:252#0 16 gi|13626247|ref|NT_025975.2|HsY_26131 87997 255 35M * 0 0 TCAATTCCTTGCGATTCCATTACATTCGATTTCTT ]_aaa]VV]a_\`aaa`abaa`^aaabaaabaaaa NM:i:1
    GAPC:1:16:1046:505#0 0 gi|14772189|ref|NT_025215.4|Hs20_25371 27969 3 35M * 0 0 CACACCCAATATTATAACAAAAGATTGTAACAAGG ababbba`abbbbbbbbbbbaaMabaOabbbbb[T NM:i:2
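
    One way to test whether the '[sam_read1] reference ... is recognized as *' message comes from a name mismatch is to compare the reference names in the SAM file (column 3, RNAME) with the sequence names in the .fai index (column 1). The following is a minimal sketch, assuming both files are tab-delimited as the SAM and faidx formats specify; the function names are hypothetical and are not part of SAMtools.

```python
import sys


def fai_names(lines):
    """Collect sequence names: the first tab-separated column of each .fai line."""
    return {line.split("\t", 1)[0].strip() for line in lines if line.strip()}


def sam_reference_names(lines):
    """Collect RNAME values (column 3) from alignment lines.

    Header lines (starting with '@') and unmapped reads (RNAME '*') are skipped.
    """
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


def missing_references(sam_lines, fai_lines):
    """Return reference names used in the SAM lines but absent from the index."""
    return sorted(sam_reference_names(sam_lines) - fai_names(fai_lines))


# Hypothetical usage: python check_refs.py accepted_hits.sam ref.fa.fai
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as sam, open(sys.argv[2]) as fai:
        for name in missing_references(sam, fai):
            print(name)
```

    If this prints any names, the index passed to SAMtools was probably not built from the same FASTA that TopHat aligned against.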

    Has anyone else had these problems? Any clues? I've been struggling with this for days now.

    Thanks

  • #2
    Hi nat,

    This might answer your question: http://seqanswers.com/forums/showthread.php?t=2183

    The chromosome field refers to contigs rather than chromosomes, and the UCSC Browser does not recognize those names.
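
    To see which names in a BED file would trip up the browser, something like the sketch below lists every value in the chromosome column that does not look like a UCSC-style name. It assumes UCSC names for the assembly start with "chr" (chr1, chrY, and so on). That prefix test is only a heuristic, not UCSC's actual validation rule, and the function name is hypothetical.

```python
def non_ucsc_chroms(bed_lines):
    """Return sorted chromosome-column values that do not start with 'chr'.

    Blank lines, comments, and 'track'/'browser' lines are skipped.
    """
    odd = set()
    for line in bed_lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("track", "browser", "#")):
            continue
        chrom = stripped.split()[0]  # BED fields are whitespace-separated
        if not chrom.startswith("chr"):
            odd.add(chrom)
    return sorted(odd)


# Hypothetical usage:
# with open("junctions.bed") as bed:
#     print(non_ucsc_chroms(bed))
```

    Any names it reports, such as the gi|...|ref|NT_...| contig identifiers in the SAM excerpt above, would need to be mapped to chromosome coordinates before upload.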
