  • visualisation of RNAseq (kallisto to IGV)

    Hello everyone,

    Like others, I am quite excited that kallisto produces a pseudoalignment in minutes, where a real alignment takes hours to compute. Now it would be useful to visualise the result in IGV.

    We extracted the CDS of our bacteria from the .gdb file using Python scripts. Each sequence in the CDS file is named by its gene_id (which is the same as its transcript_id), exactly as we would expect.

    I ran kallisto index on this CDS file and then produced a pseudobam file, following the kallisto manual (https://pachterlab.github.io/kallisto/manual.html):

    kallisto quant -i cds.idx -o output -b 100 --single -l 100 -s 1 --pseudobam <all_RNAseq_reads.fq.gz> | samtools view -Sb - > pseudomap.bam

    The .bam file was then sorted, indexed and loaded into IGV together with the .fasta and .gtf files, which gave the following error:

    File does not contain any sequence names which match the current genome.
    File: *****S5_genome_87, S5_genome_88, S5_genome_89, S5_genome_90, ...
    Genome: S5_genome,

    The S5_genome_XX names are the gene_ids of our genome, and S5 is our genome. From a few related posts (like http://seqanswers.com/forums/archive...p/t-16407.html) I concluded that IGV treats every transcript as a chromosome. So I created an alias file like this:

    S5_genome_87 S5_genome
    S5_genome_88 S5_genome
    ... ...

    Now IGV loads the file, but the reads are not displayed at all. I guess I am missing something somewhere. IMHO the easiest way would be to edit the .bam file (or the .sam file before it is converted to .bam) so that the reads refer to the single chromosome of the genome.

    If you are still reading, thank you for it. Any help appreciated.

  • #2
    I wrote a small Python script that uses the .gtf file to convert the .sam produced by kallisto into a .sam readable by IGV. It is not perfect (I was in a bit of a rush when I wrote it): for all transcripts on the reverse strand, reads are shown as if they were in the forward direction (so the opposite of what they should be), but in the correct place. If you only want to check coverage of transcripts, that is good enough.

    So, in case you are interested:



    Usage:

    python3 kallisto_sam_convertor.py <pseudoalignment.sam> <annotation.gtf> | samtools view -bS - | samtools sort - -o <output.bam>

    The resulting .bam should be loadable into IGV.
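
For readers without the script at hand, the conversion described above can be sketched roughly as follows. This is not the script from this post, only a minimal illustration under stated assumptions: every transcript is a single-exon CDS feature in the .gtf, reads are single-end, and the chromosome length is supplied by the caller. The function names (`parse_gtf`, `shift_record`, `convert`) are made up for this example. Like the script described above, it ignores strand and only shifts positions.

```python
import re


def parse_gtf(lines, feature="CDS"):
    """Map transcript_id -> (chromosome, start, end, strand).

    Assumes one single-exon feature per transcript (typical for bacterial CDS).
    """
    coords = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        match = re.search(r'transcript_id "([^"]+)"', cols[8])
        if match:
            coords[match.group(1)] = (cols[0], int(cols[3]), int(cols[4]), cols[6])
    return coords


def shift_record(sam_line, coords):
    """Move one alignment from transcript coordinates to genome coordinates."""
    fields = sam_line.rstrip("\n").split("\t")
    rname = fields[2]
    if rname not in coords:  # unmapped ('*') or not annotated: leave untouched
        return "\t".join(fields)
    chrom, start, _end, _strand = coords[rname]
    fields[2] = chrom
    # SAM POS and GTF start are both 1-based, hence the -1.
    fields[3] = str(start + int(fields[3]) - 1)
    return "\t".join(fields)


def convert(sam_lines, coords, chrom_lengths):
    """Yield converted SAM lines; per-transcript @SQ lines become per-chromosome ones."""
    sq_written = False
    for line in sam_lines:
        if line.startswith("@SQ"):
            if not sq_written:
                for name, length in chrom_lengths.items():
                    yield f"@SQ\tSN:{name}\tLN:{length}"
                sq_written = True
        elif line.startswith("@"):
            yield line.rstrip("\n")
        else:
            yield shift_record(line, coords)
```

To use it, one would pass the open .gtf file to `parse_gtf`, then feed the kallisto .sam lines through `convert` and print the result into the `samtools view -bS - | samtools sort` pipeline shown in the usage line above. The chromosome length (for example taken from the genome .fasta) has to be provided, because the .gtf does not contain it.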

    ---edit---
    I think that, to correct the script, one needs to flip the strand bit flag of reads mapping to reverse-strand transcripts (forward reads become reverse reads and vice versa) and to recompute the position of the read (it should be mirrored around the middle of the transcript).
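
A rough sketch of that correction, assuming the read is given as a list of SAM columns and the transcript spans `start..end` (1-based, inclusive) on the reverse strand of the genome. The function names are hypothetical and this is not the script from this post. Besides the flag and the position, the SAM format stores SEQ relative to the forward reference strand, so the sketch also reverse-complements the sequence and reverses the qualities and the CIGAR. Optional tags (NM, MD and so on) are left untouched here.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reference_length(cigar):
    """Number of reference bases covered by the alignment (ops M, D, N, =, X)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def flip_to_genome(fields, chrom, start, end):
    """Convert a read aligned to a reverse-strand transcript into genome coordinates.

    `start` is unused in the arithmetic but kept so the call mirrors the GTF record.
    """
    flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":  # unmapped read: nothing to convert
        return list(fields)
    out = list(fields)
    out[1] = str(flag ^ 0x10)  # toggle the reverse-strand bit
    out[2] = chrom
    # Transcript position t corresponds to genome position end - t + 1, so the
    # leftmost genome base is the image of the LAST aligned transcript base.
    out[3] = str(end - pos - reference_length(cigar) + 2)
    out[5] = "".join(n + op for n, op in reversed(CIGAR_OP.findall(cigar)))
    if fields[9] != "*":
        out[9] = fields[9].translate(COMPLEMENT)[::-1]
    if fields[10] != "*":
        out[10] = fields[10][::-1]
    return out
```

For example, with a transcript at 1000..1300 on the reverse strand, a forward read of 5 aligned bases starting at transcript position 1 ends up as a reverse read whose leftmost genome position is 1296, covering the last five bases of the feature.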
    Last edited by KamilSJaron; 12-01-2016, 11:35 AM. Reason: correction of the specification of the problem, the script in post have.
