![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
TopHat file.bam file.bed join | Trudy | Bioinformatics | 1 | 05-21-2013 11:59 AM |
tophat junctions.bed file | upendra_35 | RNA Sequencing | 2 | 10-23-2012 07:17 AM |
How to make sense of Tophat's output file 'junctions.bed' | gsinghal | RNA Sequencing | 4 | 09-03-2012 06:49 AM |
samtools sort bam file error: generate non-existent file | mediator | Bioinformatics | 0 | 03-05-2012 08:42 PM |
the tophat generate the bam file instead of sam files? | dingkai0564 | Bioinformatics | 1 | 11-10-2010 07:33 PM |
![]() |
|
Thread Tools |
![]() |
#1 |
Member
Location: Chicago Join Date: Dec 2011
Posts: 24
|
![]()
Dear all,
I have previously mapped RNA-seq samples to the human genome using Tophat 2.0.6. This generated an "accepted_hits.bam" result file, and a "junctions.bed" file I am interested in for details on exon-exon junctions mapping. To be sure to compare fairly my different samples, I want to downsample the BAM files to have the same number of mapped reads for each sample. This is pretty straighforward using samtools or picard. Now I would like to regenerate new "junctions.bed" files from the downsampled BAM files. I couldn't find in the log files which program of the tophat package does that. Anyone ever dealt with a similar issue? Any hint? Thanks for your help Julien |
![]() |
![]() |
![]() |
#2 |
Member
Location: Chicago Join Date: Dec 2011
Posts: 24
|
![]()
I am in the process of writing a perl script that generates the file junctions.bed from a given tophat alignment BAM file and I discovered a small bug that sometimes affects the definition of the flanking regions covered by mapped reads next to the exon-exon junction. When there is an insertion or a deletion on one side of the junction, the number of flanking covered nucleotides is not counted correctly.
For example in the following cigar line 14M1I90M664N2M, there are 14+90 nucleotides covered on the left of the junction, but tophat erroneously counts only 14. The same thing happens when there is a deletion. I thought this would be of interest for people working on related matters. Julien |
![]() |
![]() |
![]() |
#3 |
Member
Location: Japan Join Date: Oct 2010
Posts: 53
|
![]()
hi everyone
i need help in understanding the VCF file output GT: GQ : DP: FDP: AD: AST: AMQV 0/1: 13: 22: 13: 2,9: 2,6: 31,21 CHROM POS ID REF ALT QUAL FILTER INFO chr1 762273 . G A . PASS DP=22 Result: At this site GT is 0/1 (i guess it is heterozygous) the confidence is not good GQ=13 largely because there are only 22 reads at this site DP= 22, why AD is 2,9 what i understand is it should be equal to DP i read somewhere that this Sum can be smaller than DP because low-quality bases are not counted is it correct. how will this FDP, AST and AMQV be interpreted? Another question is how i can correlate the SNPse to specific diseases by comparing allelic frequencies |
![]() |
![]() |
![]() |
#4 |
Senior Member
Location: China Join Date: Feb 2009
Posts: 116
|
![]()
Hi Julien, very important information. Thank you.
I have the same needness, and find sam_juncs, component of tophat, can output the junction position information without depth and overhang. The needness of a script to get junctions.bed from bam file produced by tophat is, from my point: I want to subset the bam file by several different criteria and check the influence on the alternative splicing detection. This script will make the alignment unnecessary. Thank you. Last edited by pengchy; 07-09-2013 at 06:58 PM. Reason: add information |
![]() |
![]() |
![]() |
#5 |
Senior Member
Location: China Join Date: Feb 2009
Posts: 116
|
![]()
There is a bug for sam_juncs script, which is 1 smaller than the output of bed_to_juncs script.
here is the example: output of sam_juncs Code:
Scaffold184317 456 11046 + Scaffold184317 11239 21151 + Scaffold184317 21385 29565 + Scaffold184317 34619 60733 + Scaffold184317 61209 68778 + Scaffold184317 69066 75556 + Scaffold184317 75631 84372 + Scaffold184317 75631 102072 + Scaffold184317 104376 112269 - Scaffold184317 112465 115516 - Scaffold184317 115669 115888 - Scaffold184317 116034 116975 - Scaffold184317 117130 117392 - Scaffold184317 117631 123099 - Scaffold184317 123260 134557 - Code:
Scaffold184317 457 11046 + Scaffold184317 11240 21151 + Scaffold184317 21386 29565 + Scaffold184317 34620 60733 + Scaffold184317 61210 68778 + Scaffold184317 69067 75556 + Scaffold184317 75632 84372 + Scaffold184317 75632 102072 + Scaffold184317 104377 112269 - Scaffold184317 112466 115516 - Scaffold184317 115670 115888 - Scaffold184317 116035 116975 - Scaffold184317 117131 117392 - Scaffold184317 117632 123099 - Scaffold184317 123261 134557 - |
![]() |
![]() |
![]() |
#6 |
Member
Location: St Louis, USA Join Date: Nov 2011
Posts: 14
|
![]()
Try regtools - https://github.com/griffithlab/regtools, the `junctions extract` command will do this.
|
![]() |
![]() |
![]() |
Tags |
bedfile, junctions, rna-seq, tophat |
Thread Tools | |
|
|