#1 |
Member
Location: Lyon Join Date: Dec 2014
Posts: 37
Hello all,
I ran the Picard tool CollectRnaSeqMetrics on my RNA-seq fastq files and found that a large share (> 50%) falls in intronic regions. I wonder whether these intronic regions are evenly distributed across the genome, or whether my reads are aligned to only a few genes. How can I check this? Is there a way to extract only the intronic regions from a BAM file? Or is there a way to extract only the reads that mapped to intronic regions and see where they are aligned?
#2 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,088
Look at the samtools view -L option. Provide your intronic regions as a bed file and extract reads aligned in those areas into a new file.
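A minimal sketch of that suggestion, assuming the intronic intervals are already available as a BED file. The function name and the file names are placeholders for illustration, not something from the thread.

```shell
# Hypothetical helper; all file names are placeholders.
# $1 = BED file of intronic intervals, $2 = input BAM, $3 = output BAM
extract_reads_in_regions() {
    # -b writes BAM output; -L keeps only alignments overlapping the BED intervals
    samtools view -b -L "$1" "$2" > "$3"
}

# Example call (assumes introns.bed and aligned.bam exist):
# extract_reads_in_regions introns.bed aligned.bam intronic_reads.bam
```

Note that -L keeps any alignment that overlaps an interval, so reads straddling an exon/intron boundary are kept as well.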
#3 |
Senior Member
Location: Germany Join Date: Apr 2012
Posts: 215
First, depending on the organism you are working with, it's not totally surprising that a lot of reads map to introns.
Second, if you have a lot of genes with different isoforms in your sample, the number of "intronic reads" may be influenced by the annotation file you are using.
There is at least one easy way I can think of to get reads from non-exon regions out of a BAM file. bedtools intersect (http://bedtools.readthedocs.org/en/l...intersect.html) allows you to select reads that don't overlap defined regions (-v parameter). But you will then get the intergenic reads as well. So you would actually need to define gene regions first (maybe from 5'UTR start to 3'UTR end) and then use this file together with an exon-defining file in bedtools intersect -v. This will give you an intron-only file which you can use for detecting unequal mapping distributions.
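The two-step idea above could be sketched roughly as follows, assuming you already have one BED file of gene spans and one of exons. The function name and file names are hypothetical, and the -abam form is the older bedtools syntax used elsewhere in this thread (newer versions also accept a BAM file through -a).

```shell
# Hypothetical helper; all file names are placeholders.
# $1 = input BAM, $2 = BED of gene spans, $3 = BED of exons, $4 = output BAM
intronic_reads() {
    # Step 1: keep reads overlapping a gene span (-u reports each read once).
    # Step 2: from those, drop every read that overlaps any exon (-v).
    bedtools intersect -u -abam "$1" -b "$2" \
        | bedtools intersect -v -abam stdin -b "$3" > "$4"
}

# Example call (assumes the three input files exist):
# intronic_reads aligned.bam genes.bed exons.bed intronic_reads.bam
```

Reads that straddle an exon/intron boundary overlap an exon, so -v removes them; what remains are reads lying entirely inside introns of the annotated genes.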
#4 |
Member
Location: Lyon Join Date: Dec 2014
Posts: 37
Hello,
Thank you for your answers. All I have is a GTF file with the transcript coordinates (the 3rd column contains only "CDS", "exon", "start_codon" and "stop_codon"), so bedtools intersect with the -v option seems more suitable for me. Do you think a grep "*_codon" on my transcripts file will do the trick to get only the gene regions?
So, if I understood correctly, I should do something like this:
bedtools intersect -abam -a myfile.bam -b onlyGeneCoordinate.gtf > bamVSgene.bed #first I want the intersection between my reads and genes
bedtools intersect -v -abam -a bamVSgene.bed -b genesAndTranscripts.gtf > intronOnlyFile.bed #filter the exonic regions
bedtools intersect -abam -a myfile.bam -b intronOnlyFile.bed > onlyIntrons.bed #finally get only the intronic reads
I'm not sure the 2nd step will work because there are no intronic regions in the "genesAndTranscripts.gtf" file.
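On the gene-region question: start_codon and stop_codon records only mark the coding boundaries, so they would leave out the UTRs. One possible alternative, sketched below, is to collapse the "exon" records by gene_id into one span per gene. This assumes a tab-separated GTF whose 9th column contains gene_id "..." and that each gene sits on a single chromosome and strand; the function name is made up for illustration.

```shell
# Hypothetical helper: $1 = GTF file; prints one BED line per gene (0-based start).
gtf_gene_spans() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
    $3 == "exon" && match($9, /gene_id "[^"]+"/) {
        # The matched text is: gene_id "NAME" (9 leading characters, 1 closing quote)
        id = substr($9, RSTART + 9, RLENGTH - 10)
        key = $1 OFS id OFS $7
        # Track the smallest exon start and the largest exon end per gene
        if (!(key in lo) || $4 + 0 < lo[key]) lo[key] = $4 + 0
        if (!(key in hi) || $5 + 0 > hi[key]) hi[key] = $5 + 0
    }
    END {
        for (key in lo) {
            split(key, f, OFS)
            # GTF is 1-based inclusive, BED is 0-based half-open
            print f[1], lo[key] - 1, hi[key], f[2], ".", f[3]
        }
    }' "$1" | sort -k1,1 -k2,2n
}

# Example call (assumes transcripts.gtf exists):
# gtf_gene_spans transcripts.gtf > genes.bed
```

The resulting BED file could then play the role of onlyGeneCoordinate.gtf in the first step.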
#5 |
Junior Member
Location: INDIA Join Date: Aug 2016
Posts: 1
Hi, how did you get the intron.bed file? I have a GFF/GTF file with transcript/exon/gene/CDS information and am trying to parse out the intronic regions. Any help?
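The thread does not say how the intron file was produced. One possible approach, sketched here under assumptions, is to take the gaps between consecutive exons of each transcript. It assumes a tab-separated GTF with "exon" records and transcript_id "..." in the 9th column; the function name is hypothetical.

```shell
# Hypothetical helper: $1 = GTF file; prints introns as BED (0-based, half-open).
gtf_to_intron_bed() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
    $3 == "exon" && match($9, /transcript_id "[^"]+"/) {
        # The matched text is: transcript_id "NAME" (15 leading characters, 1 closing quote)
        id = substr($9, RSTART + 15, RLENGTH - 16)
        print id, $1, $4, $5, $7
    }' "$1" \
    | sort -t "$(printf '\t')" -k1,1 -k3,3n \
    | awk -F'\t' 'BEGIN { OFS = "\t" }
    {
        # Same transcript as the previous exon and a gap in between: emit the intron
        if ($1 == prev_id && $3 + 0 > prev_end + 1)
            print $2, prev_end, $3 - 1, $1, ".", $5
        prev_id = $1; prev_end = $4 + 0
    }'
}

# Example call (assumes transcripts.gtf exists):
# gtf_to_intron_bed transcripts.gtf > intron.bed
```

These are per-transcript introns, so an interval can still overlap an exon of another isoform. If you need regions that are intronic in every annotated isoform, subtracting an exon BED from a gene-span BED with bedtools subtract is another option.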