SEQanswers

Go Back   SEQanswers > Applications Forums > RNA Sequencing



Similar Threads
Thread Thread Starter Forum Replies Last Post
Bismark - A New Tool for Mapping and Analysis of Bisulfite-Seq Data fkrueger Bioinformatics 649 10-05-2018 01:43 AM
454 data analysis & Mapping Abishai3911 Bioinformatics 3 07-03-2011 02:27 AM
Very Bad Mapping Results with several mapping softwares xquan Bioinformatics 13 05-22-2011 11:31 PM
PubMed: Optical mapping of DNA: Single-molecule-based methods for mapping genomes. Newsbot! Literature Watch 0 01-06-2011 02:00 AM
Why do we use mapping programs instead of blast for mapping to a reference? thsuk1 Bioinformatics 6 08-27-2010 08:54 AM

Reply
 
Thread Tools
Old 04-16-2011, 05:19 AM   #1
Huijuan
Junior Member
 
Location: Beijng, China

Join Date: Apr 2011
Posts: 9
Question analysis after mapping

Hi,I'm a beginner in RNA-Seq analysis,now I've mapped the short reads with Bowtie,TopHat and other softwares,I'm trying to analyze the mapping efficiency by counting the uniquely mapped reads,spanning junction reads and unmappable reads in different mapping results.But how can I get the information about which one is uniquely mapped,which one is mutiple reads from the SAM format file I get with the softwares?
Thank you for your attention.
Huijuan is offline   Reply With Quote
Old 04-16-2011, 09:15 AM   #2
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default

Quote:
Originally Posted by Huijuan View Post
Hi,I'm a beginner in RNA-Seq analysis,now I've mapped the short reads with Bowtie,TopHat and other softwares,I'm trying to analyze the mapping efficiency by counting the uniquely mapped reads,spanning junction reads and unmappable reads in different mapping results.But how can I get the information about which one is uniquely mapped,which one is mutiple reads from the SAM format file I get with the softwares?
Thank you for your attention.
First you have to separate the Aligned and unaligned reads in the SAM file. For this you can use Picard:
For Aligned reads :
java -jar /usr/local/bin/ViewSam.jar INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Aligned PF_STATUS=All > your_Aligned_reads.sam
For Unaligned reads:
java -jar /usr/local/bin/ViewSam.jar INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Unaligned PF_STATUS=All > your_Unaligned_reads.sam

You can then count the rows in each file that don't start with @ with:
grep -cv "^@" your_Unaligned_reads.sam
Should give you the number of unaligned reads.

For the uniquely mapped reads: there should have this tag in the sam file: XT:A:U, so to count them:
grep -c "XT:A:U" your_Aligned_reads.sam
The reads that map to more than one position should be have "XT:A:R", so:
grep -c "XT:A:R" your_Aligned_reads.sam

Hope this helps.....
upendra_35 is offline   Reply With Quote
Old 04-16-2011, 05:11 PM   #3
Huijuan
Junior Member
 
Location: Beijng, China

Join Date: Apr 2011
Posts: 9
Default

Quote:
Originally Posted by upendra_35 View Post
First you have to separate the Aligned and unaligned reads in the SAM file. For this you can use Picard:
For Aligned reads :
java -jar /usr/local/bin/ViewSam.jar INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Aligned PF_STATUS=All > your_Aligned_reads.sam
For Unaligned reads:
java -jar /usr/local/bin/ViewSam.jar INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Unaligned PF_STATUS=All > your_Unaligned_reads.sam

You can then count the rows in each file that don't start with @ with:
grep -cv "^@" your_Unaligned_reads.sam
Should give you the number of unaligned reads.

For the uniquely mapped reads: there should have this tag in the sam file: XT:A:U, so to count them:
grep -c "XT:A:U" your_Aligned_reads.sam
The reads that map to more than one position should be have "XT:A:R", so:
grep -c "XT:A:R" your_Aligned_reads.sam

Hope this helps.....
Many thanks for helping.I did as what you said and got what I want.Thank you again
Huijuan
Huijuan is offline   Reply With Quote
Old 04-19-2011, 07:47 PM   #4
Huijuan
Junior Member
 
Location: Beijng, China

Join Date: Apr 2011
Posts: 9
Default

Here's the problem
I use the way you told me with the mapping results with BWA. And the total number of unaligned reads, uniquely mapped reads and multireads is not equal to the total number of short reads and far less than it. What's the problem ?
Thank you again
Huijuan
Huijuan is offline   Reply With Quote
Old 04-26-2011, 04:24 PM   #5
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default

Quote:
Originally Posted by Huijuan View Post
Here's the problem
I use the way you told me with the mapping results with BWA. And the total number of unaligned reads, uniquely mapped reads and multireads is not equal to the total number of short reads and far less than it. What's the problem ?
Thank you again
Huijuan
Do you get any error when you run the way i mentioned before? If yes then try running the modified way which is below. It should fix the problem. Good luck

#To separate Aligned and unaligned reads I am using Picard now:
#Aligned:
java -jar /usr/local/bin/ViewSam.jar VALIDATION_STRINGENCY=LENIENT INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Aligned PF_STATUS=All > your_Aligned_reads.sam
#Unaligned:
java -jar /usr/local/bin/ViewSam.jar VALIDATION_STRINGENCY=LENIENT INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Unaligned PF_STATUS=All > your_Unaligned_reads.sam

#You can then count the rows in each file that don't start with @ should give you the number of unaligned reads.:
grep -cv "^@" your_Unaligned_reads.sam

#For the uniquely mapped reads: they should have this tag in the sam file: XT:A:U, so to count them:
grep -c "XT:A:U" your_Aligned_reads.sam
#The reads that map to more than one position should be have "XT:A:R", so:
grep -c "XT:A:R" your_Aligned_reads.sam
upendra_35 is offline   Reply With Quote
Old 04-26-2011, 04:41 PM   #6
Huijuan
Junior Member
 
Location: Beijng, China

Join Date: Apr 2011
Posts: 9
Default

Quote:
Originally Posted by upendra_35 View Post
Do you get any error when you run the way i mentioned before? If yes then try running the modified way which is below. It should fix the problem. Good luck

#To separate Aligned and unaligned reads I am using Picard now:
#Aligned:
java -jar /usr/local/bin/ViewSam.jar VALIDATION_STRINGENCY=LENIENT INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Aligned PF_STATUS=All > your_Aligned_reads.sam
#Unaligned:
java -jar /usr/local/bin/ViewSam.jar VALIDATION_STRINGENCY=LENIENT INPUT=your_mapped_reads.sam ALIGNMENT_STATUS=Unaligned PF_STATUS=All > your_Unaligned_reads.sam

#You can then count the rows in each file that don't start with @ should give you the number of unaligned reads.:
grep -cv "^@" your_Unaligned_reads.sam

#For the uniquely mapped reads: they should have this tag in the sam file: XT:A:U, so to count them:
grep -c "XT:A:U" your_Aligned_reads.sam
#The reads that map to more than one position should be have "XT:A:R", so:
grep -c "XT:A:R" your_Aligned_reads.sam
Ive figured out the reason.Thanks so much for helping.

Last edited by Huijuan; 04-26-2011 at 05:00 PM.
Huijuan is offline   Reply With Quote
Old 04-26-2011, 04:46 PM   #7
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default

Quote:
Originally Posted by Huijuan View Post
Ive figure out the reason.Thanks so much for helping.
Can you share with me how you fixed the problem? Thanks
upendra_35 is offline   Reply With Quote
Old 04-26-2011, 04:57 PM   #8
Huijuan
Junior Member
 
Location: Beijng, China

Join Date: Apr 2011
Posts: 9
Default

This time I didn't use viewSam.jar and just used awk and grep to get them apart and then did counting
Huijuan is offline   Reply With Quote
Old 04-28-2011, 03:44 AM   #9
brpet
Junior Member
 
Location: Europe

Join Date: Apr 2011
Posts: 4
Default

Quote:
#For the uniquely mapped reads: they should have this tag in the sam file: XT:A:U, so to count them:
grep -c "XT:A:U" your_Aligned_reads.sam
#The reads that map to more than one position should be have "XT:A:R", so:
grep -c "XT:A:R" your_Aligned_reads.sam
Both of these grep's return 0 for me.


Are you doing any other steps between tophat and changing the file from bam to sam?
brpet is offline   Reply With Quote
Old 04-28-2011, 04:20 AM   #10
ttnguyen
Member
 
Location: Ireland

Join Date: Mar 2010
Posts: 41
Default

I used awk and grep for this task - just need to base on the 'read name' and CIGAR in sam file.
ttnguyen is offline   Reply With Quote
Old 04-28-2011, 08:43 AM   #11
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default

Quote:
Originally Posted by brpet View Post
Both of these grep's return 0 for me.


Are you doing any other steps between tophat and changing the file from bam to sam?
Yes i do the following to convert the bam to sam

samtools view accepted_hits.bam > accepted_hits.sam
samtools view -H accepted_hits.bam > header.txt
cat header.txt accepted_hits.sam > accepted_hits2.sam

But you can't use this sam file for counting the number of reads that are mapped to the reference or not as this sam is made from bam and bam file is supposed to contain only mapped reads.

I am still to figure out how to use tophat generated bam file for counting the number of reads that are mapped to the reference or not. I will let you know once i know it.
upendra_35 is offline   Reply With Quote
Old 04-28-2011, 10:31 PM   #12
Huijuan
Junior Member
 
Location: Beijng, China

Join Date: Apr 2011
Posts: 9
Default

Quote:
Originally Posted by upendra_35 View Post
Yes i do the following to convert the bam to sam

samtools view accepted_hits.bam > accepted_hits.sam
samtools view -H accepted_hits.bam > header.txt
cat header.txt accepted_hits.sam > accepted_hits2.sam

But you can't use this sam file for counting the number of reads that are mapped to the reference or not as this sam is made from bam and bam file is supposed to contain only mapped reads.

I am still to figure out how to use tophat generated bam file for counting the number of reads that are mapped to the reference or not. I will let you know once i know it.
there‘s a method saying that you can keep the tmp files when you run Tophat by the parameter --keep-tmp then you will get both sam and bam files
Huijuan is offline   Reply With Quote
Old 04-29-2011, 05:17 AM   #13
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,178
Default

Quote:
Originally Posted by upendra_35 View Post
Yes i do the following to convert the bam to sam

samtools view accepted_hits.bam > accepted_hits.sam
samtools view -H accepted_hits.bam > header.txt
cat header.txt accepted_hits.sam > accepted_hits2.sam
You can output the header with the alignments to your SAM file in one step, avoiding long cat job. The '-h' option for samtools view will output the header in addition to the alignments.

Code:
% samtools view -h accepted_hits.bam -o accepted_hits.sam
The -o option is used to give the output file name (or redirect stdout as shown in upendra's example).
kmcarr is offline   Reply With Quote
Reply

Tags
sam reads

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 02:37 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO