SEQanswers

Old 03-28-2016, 07:38 PM   #1
Alphabets
Junior Member
 
Location: SZ

Join Date: Mar 2016
Posts: 7
Three reads with the same name in the BAM file

Hi all,

I am working with a paired-end BAM file and get many warnings like this:

Code:
WARNING: Could not find pair for HWI-ST430:177:2:1:4979:15503#0
WARNING: Could not find pair for HWI-ST430:177:2:1:5127:13427#0
WARNING: Could not find pair for HWI-ST430:177:2:1:6521:21452#0
I checked the reads named in the warnings and found that each of them has three records with the same name in the BAM file. For example:

Code:
HWI-ST430:177:2:1:4979:15503#0	65	chr32	26100696	60	79M21S	chr5	36697147	0	ACTTTGCAATTTAAGTTTTACTTACTTTTTAACTAATATACATGCCTAAAATTTACAAAAACAATAATAAAAACAACAGAACACTGGAAACATTTTTAAA	>;=<>=<<=======<====;===;=======<=>>>>>><=>>==>>>>=>>>>==>?>=<<==>?>>>?>?==><=?>><=<>>>?>?=>??>?===>	BD:Z:[email protected][email protected]@@[email protected]@	MD:Z:79	PG:Z:MarkDuplicates	RG:Z:Basenji	BI:Z:FFIECHGIHFEAFEEHEAAFFHDFFHDAAAFEEIHFGGHGGGHHGHHHFBBGFBGGGHBBBFGHGGFGGFBBBGHIGHJGHGHFKJJJJEIKLJGHBGFB	NM:i:0	AS:i:79	XS:i:19
HWI-ST430:177:2:1:4979:15503#0	129	chr5	36697147	60	72M28S	chr32	26100696	0	ATTTGCCCCTGGGCTATTTTTTTCCTNCCATGTAAGATTCCGTTTTAAAAATGTTTCCAGTGTTCTGTTGTTTTTATTATTGTTTTTGTAAATTTTAGGC	===<=<<<<====<=>========<<!<<<=><<=>>>>>=5=>>>>>>>>>>=>>>==>=>=>>>>=?>=>>>>>>>>=?>=>>>?>>>??>??>;<=>	SA:Z:chr32,26100739,-,36M64S,60,0;	BD:Z:[email protected][email protected]@@EGGEGGGFHAAAHGJHBJJDDEHHI	MD:Z:26T37T7	PG:Z:MarkDuplicates	RG:Z:Basenji	BI:Z:FFFBHHHFFHGGDGHGGEAAAAADFGEEEIHHGHFFFGFEGHHFBBGFBBBGHGFBEGIIIFGFEFHGFHHGCCCHIGHIGHHGDDDIIKIFKJGHGHGH	NM:i:2	AS:i:65	XS:i:21
HWI-ST430:177:2:1:4979:15503#0	401	chr32	26100739	60	36M64H	=	26100696	-79	GCCTAAAATTTACAAAAACAATAATAAAAACAACAG	===<=>>=>>===>===<=>===========>;===	SA:Z:chr5,36697147,+,72M28S,60,2;	BD:Z:[email protected]@AHHIJFIFF	MD:Z:36	PG:Z:MarkDuplicates	RG:Z:Basenji	BI:Z:HGHGBBFFAEGFFAAAEFFEGFEGFABBFGHGGHFF	NM:i:0	AS:i:36	XS:i:22
The BAM file contains HiSeq reads aligned to the reference genome with bwa. Picard was used to remove duplicates, and realignment was done with GATK.


My questions are:
1. Why are there three reads with the same name that appear to be unrelated?
2. Maybe the first two are treated as a mate pair and the third as a single read. If so, can I just ignore it?

Could anyone help me? Many thanks for your help!
Old 03-28-2016, 08:05 PM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 688
Default

The 3rd read's flag value, 401, has the "not primary alignment" bit (0x100) set.
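As a minimal sketch of how the three flag values break down, the snippet below decodes them using the bit definitions from the SAM specification. The helper name `decode_flag` and the short label strings are made up for illustration; they are not part of any tool mentioned in this thread.

```python
# SAM flag bits as defined in the SAM specification.
# The label strings are informal names chosen for this sketch.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "not_primary"),
    (0x200, "fails_qc"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits set in a SAM flag value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


# The three flag values from the records in the first post.
for flag in (65, 129, 401):
    print(flag, decode_flag(flag))
```

Read this way, 65 is the first mate, while 129 and 401 both carry the second-in-pair bit, and only 401 is marked as not primary.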

2nd read has "SA" tag:
SA is: Other canonical alignments in a chimeric alignment, formatted as a semicolon-delimited list: (rname,pos,strand,CIGAR,mapQ,NM;)+. Each element in the list represents a part of the chimeric alignment. Conventionally, at a supplementary line, the first element points to the primary line.

It points to the 3rd read via its location (chr32, 26100739).

So it looks like your alignment software supports reads whose parts map to different locations.
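To make the SA cross-reference concrete, here is a small sketch that splits an SA value into its fields, assuming the (rname,pos,strand,CIGAR,mapQ,NM;)+ layout from the SAM specification. `SAEntry` and `parse_sa_tag` are hypothetical names for this example only.

```python
from collections import namedtuple

# One element of an SA tag: (rname,pos,strand,CIGAR,mapQ,NM)
SAEntry = namedtuple("SAEntry", "rname pos strand cigar mapq nm")


def parse_sa_tag(value):
    """Parse the value of an SA tag (the text after 'SA:Z:')."""
    entries = []
    for part in value.split(";"):
        if not part:  # skip the empty piece after the trailing ';'
            continue
        rname, pos, strand, cigar, mapq, nm = part.split(",")
        entries.append(SAEntry(rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return entries


# SA values copied from the 2nd and 3rd records in the first post.
second = parse_sa_tag("chr32,26100739,-,36M64S,60,0;")
third = parse_sa_tag("chr5,36697147,+,72M28S,60,2;")
print(second[0])
print(third[0])
```

The 2nd record's SA entry gives chr32:26100739, which is where the 3rd record is placed, and the 3rd record's SA entry points back to chr5:36697147, where the 2nd record is placed.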
Old 03-28-2016, 08:30 PM   #3
Alphabets
Junior Member
 
Location: SZ

Join Date: Mar 2016
Posts: 7
Default

Thank you for your reply!

I read your reply carefully, but I still have some difficulty understanding it.

Could you explain the three reads in simpler terms? Or how can I resolve the warning "Could not find pair for HWI-ST430:177:2:1:4979:15503#0"?

Thank you very much!

Last edited by Alphabets; 03-28-2016 at 08:51 PM.
Old 03-28-2016, 09:57 PM   #4
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 688
Default

What is your goal?

What program is reporting the warning?

Check the manual for your alignment software and check the notes on when it produces an "SA" tag.

The first record is one mate of the pair.
The next two records represent the other mate, which has two entries; that is, it is a "chimeric" read (I think).

Ignoring it could be the thing to do, depending on your goals.

If you are looking for chimeric reads or possible errors in the reference, then you have struck gold.
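If ignoring the extra record is acceptable, one way to do it is to keep only primary records. The sketch below works on SAM text lines and drops anything with the 0x100 (not primary) or 0x800 (supplementary) bit set. The function name `primary_records` and the abbreviated demo lines are assumptions for illustration, and whether this silences the warning in your downstream tool has not been tested here.

```python
SECONDARY = 0x100      # "not primary alignment" bit
SUPPLEMENTARY = 0x800  # "supplementary alignment" bit


def primary_records(sam_lines):
    """Yield header lines and primary alignment lines from SAM text."""
    for line in sam_lines:
        if line.startswith("@"):  # header lines pass through unchanged
            yield line
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # drop secondary and supplementary records
        yield line


# Abbreviated stand-ins for the three records in the first post
# (only name, flag, reference and position; not complete SAM lines).
demo = [
    "HWI-ST430:177:2:1:4979:15503#0\t65\tchr32\t26100696",
    "HWI-ST430:177:2:1:4979:15503#0\t129\tchr5\t36697147",
    "HWI-ST430:177:2:1:4979:15503#0\t401\tchr32\t26100739",
]
kept = list(primary_records(demo))
print(len(kept))  # 2: one record per mate remains
```

This throws away the split-alignment evidence, so it only makes sense when chimeric reads are not what you are looking for.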
Old 03-28-2016, 11:22 PM   #5
Alphabets
Junior Member
 
Location: SZ

Join Date: Mar 2016
Posts: 7
Default

Quote:
Originally Posted by Richard Finney View Post
What is your goal?

What program is reporting the warning?

I want to call STRs from the BAM file with lobSTR.

I ran lobSTR on the paired-end BAM file and it produced many warnings like that.

I downloaded the BAM file from the web and don't know much more about it.

When I run lobSTR treating it as a single-end BAM file, there are no warnings.
lobSTR takes different parameters for paired-end and single-end BAM files.

So, any other suggestions? Thanks!

Last edited by Alphabets; 03-28-2016 at 11:27 PM.
Tags
bioinfomatics, hiseq, mate pair, sequencing, wgs mapping
