![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
BWA: paired end reads, wrong orientation but listed as properly paired | rdeborja | Bioinformatics | 3 | 06-11-2014 04:39 AM |
Running out of disk space with BWA, wrong CMD line? | sindrle | RNA Sequencing | 3 | 11-26-2013 08:14 AM |
Aborted BWA run - what is wrong? | yaximik | Bioinformatics | 1 | 06-10-2013 03:52 AM |
BWA : XM tag is sometimes wrong | Lcontami | Bioinformatics | 3 | 04-01-2013 12:04 PM |
BWA concise format output -edit distance wrong | biterbilen | Bioinformatics | 2 | 11-06-2009 03:55 PM |
![]() |
|
Thread Tools |
![]() |
#1 |
Junior Member
Location: USA Join Date: Oct 2019
Posts: 4
|
![]()
Hi,
Below is a supplementary alignment for one of my reads with a mapping quality score of 43.This should map to chr2: 29446797 in hg19 03012p9SX:030B 2064 1 29446797 43 36M241H -1 -1 36 CCCCACGTGAACGAGGGAGGGAGGGAGGGTTGGGTG array('B', [32, 37, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 37, 41, 41, 41, 41, 41, 41, 41, 41, 37, 41, 41]) [('SA', 'chr2,42479618,+,238M39S,60,2;'), ('MD', '26A9'), ('RG', 'SAMPLE.sample23'), ('NM', 1), ('AS', 31), ('XS', 23), ('fm', '3_0')] Since the read is on the negative strand, below is the forward sequence: CACCCAACCCTCCCTCCCTCCCTCGTTCACGTGGGG However this sequence does not map to chr2: 29446797. I tried blasting but I think it is too short so I aligned it to the genomic sequence as you can see here https://ibb.co/R7VHrQH This is not a good alignment at all. Is BWA wrong? Am I missing something? Thank you for your input in advance! |
![]() |
![]() |
![]() |
#2 |
Junior Member
Location: USA Join Date: Oct 2019
Posts: 4
|
![]()
Checking to see if folks have feedback on this. I read that "BWA actually follows the SAM spec and reports Phred scores as MAPQ values.* The calculation is based on the number of optimal (best) alignments found, as well as the number of sub-optimal alignments combined with the Phred scores of the bases which differ between the optimal and sub-optimal alignments."
Does that mean that the mapq score of a clipped read is a combination of the quality of the 'matched' alignment AND the quality of the alignment of the clipped bases? Perhaps this explains the strange finding I saw above. If so, how does one extract the mapping quality of only the 'matched' portion of the read at this location (29446797)? |
![]() |
![]() |
![]() |
#3 |
Member
Location: Brazil Join Date: Aug 2015
Posts: 91
|
![]()
Hi! Yes you can BLAST it, but since it's a short sequence it's best if you use the `blastn` algorithm instead of the `megablast` one. You can also decrease the word size to 7 to increase sensitivity.
Here's your search, it'll be available until october 31: https://blast.ncbi.nlm.nih.gov/Blast...ID=VGW43RPD014 According to the BLAST results, the sequence matches chr2:29446798-29446833 (hg19). ...PS anyways the CIGAR string says "36M" but the alignment to GRCh37.p13 shows one mismatch. ...PPS oh, I just noticed - I know where you got it wrong! Bit 16 in the SAM flag does not simply mean "the read is aligned to the reverse strand", it means "the read is represented as its reverse-complement because it aligned to the reverse strand". Sequences in SAM files are always represented on the forward strand. So if bit 16 is 1, as in this case, it means that "CCCC..." is ALREADY the reverse-complement of the actual read. So you don't have to reverse-complement it again. In fact, you can find your sequence in the same screenshot you posted. Last edited by r.rosati; 10-29-2019 at 10:46 AM. |
![]() |
![]() |
![]() |
#4 |
Junior Member
Location: USA Join Date: Oct 2019
Posts: 4
|
![]()
Wow I do see it now!!! Thank you so very much Rosati. Yes this is super helpful and it makes a lot of sense
![]() ![]() ![]() I didn't realize that reads in the SAM file always represent the forward strand. How about cases when I check read.get_forward_sequence in pysam, sometimes the output is the reverse complement of the sequence in the read itself, like this example: Read: 0a023638:0B0B 83 0 9999 0 144M 0 10005 144 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAACCC array('B', [12, 12, 41, 37, 32, 32, 32, 32, 41, 37, 37, 37, 22, 32, 27, 12, 37, 32, 27, 32, 27, 27, 41, 41, 27, 27, 32, 32, 32, 32, 32, 27, 37, 37, 41, 37, 37, 37, 41, 41, 41, 41, 41, 41, 37, 41, 41, 41, 41, 41, 41, 37, 41, 41, 37, 41, 41, 27, 37, 41, 41, 37, 41, 37, 41, 41, 41, 41, 37, 32, 41, 32, 41, 37, 32, 22, 41, 41, 41, 41, 41, 37, 41, 41, 41, 41, 37, 27, 41, 41, 27, 37, 37, 12, 41, 41, 41, 37, 41, 37, 41, 41, 41, 37, 37, 27, 41, 37, 22, 37, 37, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 37, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 37, 41, 37, 27]) [('MD', '0A143'), ('RG', 'HMMGHBBXX.lane0.2P_FMIEx_321'), ('NM', 1), ('AS', 143), ('XS', 137)] But read.get_forward_sequence() is: GGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAG Thank you so much! |
![]() |
![]() |
![]() |
Tags |
alignment, bwa, read |
Thread Tools | |
|
|