
**Brian Bushnell** · 10-16-2015, 01:26 PM

Looks like some kind of misassembled collapsed repeat or hypervariable region. The mapping in the area is probably suspect and should be ignored for the purposes of calling variations with respect to that reference.

**ECO** · 10-16-2015, 05:04 PM

What's the region? Looks like a low-complexity repeat.

**sonia.bao** · 10-16-2015, 05:06 PM

Originally posted by Brian Bushnell

Looks like some kind of misassembled collapsed repeat or hypervariable region. The mapping in the area is probably suspect and should be ignored for the purposes of calling variations with respect to that reference.

Thank you Brian. I was thinking of this too, but if that were the case, wouldn't it affect both the plus and minus strands? I am puzzled by the fact that only one strand is affected!

**sonia.bao** · 10-16-2015, 05:15 PM

Originally posted by ECO

What's the region? Looks like a low-complexity repeat.

Thank you ECO. The region is chr5:31,526,200-31,526,300 on the hg19 assembly. It is a unique region with no repetitive elements.

Update: I checked another cohort that we sequenced at a different center and on a different date, and it shows the same pattern! The minus strand is really bad just in this region, so it seems to be a universal problem.
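One way to put a number on the strand difference described above is to compare mismatch rates per strand from the aligned reads. The following is only a minimal sketch: it assumes the reads for the region have been exported as SAM text and that the aligner wrote the optional `NM:i` (edit distance) tag. The function name `strand_mismatch_rates` and the two toy records are hypothetical and are not taken from the data in this thread.

```python
from collections import defaultdict

def strand_mismatch_rates(sam_lines):
    """Return {strand: edit distance per sequenced base} from SAM text lines."""
    totals = defaultdict(lambda: [0, 0])  # strand -> [sum of NM, sum of read lengths]
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:                    # 0x4 = read unmapped
            continue
        strand = "-" if flag & 0x10 else "+"   # 0x10 = read aligned to reverse strand
        # Optional tags start at column 12; NM is the edit distance to the reference.
        nm = next((int(f[5:]) for f in fields[11:] if f.startswith("NM:i:")), None)
        if nm is None:
            continue
        totals[strand][0] += nm
        # Read length is used as an approximation of aligned bases (ignores clipping).
        totals[strand][1] += len(fields[9])
    return {s: nm / bases for s, (nm, bases) in totals.items() if bases}

# Made-up records for illustration only.
toy_sam = [
    "\t".join(["r1", "0", "chr5", "31526230", "60", "10M", "*", "0", "0",
               "ACGTACGTAC", "IIIIIIIIII", "NM:i:0"]),
    "\t".join(["r2", "16", "chr5", "31526230", "60", "10M", "*", "0", "0",
               "ACGTACGTAC", "IIIIIIIIII", "NM:i:3"]),
]
rates = strand_mismatch_rates(toy_sam)
print(rates)
```

A large gap between the two rates within the region, but not in flanking sequence, would be consistent with a strand-specific problem rather than a general mapping problem.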

**Brian Bushnell** · 10-16-2015, 08:19 PM

There are also certain motifs that interfere with the sequencing enzymes... or so I hear. That can cause sequencing to be unsuccessful in one direction.

But, I think this is a misassembled repeat. The right side does not have totally random errors; rather, there are discrete positions where many reads agree on an alternate allele. Maybe there's a misassembly because it's hard to sequence with any technology due to a structural issue like a hairpin, or being slippery.
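The distinction drawn here, between random errors and discrete positions where many reads agree on an alternate allele, can be checked roughly per position. This is a minimal sketch under stated assumptions: the pileup is given as a plain dict of position to observed base calls, and the names `recurrent_alt_positions`, `min_depth`, and `min_alt_frac` as well as the thresholds and toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter

def recurrent_alt_positions(pileup, ref, min_depth=10, min_alt_frac=0.3):
    """Flag positions where one non-reference base recurs in many reads.

    pileup: {position: string of base calls observed at that position}
    ref:    {position: reference base}
    """
    flagged = {}
    for pos, bases in pileup.items():
        depth = len(bases)
        if depth < min_depth:
            continue
        alts = Counter(b for b in bases if b != ref[pos])
        if not alts:
            continue
        allele, count = alts.most_common(1)[0]
        # Random errors spread over several alleles at low frequency;
        # a recurrent allele concentrates on one base at high frequency.
        if count / depth >= min_alt_frac:
            flagged[pos] = (allele, count / depth)
    return flagged

# Made-up pileup for illustration only.
toy_ref = {100: "A", 101: "C", 102: "G"}
toy_pileup = {
    100: "A" * 18 + "T" + "G",   # scattered, low-frequency errors
    101: "C" * 10 + "T" * 10,    # many reads agree on the same alternate base
    102: "G" * 20,               # clean
}
flagged = recurrent_alt_positions(toy_pileup, toy_ref)
print(flagged)
```

Positions that are flagged in several independent samples point toward a reference or systematic issue rather than sample-specific noise.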

**GenoMax** · 10-17-2015, 04:33 AM

Have you thought about trying a re-aligner to see if it improves the alignment? ABRA is one example.

**sonia.bao** · 10-17-2015, 05:18 PM

Originally posted by Brian Bushnell

There are also certain motifs that interfere with the sequencing enzymes... or so I hear. That can cause sequencing to be unsuccessful in one direction.

But, I think this is a misassembled repeat. The right side does not have totally random errors; rather, there are discrete positions where many reads agree on an alternate allele. Maybe there's a misassembly because it's hard to sequence with any technology due to a structural issue like a hairpin, or being slippery.

Thanks Brian. I took a closer look at the samples and indeed those errors are not completely random. They always pop up at the same spots across multiple samples.

As a next step, I took the minus strand sequence from the erroneous region and checked whether it may form some type of secondary structure:

>chr5:31526227-31526292 strand=-
CGGGAGCGAGGCCGCAGTCCCGACAGGAGAAGACAAGACAGCCGGTACAGATCTGATTATGACCGA

I used this RNA/DNA structure prediction program (http://rna.urmc.rochester.edu/RNAstr...Web/index.html).

The result suggests that strong secondary structure forms within this DNA sequence! Almost all bases have probability >= 80% (chr5_DNA_secondaryStr.sequencingBad.minus.pdf, attached).

I also took the plus strand sequence, and its predicted secondary structure is similar (chr5_DNA_secondaryStr.sequencingBad.plus.pdf).

As a control, I took a DNA sequence of similar length from a region where the sequencing was good:

>chr5:31526292-31526358 strand=-
TATGATGACCACAGGCACCGAGATCACAGTCATGGGCGAGGTGAGAGGCATCGGTCCCTGGATCGGC

The prediction for this control suggests that some structure may form, but nothing strong (chr5_DNA_secondaryStr.sequencingOK.pdf, also attached).

So this could be the reason!
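For a quick local sanity check alongside the web prediction, the two posted sequences can be screened for short inverted repeats, which are the segments that could pair into a hairpin stem. This is only a crude sketch and not a substitute for thermodynamic structure prediction: it looks for exact reverse-complement matches and ignores mismatches, G-T pairs, and stability. The function names and the `stem_len` and `min_loop` defaults are hypothetical choices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def stem_candidates(seq, stem_len=5, min_loop=3):
    """Return (i, j) pairs where seq[i:i+stem_len] could pair with
    seq[j:j+stem_len] downstream, leaving at least min_loop bases between."""
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        target = reverse_complement(seq[i:i + stem_len])
        j = seq.find(target, i + stem_len + min_loop)
        while j != -1:
            hits.append((i, j))
            j = seq.find(target, j + 1)
    return hits

# Sequences as posted above (minus strand).
bad = "CGGGAGCGAGGCCGCAGTCCCGACAGGAGAAGACAAGACAGCCGGTACAGATCTGATTATGACCGA"
ok = "TATGATGACCACAGGCACCGAGATCACAGTCATGGGCGAGGTGAGAGGCATCGGTCCCTGGATCGGC"

for name, seq in (("bad", bad), ("ok", ok)):
    print(name, len(seq), round(gc_fraction(seq), 2), len(stem_candidates(seq)))
```

Comparing the number of stem candidates between the two regions gives only a rough indication; the predicted structures in the attached files remain the more informative evidence.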

**sonia.bao** · 10-17-2015, 05:27 PM

Here are the predicted DNA secondary structure output files

Attached Files

**sonia.bao** · 10-17-2015, 05:51 PM

Originally posted by GenoMax

Have you thought about trying a re-aligner to see if it improves the alignment? ABRA is one example.

Thanks GenoMax. I was using GATK for indel realignment. ABRA sounds like another good option! How does it compare to GATK?


Sequencing failed only on one strand within a specific genomic region
