04-26-2012, 04:44 AM   #1
Location: Paris

Join Date: Sep 2009
Posts: 14
mapping parameters for SV/CNV discovery


I'm beginning to work in the field of SV/CNV discovery. I have found many different methods for this. Most of them take mapped reads as input, but there is little documentation on the mapping parameters that should be used.

For methods that analyse paired-end mapping abnormalities, it seems that we must return all possible hits, and that we must perform single-end alignment even if we have paired-end reads.
For methods that analyse split-read alignments, it seems that ungapped alignment is required and all hits must be returned.
Very few programs discuss duplicate reads (must we remove them?) or masking the reference genome (should we mask the reference before or after mapping?).

I'm not sure that there is one single mapping process that can be used for all SV/CNV methods, but hearing your point of view may inspire me. So what software do you use for mapping and SV/CNV discovery, and what parameters do you use for the mapping?

I will begin my tests with BWA, ungapped and returning all possible hits (-n 600 -N 600), and I will see whether the results differ from those obtained with the default parameters.
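
A minimal sketch of how such a run could be assembled, assuming the legacy bwa aln / bwa sampe workflow. The file names are hypothetical, and using -o 0 (zero gap opens) for "ungapped" is an assumption rather than something stated in this post. The commands are only built and printed, not executed.

```python
import shlex

def bwa_all_hits_commands(ref, fq1, fq2, max_hits=600, out_sam="aln.sam"):
    """Return (argv, stdout_path) pairs for an ungapped, report-many-hits run."""
    sai1, sai2 = fq1 + ".sai", fq2 + ".sai"
    return [
        # -o 0: allow zero gap opens (assumed meaning of "ungapped")
        (["bwa", "aln", "-o", "0", ref, fq1], sai1),
        (["bwa", "aln", "-o", "0", ref, fq2], sai2),
        # -n / -N: max hits written to the XA tag for proper / discordant pairs
        (["bwa", "sampe", "-n", str(max_hits), "-N", str(max_hits),
          ref, sai1, sai2, fq1, fq2], out_sam),
    ]

if __name__ == "__main__":
    for argv, out in bwa_all_hits_commands("ref.fa", "reads_1.fq", "reads_2.fq"):
        print(shlex.join(argv), ">", out)
```

Running the same function with and without the -o/-n/-N settings changed gives two command sets whose outputs can then be compared.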

Best regards

maria.b
04-27-2012, 03:14 AM   #2
Location: Cambridge area, UK

Join Date: Jan 2010
Posts: 35


I think the answer to your question depends on the coverage you have.

We have some experience with low-coverage CNV detection in tumour samples. For us, paired-end sequencing is not useful (the only benefit is that we can map a few more reads, but it is not worth the extra cost/time), and we only use uniquely mapped reads.
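
One possible way to keep only uniquely mapped reads, sketched on plain SAM text lines. It assumes the alignments come from bwa aln/samse, which marks unique hits with the XT:A:U tag; when no XT tag is present it falls back to a MAPQ threshold. The function name and the threshold are illustrative choices, not our actual pipeline.

```python
def is_uniquely_mapped(sam_line, min_mapq=1):
    """Return True if a SAM alignment line looks uniquely mapped."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, mapq = int(fields[1]), int(fields[4])
    if flag & 0x4:  # SAM flag bit 0x4: read is unmapped
        return False
    # Optional tags start at column 12; BWA's XT:A: tag is U, R, N or M
    xt_tags = [t for t in fields[11:] if t.startswith("XT:A:")]
    if xt_tags:
        return xt_tags[0] == "XT:A:U"
    # No XT tag (other aligner): fall back to mapping quality
    return mapq >= min_mapq
```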

If you take the ratio of reads in a test and a control sample, that smooths out a lot of the biases (mainly mappability problems). Also, GC correction is very important for some samples. I suspect the parameters used for the alignment are not so crucial, as long as they are the same for test and control.
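
The test/control ratio idea can be sketched as follows, assuming reads have already been counted in the same fixed genomic bins for both samples. The pseudocount and the median-centring step are illustrative assumptions, and GC correction is not included here.

```python
import math
import statistics

def log2_ratios(test_counts, control_counts, pseudocount=0.5):
    """Median-centred log2(test/control) per bin."""
    if len(test_counts) != len(control_counts):
        raise ValueError("test and control must use the same bins")
    raw = [
        # pseudocount avoids log(0) and division by zero in empty bins
        math.log2((t + pseudocount) / (c + pseudocount))
        for t, c in zip(test_counts, control_counts)
    ]
    # Subtracting the median removes the overall sequencing-depth difference
    centre = statistics.median(raw)
    return [r - centre for r in raw]

if __name__ == "__main__":
    test = [100, 100, 200, 100, 100]   # third bin has doubled coverage
    control = [50, 50, 50, 50, 50]
    print([round(r, 2) for r in log2_ratios(test, control)])
```

Bins whose mappability is poor are poor in both samples, so their effect largely cancels in the ratio; a gain or loss in the test sample shows up as a bin that departs from zero.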
stefanoberri
12-11-2012, 04:16 PM   #3
Junior Member
Location: Boston

Join Date: Dec 2012
Posts: 6

I am also very much interested in these questions. I have high coverage (~150x) paired end sequence of the yeast genome. Using default parameters in BWA, I seem to be missing most SV data. So far I have tried Retroseq to map insertions of specific elements. I can find some retrotransposons at their reference location, but not all of them, and nothing novel. I have certain gene constructs that I have inserted in the lab, and I cannot find these insertions in the data, but again I find only the endogenous loci.

Could someone explain a little bit more about the following BWA parameters, or suggest other things to change?

bwa aln -e INT Maximum number of gap extensions, -1 for k-difference mode (disallowing long gaps) [-1]

bwa aln -R INT Proceed with suboptimal alignments if there are no more than INT equally best hits. This option only affects paired-end mapping. Increasing this threshold helps to improve the pairing accuracy at the cost of speed, especially for short reads (~32bp).

bwa sampe -o INT Maximum occurrences of a read for pairing. A read with more occurrences will be treated as a single-end read. Reducing this parameter helps faster pairing. [100000]

bwa sampe -n INT Maximum number of alignments to output in the XA tag for reads paired properly. If a read has more than INT hits, the XA tag will not be written. [3]

bwa sampe -N INT Maximum number of alignments to output in the XA tag for disconcordant read pairs (excluding singletons). If a read has more than INT hits, the XA tag will not be written. [10]
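
To see what -n and -N actually change in the output, it may help to look at the XA tag directly. Below is a small sketch that parses it from a SAM line, assuming the usual BWA layout XA:Z:chr,±pos,CIGAR,NM; for each alternative hit. The function name and the example line are made up for illustration.

```python
def parse_xa(sam_line):
    """Return alternative hits from the XA tag as (chrom, strand, pos, cigar, nm)."""
    for tag in sam_line.rstrip("\n").split("\t")[11:]:
        if not tag.startswith("XA:Z:"):
            continue
        hits = []
        for entry in tag[5:].split(";"):
            if not entry:  # the tag ends with a trailing ';'
                continue
            chrom, pos, cigar, nm = entry.split(",")
            strand = "-" if pos.startswith("-") else "+"
            hits.append((chrom, strand, abs(int(pos)), cigar, int(nm)))
        return hits
    return []  # no XA tag: unique hit, or more hits than the -n/-N limit

if __name__ == "__main__":
    line = ("r1\t0\tchrI\t100\t0\t36M\t*\t0\t0\tACGT\tIIII\t"
            "XT:A:R\tXA:Z:chrII,+1234,36M,0;chrIV,-567,36M,1;")
    for hit in parse_xa(line):
        print(hit)
```

An empty result is ambiguous on its own, because the tag is also omitted when a read has more hits than the limit, so it is worth checking it together with the XT tag or the mapping quality.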
ryanmcg

cnv, mapping, structural variation
