  • mapping parameters for SV/CNV discovery

    Hi!

    I'm beginning to work on SV/CNV discovery. I have found many different methods for this. Most of them take mapped reads as input, but there is little documentation of the mapping parameters that should be used.

    For methods that analyse paired-end mapping abnormalities, it seems that we must report all possible hits, and we must perform single-end alignment even if we have paired-end reads.
    For methods that analyse split-read alignments, it seems that ungapped alignment is required and all hits must be reported.
    Very few programs discuss duplicate reads (should we remove them?) or masking the reference genome (should we mask the reference before or after mapping?).

    I'm not sure that there is a single mapping procedure that can be used for all SV/CNV methods, but hearing your point of view may give me some ideas. So what software do you use for mapping and SV/CNV discovery, and what parameters do you use for the mapping?

    I will begin my tests with BWA, ungapped and reporting all possible hits (-n 600 -N 600), and I will see whether the results differ from those obtained with the default parameters.

    Best regards

    Maria

  • #2
    Hi.

    I think the answer to your question depends on the coverage you have.

    We have some experience with low coverage CNV detection in tumour samples. For us paired end is not useful (the only benefit is that we can map a few more reads, but it is not worth the extra cost/time) and we only use uniquely mapped reads.

    If you take the ratio of reads in a test and a control, that smooths out a lot of the biases (mainly mappability problems). Also, GC correction is very important for some samples. I suspect the parameters used for the alignment are not so crucial, as long as they are the same for test and control.
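A minimal Python sketch of the test/control ratio idea, not the poster's actual pipeline. It assumes reads have already been counted in fixed genomic bins for both samples and that a GC fraction is known for each bin. The function names, the pseudocount, and the stratum-median GC adjustment are all illustrative assumptions; real tools usually fit a smoother GC model, and often correct each sample before taking the ratio.

```python
import math
from statistics import median


def log2_ratios(test_counts, control_counts, pseudocount=0.5):
    """Per-bin log2(test/control), with each sample scaled by its total read count.

    The pseudocount (an assumption of this sketch) avoids log(0) in empty bins.
    """
    test_total = sum(test_counts)
    control_total = sum(control_counts)
    ratios = []
    for t, c in zip(test_counts, control_counts):
        t_norm = (t + pseudocount) / test_total
        c_norm = (c + pseudocount) / control_total
        ratios.append(math.log2(t_norm / c_norm))
    return ratios


def gc_correct(values, gc_fractions, n_strata=10):
    """Crude GC adjustment: subtract the median value of each GC stratum.

    Bins are grouped into n_strata equal-width GC classes. This is a
    simplification; it can remove real signal when a stratum has few bins.
    """
    strata = [min(int(gc * n_strata), n_strata - 1) for gc in gc_fractions]
    by_stratum = {}
    for s, v in zip(strata, values):
        by_stratum.setdefault(s, []).append(v)
    medians = {s: median(vs) for s, vs in by_stratum.items()}
    return [v - medians[s] for s, v in zip(strata, values)]
```

Because both samples go through the same alignment, a bin with poor mappability loses reads in test and control alike, and the effect largely cancels in the ratio. That is why matching parameters between the two matters more than the exact values.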

    • #3
      I am also very much interested in these questions. I have high coverage (~150x) paired end sequence of the yeast genome. Using default parameters in BWA, I seem to be missing most SV data. So far I have tried Retroseq to map insertions of specific elements. I can find some retrotransposons at their reference location, but not all of them, and nothing novel. I have certain gene constructs that I have inserted in the lab, and I cannot find these insertions in the data, but again I find only the endogenous loci.

      Could someone explain a little bit more about the following BWA parameters, or suggest other things to change?

      bwa aln -e INT Maximum number of gap extensions, -1 for k-difference mode (disallowing long gaps) [-1]

      bwa aln -R INT Proceed with suboptimal alignments if there are no more than INT equally best hits. This option only affects paired-end mapping. Increasing this threshold helps to improve the pairing accuracy at the cost of speed, especially for short reads (~32bp).

      bwa sampe -o INT Maximum occurrences of a read for pairing. A read with more occurrences will be treated as a single-end read. Reducing this parameter helps faster pairing. [100000]

      bwa sampe -n INT Maximum number of alignments to output in the XA tag for reads paired properly. If a read has more than INT hits, the XA tag will not be written. [3]

      bwa sampe -N INT Maximum number of alignments to output in the XA tag for disconcordant read pairs (excluding singletons). If a read has more than INT hits, the XA tag will not be written. [10]
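To see what raising -n and -N actually changes, it can help to look at the XA tag directly. The sketch below is an illustration only. It assumes the XA value is a list of `chr,pos,CIGAR,NM;` entries with a signed position giving the strand, and the helper names `parse_xa` and `alt_hits_from_sam_line` are made up for this example.

```python
def parse_xa(xa_value):
    """Parse an XA value such as 'chr1,+100,36M,0;chr2,-200,36M,1;'.

    Returns one dict per alternative hit. The sign of the position is
    taken as the strand, and NM is the edit distance.
    """
    hits = []
    for entry in xa_value.split(";"):
        if not entry:
            continue  # skip the empty piece after the trailing ';'
        # rsplit keeps any commas inside the reference name intact
        chrom, pos, cigar, nm = entry.rsplit(",", 3)
        hits.append({
            "chrom": chrom,
            "strand": "-" if pos.startswith("-") else "+",
            "pos": abs(int(pos)),
            "cigar": cigar,
            "nm": int(nm),
        })
    return hits


def alt_hits_from_sam_line(line):
    """Return the alternative hits of one SAM record, or [] if it has no XA tag."""
    fields = line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional tags start at column 12
        if field.startswith("XA:Z:"):
            return parse_xa(field[5:])
    return []
```

A read with more hits than the -n or -N limit gets no XA tag at all, so an empty result here can mean either a unique hit or a read too repetitive to report. Counting records with and without XA before and after changing the limits shows which case applies.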
