  • Stampy - Mapping multiple individuals to genome

    Hi everybody,

    I am very new to this field, so I apologize if the following questions are too basic.
    I am trying to use Stampy to map my RAD-seq reads to a reference genome.
    I have double-digest, paired-end data from multiple individuals of 6 species, and the genome of one of those species.

    I created the genome file and the hash table. According to the manual, when mapping paired-end data to a genome the command should look something like this:

    ./stampy.py -g hg18 -h hg18 -M solexareads_1.fastq solexareads_2.fastq

    My questions are:

    1) Is it possible to map multiple individuals to the genome in one go, or should I map each individual separately? Could I use, for example, something like:
    ./stampy.py -g hg18 -h hg18 -M ind1_1.fastq ind1_2.fastq ind2_1.fastq ind2_2.fastq ind3_1.fastq ind3_2.fastq

    2) Which option specifies the number of mismatches allowed between the reads and the genome? I could not work it out from the help.

    Any help would be greatly appreciated

    Many many thanks
    Vivi
    Last edited by vivi7; 03-18-2014, 12:39 AM.

  • #2
    Hi again,

    I am posting the reply I got from Gerton Lunter, so that anyone else new to the field who runs into the same problems can hopefully find an answer here:

    '1. No, you should map one sample in one go. If you want, you can merge the BAM files afterwards; if you want to do that, you should "tag" each set of reads with a unique "read group" identifier; the --help will tell you how to do this.

    2. This is not possible, but also not necessary. Stampy will map reads if it finds a location in the genome that is more similar to the read than would be expected purely by chance for the best-matching locus in a fully random genome of this size -- up to some threshold that is implicitly set by the algorithm, and corresponds to ~10-15% divergence.

    Hope this helps!
    Best wishes

    Gerton'
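
For anyone wanting to script the advice in point 1, here is a rough sketch of a per-individual loop. It is a dry run that only prints the commands, so nothing is mapped until you run the printed lines yourself. The sample names (ind1, ind2, ind3), the <sample>_1.fastq / <sample>_2.fastq naming scheme, and the hg18 prefix are placeholders taken from my question. I believe --readgroup and -o are the relevant Stampy options, but please treat that as an assumption and check ./stampy.py --help for the exact syntax in your version.

```shell
#!/bin/sh
# Dry run: prints one Stampy command per individual instead of executing it.
# Assumptions (not from the reply above): reads are named <sample>_1.fastq and
# <sample>_2.fastq, and --readgroup / -o behave as described in ./stampy.py --help.

GENOME=hg18   # prefix of the genome and hash files, as in the manual example

# Build the mapping command for a single individual.
stampy_cmd() {
    sample=$1
    printf './stampy.py -g %s -h %s --readgroup=ID:%s,SM:%s -o %s.sam -M %s_1.fastq %s_2.fastq\n' \
        "$GENOME" "$GENOME" "$sample" "$sample" "$sample" "$sample" "$sample"
}

# One command per individual, each tagged with its own read group.
for sample in ind1 ind2 ind3; do
    stampy_cmd "$sample"
done
```

Once each SAM file has been converted to a sorted BAM, the per-individual files can be combined with samtools merge, and the read group tags keep the individuals distinguishable in the merged file.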
