  • Resequencing & variant calling issue

    Dear all,

    I have to analyze a set of samples that come from a re-sequencing project.
    I've been given the coordinates where the genes under analysis (those I should carry out a variant calling analysis on) are located.
    Since some reads map spuriously to other places in the reference genome (due to the DNA amplification process), I need a way to extract only the reads that map to the coordinates I was given.
    -Can anybody tell me about any tool or R package able to perform this read filtering?
    -Can the filtering step be avoided by any means (e.g. by passing the coordinates to a program that would then do the variant calling without a separate read-filtering step)?
    -Any suggestions about which software I should use for the variant calling after that filtering (Samtools, GATK, or any other you may know about)?

    Thanks in advance

    JL

  • #2
    You can use BEDTools to filter a .bam based on a .bed file of target coordinates.

    So align everything to the whole genome, for best accuracy, then filter for the reads that hit your targets.
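
    As a rough illustration of what that filtering step does conceptually: a read is kept if its aligned interval overlaps at least one target interval on the same chromosome. This is only a sketch in Python, not how BEDTools is implemented. The `Interval` type, the function names and the coordinates are all made up for the example, and half-open 0-based coordinates are assumed, as in BED files.

```python
from typing import Iterable, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def filter_reads(reads: Iterable[Interval], targets: List[Interval]) -> List[Interval]:
    """Keep only the reads that overlap at least one target region."""
    return [r for r in reads if any(overlaps(r, t) for t in targets)]


# Hypothetical target region and read alignments, for illustration only.
targets = [Interval("chr1", 1000, 2000)]
reads = [
    Interval("chr1", 950, 1050),   # overlaps the start of the target: kept
    Interval("chr1", 2000, 2100),  # starts exactly at the exclusive end: dropped
    Interval("chr2", 1200, 1300),  # matching coordinates, other chromosome: dropped
]
kept = filter_reads(reads, targets)
print(kept)
```

    On real data the same idea is applied to the alignments in the .bam and the regions in the .bed, which is the job BEDTools does for you.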


    • #3
      @swbarnes2, thank you very much for your kind help.
      I'd like to ask you a question about bedtools, if I may. I've been reading the descriptions of the different tools in the suite (http://bedtools.readthedocs.org/en/l...ols-suite.html), but I've been unable to find one that does exactly what you suggested. The closest one (to my understanding) is the "intersect" tool, but it still doesn't seem to do the filtering you described. So, would you please tell me the name of the specific tool you had in mind?

      Thanks in advance

      JL


      • #4
        I use intersect on exome capture data all the time. You give it a .bam file and a .bed of target regions, and it gets rid of reads that don't intersect the .bed regions.

        If that's not what you want, then I don't get your question.


        • #5
          Dear swbarnes2,

          You're right about everything: the program I need is intersectBed. I got it wrong because I didn't understand what it does from the online instructions I referred to yesterday. I've downloaded the Bedtools manual PDF and re-read it, and I found that intersectBed was the tool you were talking about (as you made clear in your previous answer). It seems to be exactly what I need.

          Thank you very much

          JL
          Last edited by Jluis; 01-31-2013, 04:08 AM. Reason: syntaxis


          • #6
            @swbarnes2

            I've followed your instructions and then added a couple of further steps to complete the whole analysis (as far as my limited experience with this kind of analysis allows).

            Do you consider this a proper workflow for a VCF analysis, or do you think a step is missing?

            -Read mapping
            -Bedtools intersect to extract the reads mapped to the regions of interest
            -SNP/indel calling using samtools mpileup/bcftools view, then bcftools view/vcfutils.pl
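
            The last two steps above could be written out roughly as follows. This is a hedged sketch, not a tested pipeline: the small Python helper only prints the command lines and runs nothing, the file names and the `build_pipeline` function are hypothetical, and the flags follow the samtools 0.1.x-era documentation (`-D100` is just the example depth cutoff from there and should be adjusted). Newer releases moved variant calling to `bcftools call`, so the commands should be checked against the installed versions. The mapping step is left out because the thread does not say which mapper is used.

```python
import shlex
from typing import List


def build_pipeline(ref: str, bam: str, bed: str, prefix: str) -> List[str]:
    """Return shell command lines for filtering and calling (nothing is executed)."""
    q = shlex.quote
    filtered = f"{prefix}.on_target.bam"  # hypothetical output names
    raw_bcf = f"{prefix}.raw.bcf"
    flt_vcf = f"{prefix}.flt.vcf"
    return [
        # Keep only alignments that overlap the target regions.
        f"bedtools intersect -abam {q(bam)} -b {q(bed)} > {q(filtered)}",
        # Pile up the filtered reads and call SNPs/indels (samtools 0.1.x style).
        f"samtools mpileup -uf {q(ref)} {q(filtered)} | bcftools view -bvcg - > {q(raw_bcf)}",
        # Convert to VCF and apply a simple maximum-depth filter.
        f"bcftools view {q(raw_bcf)} | vcfutils.pl varFilter -D100 > {q(flt_vcf)}",
    ]


commands = build_pipeline("ref.fa", "aligned.bam", "targets.bed", "sample1")
for cmd in commands:
    print(cmd)
```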

            Thanks in advance for your advice
