  • VCF genotypes for RILs

    I was wondering if there is any way to change the assumptions used for genotype calling in .vcf files produced by mpileup in samtools. I am working with a diploid organism, but the individuals are mostly homozygous recombinant inbred lines (or highly inbred lines) with only about 1% residual heterozygosity. When a SNP is called at low coverage (1-3 reads) and only one allele is observed in a sample, the caller often assumes the individual is heterozygous if the observed allele is the less common allele.

    The underlying issue is that the caller assumes all the loci in my individuals are in Hardy-Weinberg equilibrium (HWE), when in fact, by experimental design, they are nowhere close to HWE and most loci are going to be homozygous. Filtering on genotype call quality reduces the problem but also discards much of the data.

    Of course, sequencing to a high depth would solve this with the existing tools, but when I expect >99% homozygous individuals at each locus that should not be necessary, as one or two "A" reads should be enough to predict an AA genotype.
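
To make the effect described above concrete, here is a minimal sketch of Bayesian genotype calling from read counts under two priors: plain Hardy-Weinberg, and a prior adjusted by an inbreeding coefficient F. This is a simplified textbook model, not the algorithm that bcftools implements. The function names, the 1% per-read error rate, and the example alternate-allele frequency of 0.1 are all illustrative assumptions.

```python
from math import comb


def genotype_priors(q, f=0.0):
    """Prior genotype probabilities for alt-allele frequency q.

    f is the inbreeding coefficient: f=0 gives Hardy-Weinberg proportions,
    f close to 1 removes almost all heterozygotes (assumed model for RILs).
    """
    p = 1.0 - q
    return {
        "RR": p * p + f * p * q,
        "RA": 2.0 * p * q * (1.0 - f),
        "AA": q * q + f * p * q,
    }


def read_likelihoods(n_ref, n_alt, err=0.01):
    """Binomial likelihood of the observed read counts for each genotype."""
    # Probability that a single read shows the alt allele, per genotype.
    p_alt = {"RR": err, "RA": 0.5, "AA": 1.0 - err}
    n = n_ref + n_alt
    return {
        g: comb(n, n_alt) * pa ** n_alt * (1.0 - pa) ** n_ref
        for g, pa in p_alt.items()
    }


def call_genotype(n_ref, n_alt, q, f=0.0, err=0.01):
    """Return (most probable genotype, posterior probabilities)."""
    priors = genotype_priors(q, f)
    liks = read_likelihoods(n_ref, n_alt, err)
    unnorm = {g: priors[g] * liks[g] for g in priors}
    total = sum(unnorm.values())
    post = {g: v / total for g, v in unnorm.items()}
    return max(post, key=post.get), post


if __name__ == "__main__":
    # One read, showing the rarer allele (assumed frequency 0.1).
    print(call_genotype(0, 1, q=0.1))          # Hardy-Weinberg prior
    print(call_genotype(0, 1, q=0.1, f=0.99))  # inbred-line prior
```

Under these assumptions, a single read showing the rarer allele is called heterozygous (RA) with the Hardy-Weinberg prior and homozygous (AA) once F is set to 0.99, which mirrors the behaviour described in the question.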

  • #2
    Could you post what arguments you are using? This is a question I am very interested in knowing the answer to.

    • #3
      I used bowtie2 for the alignments, followed by samtools to create sorted BAM files.

      samtools mpileup -BuDf Refseq.fa differentsorted.bam(100 separate files) | bcftools view -bvcg - > out.bcf

      bcftools view -N output.bcf > output.vcf

      I have also used the vcftools option --geno-depth on the .vcf file, but the results are all -1 (missing data).

      I have tried various other permutations, with similar results.

      • #4
        That's what I figured... I wonder what would happen if you didn't use the -c argument when you run bcftools. It automatically invokes the -e argument, which performs the test for Hardy-Weinberg equilibrium:

        Consensus/Variant Calling Options:
        -c Call variants using Bayesian inference. This option automatically invokes option -e.

        -d FLOAT When -v is in use, skip loci where the fraction of samples covered by reads is below FLOAT. [0]

        -e Perform max-likelihood inference only, including estimating the site allele frequency, testing Hardy-Weinberg equilibrium and testing associations with LRT.




        Maybe try this instead.
        Code:
        samtools mpileup -BuDf Refseq.fa differentsorted.bam(100 separate files) | bcftools view -bvg - > out.bcf
        I'd be interested to know how this affects the results. I have never run bcftools without the -c argument.

        PS. I see you're in Athens, GA. If you wouldn't mind, I'd like to ask you a few questions. I am starting a post-doc at UGA in Aug.
        Last edited by chadn737; 04-22-2013, 02:38 PM.

        • #5
          Thanks so much, guess I have to re-run that 2 week mpileup.

          • #6
            Could you run it on one or two files instead and test it? 2 weeks is a long time to try something new out if you don't know what the result will be.

            • #7
              Yes, I am planning on doing so. But right now our computer cluster is having disk issues, so I don't expect quick results.

              I think that I will have to use only the -b option (not -bvg) with bcftools view, as -v and -g both invoke -c.
              Last edited by jebowers; 04-22-2013, 03:22 PM. Reason: x
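
A quick test on a couple of files, as suggested above, could be scripted along these lines. This is only a sketch that assembles the command line as a string, reusing the flags quoted in this thread. The helper name `build_test_pipeline`, the output name `test.bcf`, and the BAM file names are hypothetical, and whether `bcftools view -b` alone gives the desired genotypes is exactly what the test run would have to show.

```python
import shlex


def build_test_pipeline(reference, bam_files, view_flags="-b",
                        out_path="test.bcf", max_files=2):
    """Build the mpileup | bcftools command for a small subset of BAM files.

    Only the first max_files BAMs are used so the trial finishes quickly.
    The command is returned as a string; nothing is executed here.
    """
    subset = list(bam_files)[:max_files]
    mpileup = ["samtools", "mpileup", "-BuDf", reference, *subset]
    view = ["bcftools", "view", view_flags, "-"]
    return f"{shlex.join(mpileup)} | {shlex.join(view)} > {shlex.quote(out_path)}"


if __name__ == "__main__":
    bams = ["line001.sorted.bam", "line002.sorted.bam", "line003.sorted.bam"]
    # Variant without -c, -v, or -g, as proposed in post #7.
    print(build_test_pipeline("Refseq.fa", bams))
    # Original flags from post #3, for comparison on the same subset.
    print(build_test_pipeline("Refseq.fa", bams, view_flags="-bvcg",
                              out_path="test_bvcg.bcf"))
```

Running both variants on the same two files makes it possible to compare the genotype calls directly before committing to another full run.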

              • #8
                You're right, my bad, I should have read that a bit more closely.

                • #9
                  Hi,
                  I got around a similar problem (I'm working with the yeast equivalent of RI lines) by using Freebayes, which has an option for ploidy. This allows you to genotype your RI samples as if they were haploids.
                  However, in my experience, low coverage will result in poor genotype calls.
                  Cheers,

                  Miguel
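
The haploid idea, and the caveat about low coverage, can be illustrated with a small sketch. It treats each sample as carrying a single allele (R or A) and computes the posterior probability of the alt allele from the read counts. This is a simplified model and not what Freebayes computes internally. The function names, the 1% error rate, and the prior alt-allele frequency of 0.1 are assumptions for illustration.

```python
from math import log10


def haploid_posterior_alt(n_ref, n_alt, q, err=0.01):
    """Posterior probability that a haploid sample carries the alt allele.

    q is the prior alt-allele frequency; err is the per-read error rate.
    With only two possible genotypes there is no heterozygous class to
    absorb reads of the rarer allele.
    """
    lik_ref = (1.0 - err) ** n_ref * err ** n_alt
    lik_alt = err ** n_ref * (1.0 - err) ** n_alt
    num = q * lik_alt
    return num / (num + (1.0 - q) * lik_ref)


def phred(p_wrong):
    """Phred-scaled quality for a given probability of a wrong call."""
    return -10.0 * log10(max(p_wrong, 1e-300))


if __name__ == "__main__":
    # Confidence in an alt call as the number of supporting reads grows.
    for depth in (1, 2, 3):
        post = haploid_posterior_alt(0, depth, q=0.1)
        print(depth, round(post, 6), round(phred(1.0 - post), 1))
```

In this toy model a single alt read already favours the alt allele, but the call quality stays modest until a second or third read agrees, which is consistent with the warning that low coverage gives poor genotype calls.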
