
  • #31
    and ANNOVAR

    and SeattleSeq Annotation

    both for annotation of SNP variants, i.e. nonsense, synonymous, splice, intronic, found in 1000 Genomes, HapMap frequency, which gene, amino acid change (from and to, with position), etc.

    Comment


    • #32
      Hi,
      Most of the tools for variant annotation seem to target human data. I work on a plant species for which we only have a draft genome that is not yet annotated. I have done an RNA-seq experiment and found several SNPs using samtools. I got a GTF file by aligning reads against the genome sequence with bowtie and tophat. The GTF file only has transcript information and no ORF or CDS information. Does anyone have a script that takes the SNP positions, the annotations from a GTF file, and the genome sequence in FASTA format and predicts whether the SNPs are synonymous or non-synonymous?
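
For reference, the core of such a script might look like the sketch below. It is only an illustration, not an existing tool: it assumes the coding sequence and its reading frame are already known (which a transcript-only GTF does not provide, so an ORF would have to be predicted first), that the sequence is on the forward strand, and that the standard genetic code applies. The function name `classify_snp` and its arguments are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(cds, offset, alt):
    """Classify a SNP at 0-based `offset` within an in-frame coding sequence.

    Returns (kind, ref_aa, alt_aa, codon_number), where codon_number is 1-based.
    Assumes `cds` starts at codon position 1 and is on the forward strand.
    """
    cds = cds.upper()
    alt = alt.upper()
    start = offset - offset % 3          # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("SNP falls in an incomplete codon")
    pos = offset % 3                     # position of the SNP inside the codon
    alt_codon = ref_codon[:pos] + alt + ref_codon[pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "non-synonymous"
    return kind, ref_aa, alt_aa, offset // 3 + 1


# Toy coding sequence ATG GCT TAA; change the third base of the second codon.
print(classify_snp("ATGGCTTAA", 5, "C"))
```

Mapping a genomic SNP position to an offset in the spliced coding sequence (using exon coordinates from the GTF and the FASTA sequence) is the part this sketch leaves out.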

      Comment


      • #33
        Does Dindel also do anchor-split mapping, like Pindel? Or do the indels discovered by Dindel have to be supported by at least one read mapped by an aligner such as bwa/novoalign?

        Comment


        • #34
          GAMES (according to the paper) uses MySQL queries to go against UCSC; presumably you could easily reroute that to your own data.

          A quick look at ANNOVAR suggests it is all flat file based; create your own flat files in the right format and it should work.

          Comment


          • #35
            Originally posted by qqcandy
            Does Dindel also do anchor-split mapping, like Pindel? Or do the indels discovered by Dindel have to be supported by at least one read mapped by an aligner such as bwa/novoalign?
            You'd better run Pindel first and then Dindel.

            Comment


            • #36
              Hi Heng,

              Originally posted by lh3
              3. Cap base quality BAQ (with samtools).
              What do you mean by "cap base quality"?

              Can you give more detailed suggestions about how to effectively use BAQ?

              We would like to include it in our pipeline but are unsure about how best to utilize it.

              Thanks.
              Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
              Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
              Projects: U87MG whole genome sequence [Website] [Paper]

              Comment


              • #37
                Originally posted by Michael.James.Clark
                Hi Heng,
                What do you mean by "cap base quality"?

                Can you give more detailed suggestions about how to effectively use BAQ?

                We would like to include it in our pipeline but are unsure about how best to utilize it.
                Thanks.
                Here.
                -drd

                Comment


                • #38
                  Originally posted by drio
                  Woosh, thanks. For some reason I was thinking there was more to it (probably because I've been trying half a dozen different approaches, each with ten times as many parameters as the BAQ step).

                  Comment


                  • #39
                    usage of samtools calmd for calculating BAQ

                    Originally posted by drio
                    I just did what was recommended "here" and got the SVN version of samtools that should contain the calmd that calculates BAQ. But now I'm very confused about the parameters when I call this new samtools_svn_816 calmd [Version: 0.1.9-10 (r816)]:

                    Usage: samtools fillmd [-eubrS] <aln.bam> <ref.fasta>

                    Options: -e change identical bases to '='
                    -u uncompressed BAM output (for piping)
                    -b compressed BAM output
                    -S the input is SAM with header
                    -r read-independent local realignment

                    i.e. the same as in Version 0.1.9 (r783)

                    According to the manual, there is a difference (highlighted in red):

                    calmd: samtools calmd [-eubSr] [-C capQcoef] <aln.bam> <ref.fasta>

                    OPTIONS:
                    -e Convert the read base to = if it is identical to the aligned reference base. The indel caller does not support the = bases at the moment.
                    -u Output uncompressed BAM
                    -b Output compressed BAM
                    -S The input is SAM with header lines
                    -C INT Coefficient to cap mapping quality of poorly mapped reads. See the pileup command for details. [0]
                    -r Perform probabilistic realignment to compute BAQ, which will be used to cap base quality.

                    Can anyone enlighten me whether just the usage message is not up to date or whether there is another version in the SVN?

                    Thank you in advance

                    Barbara
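
On the "cap base quality" wording in the -r description quoted above, here is a minimal sketch of what capping would mean, assuming each base quality is replaced by the smaller of its original value and its BAQ. The function name and the quality values are made up for illustration; this is not samtools code.

```python
def cap_base_qualities(base_quals, baq_values):
    """Return per-base qualities capped at the corresponding BAQ value."""
    if len(base_quals) != len(baq_values):
        raise ValueError("need one BAQ value per base")
    # A base keeps its original quality unless its BAQ is lower.
    return [min(q, b) for q, b in zip(base_quals, baq_values)]


original = [30, 30, 30, 30]   # hypothetical Phred base qualities
baq = [40, 12, 35, 5]         # hypothetical per-base alignment qualities
print(cap_base_qualities(original, baq))
```

Under this reading, a base that sits in a poorly aligned region (low BAQ) is downweighted even if the sequencer reported it with high confidence, while well-aligned bases are left unchanged.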

                    Comment


                    • #40
                      Originally posted by Michael.James.Clark
                      What tools do people use for coding consequence determination after all of this?
                      This morning Dongliang Ge demonstrated his SequenceVariantAnalyzer at our institute. Sounds promising. It has very nice viewing functionality, but uses lots of memory.

                      I am not the person doing these analyses. People here use PolyPhen, SeattleSeq, SIFT, and I don't know what else.

                      Comment


                      • #41
                        Originally posted by Bruins
                        This morning Dongliang Ge demonstrated his SequenceVariantAnalyzer at our institute. Sounds promising. It has very nice viewing functionality, but uses lots of memory.

                        I am not the person doing these analyses. People here use PolyPhen, SeattleSeq, SIFT, and I don't know what else.
                        We have used it and I liked it, but unfortunately when we tried it a couple months back, it only took hg18 and annotated SNPs using dbSNP127, which is a little outdated by now. Still a very nice program, though.

                        Comment
