  • samtools pileup for multiple diploid individuals?

    Is it appropriate to use samtools pileup (which uses maq's consensus- and SNP-calling model) on pooled reads from multiple diploid individuals? I'm looking for SNPs both within the read population, and between the reads and the reference. I've got 6 individuals in separate samples from a species close to the reference, at low enough depth where I should probably just forget about mapping each individual separately (in other words, I can forget about calling a genotype for each individual). But I'd still like to call the most likely consensus and SNP for this population ...


    • Does anyone know of a tool or program to convert SAM output to BED format? If so, please let me know.
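
For readers who want to see what such a conversion involves, here is a minimal sketch in Python. It is not the method of any tool named in this thread. It assumes tab-separated SAM lines with numeric FLAG values, and the function name `sam_line_to_bed` is made up for illustration. The interval end is derived from the CIGAR operations that consume reference bases (M, D, N, = and X in the SAM specification).

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")

def sam_line_to_bed(line):
    """Convert one SAM alignment line to a BED6 line, or None if unmapped.

    Hypothetical helper: assumes tab-separated fields and a numeric FLAG.
    """
    fields = line.rstrip("\n").split("\t")
    qname, flag, rname, pos, mapq, cigar = fields[:6]
    flag = int(flag)
    if flag & 0x4 or rname == "*" or cigar == "*":
        return None  # unmapped reads have no interval on the reference
    ref_len = sum(int(n) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
                  if op in REF_CONSUMING)
    start = int(pos) - 1              # SAM is 1-based, BED is 0-based half-open
    end = start + ref_len
    strand = "-" if flag & 0x10 else "+"
    return "\t".join([rname, str(start), str(end), qname, mapq, strand])

# Made-up read name; position and CIGAR borrowed from a record later in the thread.
example = "read1\t16\tChr12\t49\t23\t11S7M1D12M6S\t*\t0\t0\tACGT\tBBBB"
print(sam_line_to_bed(example))
```

Soft-clipped bases (S) do not consume the reference, so they are left out of the interval.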


      • Originally posted by zee
        Is there a way to convert a SAM consensus output (using -c option for pileup) to the old maq-style .cns consensus?

        I have some maq-based pipelines I would like to use on my BWA results.
        Has anyone had any luck with this? In addition, MAQ can dump unmapped reads into a separate file; is there such a function or tool in samtools? Thank you.


        • bekkari:

          I believe ConvertToBed.jar in the Vancouver Short Read Analysis Package can do it.

          Anthony
          The more you know, the more you know you don't know. —Aristotle


          • If you use Bowtie as the aligner, there is an option (--un) to dump all unmapped reads into a file.


              • Originally posted by bekkari
                If you use Bowtie as the aligner, there is an option (--un) to dump all unmapped reads into a file.

              But I'm using BWA. Is there an unmapped flag in the sam file?
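
On the question of an unmapped flag: the SAM specification defines bit 0x4 of the FLAG field as "read unmapped" and bit 0x8 as "mate unmapped", and `samtools view -f 4` selects records with bit 0x4 set. The sketch below separates mapped from unmapped records in plain SAM text. It assumes numeric FLAG values and tab-separated lines, and `split_unmapped` is an illustrative name, not a samtools function.

```python
UNMAPPED = 0x4        # this read is unmapped
MATE_UNMAPPED = 0x8   # the mate of this read is unmapped

def split_unmapped(sam_lines):
    """Return (mapped, unmapped) lists of alignment lines, skipping @ headers."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line, not an alignment record
        flag = int(line.split("\t")[1])
        if flag & UNMAPPED:
            unmapped.append(line)
        else:
            mapped.append(line)
    return mapped, unmapped
```

Whether a given BWA version sets this bit on every unaligned read is worth checking against your own output.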


              • Partial pileup for samtools?

                samtools pileup takes ages, and so does varFilter.

                Can samtools pileup work on one chromosome at a time? That would make
                parallelization much easier.
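
One way to picture the per-chromosome idea is to partition the alignment records by reference name before running anything downstream. This is a rough sketch on plain SAM text, not a samtools feature. It assumes tab-separated lines, and `split_by_chromosome` is a hypothetical helper.

```python
from collections import defaultdict

def split_by_chromosome(sam_lines):
    """Group SAM alignment lines by RNAME (column 3); return (header, groups)."""
    header, by_chrom = [], defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            header.append(line)          # keep header lines for every chunk
        else:
            rname = line.split("\t")[2]  # "*" collects the unmapped reads
            by_chrom[rname].append(line)
    return header, dict(by_chrom)
```

Each group, written out with the header, could then be processed as a separate job. For a sorted, indexed BAM file, asking `samtools view` for a region achieves the same split without loading everything into memory.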


                  • properly mapped Flag

                    Perhaps I have misunderstood, but isn't it right that the properly mapped flag (the P in the flag string) is only set when both reads of a pair map to the same chromosome with the correct insert size? About 6% of my reads carrying the P flag have a mate mapped to a different chromosome, with an insert size of 0, as shown below. Has anyone seen this before?

                    EBRI093151:1:90:555:299#0 pPR1 Chr11 107308221 23 36M Chr12 49 0 TATCCTATTCGAAAGTCGCCATGACCGTGGACATGA BCCBCBBACCB?CBA@BBACCBCAAB<6<?BABBBB XT:A:U NM:i:0 SM:i:23 AM:i:23 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:36

                    EBRI093151:1:90:555:299#0 pPr2 Chr12 49 23 11S7M1D12M6S Chr11 107308221 0 CTACCGCTTGGGTGGTCATGAATGATTAGCACGCCC AB99@B=BBA>ACCBCCBBCBBCBBBBCBCBB@B@A XT:A:M NM:i:4 SM:i:23 AM:i:23 XM:i:3 XO:i:1 XG:i:1 MD:Z:3A3^T2T5C3


                    • Try the latest version of bwa.


                      • Originally posted by lh3
                        Try the latest version of bwa.
                        Heng,

                        This was done with BWA 0.5.4.

                        For a resequencing project, does it really matter if the mates are not properly mapped? Can I instead just filter out reads with low mapping quality?

                        Thank you.


                        • Originally posted by zlu
                          Perhaps I have misunderstood, but isn't it right that the properly mapped flag (the P in the flag string) is only set when both reads of a pair map to the same chromosome with the correct insert size? About 6% of my reads carrying the P flag have a mate mapped to a different chromosome, with an insert size of 0, as shown below. Has anyone seen this before?

                          EBRI093151:1:90:555:299#0 pPR1 Chr11 107308221 23 36M Chr12 49 0 TATCCTATTCGAAAGTCGCCATGACCGTGGACATGA BCCBCBBACCB?CBA@BBACCBCAAB<6<?BABBBB XT:A:U NM:i:0 SM:i:23 AM:i:23 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:36

                          EBRI093151:1:90:555:299#0 pPr2 Chr12 49 23 11S7M1D12M6S Chr11 107308221 0 CTACCGCTTGGGTGGTCATGAATGATTAGCACGCCC AB99@B=BBA>ACCBCCBBCBBCBBBBCBCBB@B@A XT:A:M NM:i:4 SM:i:23 AM:i:23 XM:i:3 XO:i:1 XG:i:1 MD:Z:3A3^T2T5C3
                          I don't see anywhere in the specification how to set the "properly paired" bit. I would guess this is aligner dependent.
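
To measure how often this happens in a given file, a small scan like the one below may help. It is a sketch under several assumptions: lines are tab-separated, FLAG is numeric rather than the letter string shown above (99 and 147 would be the numeric forms of pPR1 and pPr2 if the letters map to the standard bits), and `suspicious_proper_pairs` is a made-up name.

```python
PAIRED = 0x1       # read is paired in sequencing
PROPER_PAIR = 0x2  # aligner considers the pair properly aligned

def suspicious_proper_pairs(sam_lines):
    """Yield lines flagged as proper pairs whose mate lies on another reference."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        f = line.rstrip("\n").split("\t")
        flag, rname, rnext = int(f[1]), f[2], f[6]
        # RNEXT of "=" means same reference; "*" means no mate information.
        mate_elsewhere = rnext not in ("=", "*", rname)
        if flag & PROPER_PAIR and mate_elsewhere:
            yield line
```

Dividing the number of yielded lines by the number of records with bit 0x2 set gives the fraction in question.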


                          • varFilter output

                            Hi,

                            I have used samtools with varFilter to analyse variants. I imported a BWA alignment in SAM format, sorted it, and ran:
                            1. samtools pileup -vcf ...
                            2. samtools.pl varFilter...| awk '$6>=20' ...

                            It ran, but I have trouble interpreting the columns. My understanding is:
                            column 1: chromosome
                            column 2: first base coordinate from the ref.
                            column 3: ref. base
                            column 4: consensus base
                            column 5: ???
                            column 6: mapping quality
                            column 7: ???
                            column 8: read depth
                            column 9: read base column
                            column 10: ???

                            Does anybody know what values are in columns 5, 7 and 10? I could not find this information.
                            Last edited by suseq; 11-23-2009, 01:17 AM.
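
For reference, the samtools manual of that era describes the `pileup -c` consensus line as: chromosome, 1-based coordinate, reference base, consensus base, consensus quality, SNP quality, RMS mapping quality, read depth, read bases, and base qualities. If that layout applies to the version used here, column 5 is the consensus quality, column 6 is the SNP quality (not the mapping quality), column 7 is the RMS mapping quality, and column 10 holds the base qualities. The sketch below names the fields accordingly. `PileupSite` and `parse_pileup_c_line` are illustrative names, and indel lines, which carry extra columns, are not handled.

```python
from collections import namedtuple

# Field names follow the column order recalled from the samtools manual
# for "pileup -c"; check them against the version you are running.
PileupSite = namedtuple("PileupSite", [
    "chrom", "pos", "ref_base", "consensus_base", "consensus_qual",
    "snp_qual", "rms_mapq", "depth", "read_bases", "base_quals"])

def parse_pileup_c_line(line):
    """Parse one 10-column consensus pileup line (SNP/site lines only)."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 10:
        raise ValueError("expected 10 columns, got %d" % len(f))
    return PileupSite(f[0], int(f[1]), f[2], f[3],
                      int(f[4]), int(f[5]), int(f[6]), int(f[7]),
                      f[8], f[9])
```

Under this reading, the `awk '$6>=20'` step in the post keeps sites with an SNP quality of at least 20.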




                            • and



                              the pileup command.


                              • For a genome assembly project, will removing duplicates (with samtools rmdup) and filtering out reads with low mapping quality (e.g. mapQ < 10) improve my assembly? What do people usually do for QC after mapping with, e.g., bwa?

                                A slightly different issue: does it matter if two FASTQ files (from two Solexa runs) have exactly identical read headers? How does samtools sort the BAM file? Does it take the header IDs into consideration?

                                Thank you.
