
  • samtools pileup for multiple diploid individuals?

    Is it appropriate to use samtools pileup (which uses maq's consensus- and SNP-calling model) on pooled reads from multiple diploid individuals? I'm looking for SNPs both within the read population, and between the reads and the reference. I've got 6 individuals in separate samples from a species close to the reference, at low enough depth where I should probably just forget about mapping each individual separately (in other words, I can forget about calling a genotype for each individual). But I'd still like to call the most likely consensus and SNP for this population ...

    Comment


    • Does anyone know of a tool or program that converts SAM output to BED format? If so, please let me know.
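For anyone who just needs the intervals, here is a minimal Python sketch of the conversion, not a replacement for a dedicated tool. It assumes tab-separated SAM lines with a numeric FLAG, and the function name `sam_line_to_bed` is made up for illustration. The end coordinate is taken from the CIGAR operations that consume reference bases (M, D, N, =, X), and the start is shifted because SAM positions are 1-based while BED starts are 0-based.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")

def sam_line_to_bed(line):
    """Convert one SAM alignment line to a BED6 tuple, or None if unmapped."""
    fields = line.rstrip("\n").split("\t")
    qname, flag, rname, pos, mapq, cigar = fields[:6]
    flag = int(flag)
    if flag & 0x4 or rname == "*" or cigar == "*":
        return None  # unmapped read: there is no interval to report
    ref_len = sum(int(n) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
                  if op in REF_CONSUMING)
    start = int(pos) - 1      # SAM is 1-based, BED is 0-based
    end = start + ref_len     # BED end coordinate is exclusive
    strand = "-" if flag & 0x10 else "+"
    return (rname, start, end, qname, int(mapq), strand)

# Made-up record for illustration (the CIGAR is borrowed from this thread).
example = "read1\t16\tChr12\t49\t23\t11S7M1D12M6S\t*\t0\t0\tACGT\tIIII"
print(sam_line_to_bed(example))
```

Soft-clipped bases (S) and insertions (I) do not extend the interval, so a read of 36 bases can cover fewer than 36 reference positions.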

      Comment


      • Originally posted by zee View Post
        Is there a way to convert a SAM consensus output (using -c option for pileup) to the old maq-style .cns consensus?

        I have some maq-based pipelines I would like to use on my BWA results.
        Has anyone had any luck with this? Also, MAQ can dump unmapped reads into a separate file. Is there a similar function or tool in samtools? Thank you.

        Comment


        • bekkari:

          I believe ConvertToBed.jar in the Vancouver Short Read Analysis Package can do it.

          Anthony
          The more you know, the more you know you don't know. —Aristotle

          Comment


          • If you use BOWTIE as the aligner, there is an option (--un) to dump all unmapped reads into a file.

            Comment


            • Originally posted by bekkari View Post
              if you use BOWTIE as an alignment algorithm, there is an option (--un) to dump all unmapped reads into a file.

              But I'm using BWA. Is there an unmapped flag in the sam file?
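The SAM specification does define an unmapped bit in the FLAG field: 0x4 means the read itself is unmapped. As far as I know, `samtools view -f 4` selects such reads. The sketch below does the same split in plain Python, assuming tab-separated SAM text with numeric flags; `split_unmapped` is a hypothetical helper name.

```python
UNMAPPED = 0x4  # SAM FLAG bit: the read itself is unmapped

def split_unmapped(sam_lines):
    """Separate alignment lines into (mapped, unmapped) lists; headers are skipped."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line, not an alignment record
        flag = int(line.split("\t")[1])
        (unmapped if flag & UNMAPPED else mapped).append(line)
    return mapped, unmapped

# Made-up records for illustration.
records = [
    "@SQ\tSN:Chr1\tLN:1000",
    "r1\t0\tChr1\t10\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
mapped, unmapped = split_unmapped(records)
```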

              Comment


              • Partial pileup for samtools?

                samtools pileup takes ages, so does varfilter.

                Can samtools pileup work on one chromosome at a time? That would make
                it much easier to parallelize.
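One way to get per-chromosome chunks is to split the alignments by reference name before running pileup on each piece. With a sorted, indexed BAM, `samtools view` also accepts a region such as a chromosome name, which may be the simpler route. The Python sketch below only illustrates the splitting idea on SAM text; `group_by_reference` is a hypothetical helper, and holding everything in memory is an assumption that will not suit large files.

```python
from collections import defaultdict

def group_by_reference(sam_lines):
    """Return (header_lines, {reference_name: [alignment lines]})."""
    header, groups = [], defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            header.append(line)  # every chunk needs the header again
            continue
        rname = line.split("\t")[2]  # column 3 is the reference name
        groups[rname].append(line)
    return header, dict(groups)

# Made-up records for illustration.
records = [
    "@SQ\tSN:Chr11\tLN:1000",
    "r1\t0\tChr11\t10\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tChr12\t49\t23\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tChr11\t90\t37\t4M\t*\t0\t0\tACGT\tIIII",
]
header, groups = group_by_reference(records)
```

Each group, written out with the header, can then be processed as an independent job.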

                Comment




                  • properly mapped Flag

                    Perhaps I have misunderstood, but isn't the properly mapped flag (P in the string flags) supposed to be set only when both reads of a pair map to the same chromosome with the correct insert size? About 6% of my reads carrying the P flag have a mate mapped to a different chromosome with an insert size of 0, as shown below. Has anyone seen this before?

                    EBRI093151:1:90:555:299#0 pPR1 Chr11 107308221 23 36M Chr12 49 0 TATCCTATTCGAAAGTCGCCATGACCGTGGACATGA BCCBCBBACCB?CBA@BBACCBCAAB<6<?BABBBB XT:A:U NM:i:0 SM:i:23 AM:i:23 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:36

                    EBRI093151:1:90:555:299#0 pPr2 Chr12 49 23 11S7M1D12M6S Chr11 107308221 0 CTACCGCTTGGGTGGTCATGAATGATTAGCACGCCC AB99@B=BBA>ACCBCCBBCBBCBBBBCBCBB@B@A XT:A:M NM:i:4 SM:i:23 AM:i:23 XM:i:3 XO:i:1 XG:i:1 MD:Z:3A3^T2T5C3
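To count how many records show this pattern, a check along the following lines could be run over the SAM text. It is a sketch under assumptions: the letter-to-bit table reflects my understanding of the string flags printed by older samtools (p = paired, P = proper pair, r/R = read/mate on the reverse strand, 1/2 = first/second read), and both function names are made up. The record is the first one quoted above, cut after the quality string.

```python
# Assumed mapping of samtools string-flag characters to SAM FLAG bits.
FLAG_CHARS = {"p": 0x1, "P": 0x2, "u": 0x4, "U": 0x8, "r": 0x10, "R": 0x20,
              "1": 0x40, "2": 0x80, "s": 0x100, "f": 0x200, "d": 0x400}

def parse_flag(text):
    """Accept a decimal flag ("99") or a string flag ("pPR1")."""
    if text.isdigit():
        return int(text)
    value = 0
    for ch in text:
        value |= FLAG_CHARS[ch]
    return value

def proper_pair_on_other_reference(line):
    """True if the proper-pair bit is set but the mate is on another reference."""
    f = line.split()
    flag, rname, rnext = parse_flag(f[1]), f[2], f[6]
    mate_elsewhere = rnext not in ("=", "*", rname)
    return bool(flag & 0x2) and mate_elsewhere

record = ("EBRI093151:1:90:555:299#0 pPR1 Chr11 107308221 23 36M Chr12 49 0 "
          "TATCCTATTCGAAAGTCGCCATGACCGTGGACATGA BCCBCBBACCB?CBA@BBACCBCAAB<6<?BABBBB")
print(parse_flag("pPR1"), proper_pair_on_other_reference(record))
```

Under that mapping, pPR1 and pPr2 correspond to the familiar decimal flags 99 and 147.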

                    Comment


                    • Try the latest version of bwa.

                      Comment


                      • Originally posted by lh3 View Post
                        Try the latest version of bwa.
                        Heng,

                        This was done with BWA 0.5.4.

                        For a resequencing project, does it really matter if the mates are not properly mapped? Can I instead just filter out the reads with low mapping quality?

                        Thank you.
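Filtering on mapping quality is straightforward, since MAPQ is column 5 of every SAM record. As far as I know, `samtools view -q` takes a minimum MAPQ for this purpose. A minimal Python sketch of the same filter follows; the default threshold of 10 is an arbitrary assumption rather than a recommendation, and `filter_by_mapq` is a hypothetical name.

```python
def filter_by_mapq(sam_lines, min_mapq=10):
    """Yield header lines unchanged and alignments with MAPQ >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep the header so the output is still valid SAM
        elif int(line.split("\t")[4]) >= min_mapq:
            yield line

# Made-up records for illustration.
records = [
    "@SQ\tSN:Chr11\tLN:1000",
    "r1\t0\tChr11\t10\t23\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tChr11\t20\t3\t4M\t*\t0\t0\tACGT\tIIII",
]
kept = list(filter_by_mapq(records))
```

Note that a MAPQ filter and the proper-pair bit answer different questions, so dropping low-quality reads does not by itself remove pairs whose mates landed on different chromosomes.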

                        Comment


                        • Originally posted by zlu View Post
                          Perhaps I have misunderstood, but isn't the properly mapped flag (P in the string flags) supposed to be set only when both reads of a pair map to the same chromosome with the correct insert size? About 6% of my reads carrying the P flag have a mate mapped to a different chromosome with an insert size of 0, as shown below. Has anyone seen this before?

                          EBRI093151:1:90:555:299#0 pPR1 Chr11 107308221 23 36M Chr12 49 0 TATCCTATTCGAAAGTCGCCATGACCGTGGACATGA BCCBCBBACCB?CBA@BBACCBCAAB<6<?BABBBB XT:A:U NM:i:0 SM:i:23 AM:i:23 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:36

                          EBRI093151:1:90:555:299#0 pPr2 Chr12 49 23 11S7M1D12M6S Chr11 107308221 0 CTACCGCTTGGGTGGTCATGAATGATTAGCACGCCC AB99@B=BBA>ACCBCCBBCBBCBBBBCBCBB@B@A XT:A:M NM:i:4 SM:i:23 AM:i:23 XM:i:3 XO:i:1 XG:i:1 MD:Z:3A3^T2T5C3
                          I don't see anywhere in the specification how to set the "properly paired" bit. I would guess this is aligner dependent.

                          Comment


                          • varFilter output

                            Hi,

                            I have used samtools to call variants with varFilter. I imported an alignment file from BWA in SAM format, sorted it, and ran:
                            1. samtools pileup -vcf ...
                            2. samtools.pl varFilter...| awk '$6>=20' ...

                            It ran, but I am having trouble interpreting all the columns. What I think is:
                            column 1: chromosome
                            column 2: first base coordinate from the ref.
                            column 3: ref. base
                            column 4: consensus base
                            column 5: ???
                            column 6: mapping quality
                            column 7: ???
                            column 8: read depth
                            column 9: read base column
                            column 10: ???

                            Does anybody know what the values in columns 5, 7 and 10 are? I could not find this information.
                            Last edited by suseq; 11-23-2009, 01:17 AM.
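If I read the samtools pileup documentation correctly, the ten columns of the consensus format (-c) are: chromosome, 1-based position, reference base, consensus base, consensus quality, SNP quality, RMS mapping quality, read depth, read bases, and base qualities. On that reading, column 6 is the SNP quality rather than the mapping quality, so `awk '$6>=20'` filters on SNP quality. The Python sketch below names the columns under that assumption; `PileupSite` and `parse_pileup_line` are made-up names, the example line is invented, and indel lines (which carry extra columns) are simply skipped.

```python
from collections import namedtuple

# Column names follow my reading of the pileup -c format; treat as an assumption.
PileupSite = namedtuple(
    "PileupSite",
    "chrom pos ref_base consensus consensus_qual snp_qual rms_mapq depth "
    "read_bases base_quals",
)

def parse_pileup_line(line):
    """Parse a 10-column consensus pileup line; return None for other layouts."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 10 or f[2] == "*":
        return None  # indel lines have a different layout and are skipped
    return PileupSite(f[0], int(f[1]), f[2], f[3], int(f[4]), int(f[5]),
                      int(f[6]), int(f[7]), f[8], f[9])

# Invented line for illustration only.
site = parse_pileup_line("Chr11\t100\tA\tG\t30\t45\t60\t3\tGGg\tII;")
```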

                            Comment




                            • and



                              the pileup command.

                              Comment


                              • For a genome assembly project, will removing duplicates (with samtools rmdup) and filtering out reads with low mapping quality (e.g. mapQ < 10) improve my assembly? What do people usually do for QC after mapping with, e.g., bwa?

                                A slightly different issue: does it matter if two fastq files (from two Solexa runs) have identical read headers? How does samtools sort the BAM file? Does it take the header IDs into consideration?

                                Thank you.

                                Comment
