  • How do I go about converting a BCF file to a VCF file?

    I tried using BCFtools but I keep getting an error.



    1. I had 2 SAM files that I converted to BAM, sorted by chromosome, and finally indexed using Picard.




    2. Using the 2 manipulated BAM files, I used the mpileup command in SAMtools. Here is the specific command:

    samtools mpileup -f ref.fa in.bam in2.bam > in_in2_pileup.bcf




    3. After this step I wanted to get the VCF format. I used the bcftools view command to do this:

    bcftools view in_in2_pileup.bcf > in_in2_pileup.vcf


    The error that I received after executing this command:

    incorrect number of fields (0 != 5) at 0:0






    PS. I tried looking my problem up on Google, and every example seemed irrelevant to my case.

  • #2
    Every example? How about http://samtools.sourceforge.net/mpileup.shtml. One of my first hits via Google.

    However what you are probably overlooking is the proper command line option to mpileup. Either look at the above web page and/or type in 'samtools mpileup' and read the help or look at your output file. It will be good for your eyes to spot the mistake. :-)
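
One way to "look at your output file" without opening it in an editor is to peek at its first bytes. The sketch below is only a heuristic and not part of samtools: it assumes that compressed BCF is BGZF (gzip-compatible, so it starts with the bytes 1f 8b), that an uncompressed BCF stream starts with the letters BCF, and that anything else that decodes as ASCII is text such as a plain pileup. The function name is made up for illustration.

```python
def sniff_output(first_bytes):
    """Guess whether the start of a file looks like BCF or plain text."""
    if first_bytes[:2] == b"\x1f\x8b":
        return "binary"  # gzip/BGZF magic, as used by compressed BCF
    if first_bytes[:3] == b"BCF":
        return "binary"  # assumed magic of an uncompressed BCF stream
    try:
        first_bytes.decode("ascii")
    except UnicodeDecodeError:
        return "binary"  # some other non-text content
    return "text"


# A pileup line like the ones shown later in this thread is plain text.
print(sniff_output(b"chrM\t136\tA\t6\t,,,,,,\t896774\n"))
# A gzip/BGZF header is reported as binary.
print(sniff_output(b"\x1f\x8b\x08\x04"))
```

For a real file, pass it something like open("in_in2_pileup.bcf", "rb").read(64). If the answer is "text", the file is a pileup and not BCF, whatever its extension says.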


    • #3
      Ahh, I see, I see, thanks.


      • #4
        I have the same problem, but I have not "seen" it yet...

        I want to use mpileup without the -u and -g options, but none of the command lines I tried worked.


        • #5
          Can you share the command line with us?


          • #6
            Code:
            samtools mpileup -f .fas .sorted.bam > .bcf | bcftools view > .vcf&

            Just now I have used

            Code:
            samtools mpileup -f .fas .sorted.bam | tee teeoutput.bcf&
            Maybe this "artefact" works..


            • #7
              I just replied on Biostars, but this won't work: bcftools expects BCF, and you're giving it plain mpileup output, which is text.
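
To make the BCF/text distinction concrete, here is a minimal sketch that only builds the two command lines from the first post, with -g added so that mpileup writes BCF instead of a text pileup. It assumes an older samtools release in which mpileup still accepts -g (newer releases moved this functionality to bcftools), and the helper name and file names are just for illustration. Nothing is executed.

```python
import shlex


def bcf_then_vcf_commands(ref, bams, bcf_path, vcf_path):
    """Return the two shell command lines for BAM -> BCF -> VCF."""
    # -g asks mpileup for BCF output instead of the plain-text pileup
    # (assumption: a samtools version whose mpileup still supports -g).
    mpileup = ["samtools", "mpileup", "-g", "-f", ref, *bams]
    # bcftools view reads the BCF and prints VCF text.
    view = ["bcftools", "view", bcf_path]
    return (
        f"{shlex.join(mpileup)} > {shlex.quote(bcf_path)}",
        f"{shlex.join(view)} > {shlex.quote(vcf_path)}",
    )


cmds = bcf_then_vcf_commands(
    "ref.fa", ["in.bam", "in2.bam"], "in_in2_pileup.bcf", "in_in2_pileup.vcf"
)
for cmd in cmds:
    print(cmd)
```

Without -g or -u the first command writes a text pileup, which is what bcftools then fails to parse.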


              • #8
                So, are you telling me that the output from
                Code:
                samtools mpileup -f .fas .sorted.bam
                is a .txt?

                .. then if I use..

                Code:
                samtools mpileup -f .fas .sorted.bam | tee teeoutput.txt&
                ...I'll get the table that I pine for!!


                • #9
                  Originally posted by Giffredo
                  .. then if I use..

                  Code:
                  samtools mpileup -f .fas .sorted.bam | tee teeoutput.txt&
                  ...I'll get the table that I pine for!!
                  Well, it depends on what sort of table you're after. If you just want the pileup and not variant calls then:

                  Code:
                  samtools mpileup -f .fas .sorted.bam > output.txt
                  would seem to do what you want. There'd be no need to pipe things to tee in that case. Also, there's no reason to always end commands with "&", particularly if they'll just be printing stuff to the screen.


                  • #10
                    Ok thanks!!! I will try.
                    About the &, I know.. I add it out of force of habit...


                    • #11
                      chrM 136 A 6 ,,,,,, 896774
                      chrM 137 A 6 ,,,,,, ?=@===
                      chrM 138 A 6 ,,,,,, ?=<=8<
                      chrM 139 C 6 ,,,,,, 887801
                      chrM 140 C 6 ,,,,,, @>==9;
                      chrM 141 C 6 ,,,,,, ><==9;
                      chrM 142 T 6 ,,,,,, 747701

                      The manual says: each line represents a genomic position, consisting of chromosome name, coordinate, reference base, read bases, read qualities and alignment mapping qualities. Information on match, mismatch, indel, strand, mapping quality and start and end of a read are all encoded at the read base column.

                      It is not what I expected... this txt is not useful at all.


                      • #12
                        That text file is exactly the output that the manual is describing, so I'm not sure what else you were expecting.


                        • #13
                          I think it is not...
                          I expected something like this, but with the right symbols in the right positions, so that I could translate and understand them!
                          Where are the read bases? And the quality values? Why so many ',' if there is only one position?
                          And..
                          what does @ mean? It is not in the legend.. To me it looks like the characters are mismatched because the file was converted wrongly...

                          So the question is: with mpileup, is it possible to get other types of results, different from the ones I have got so far?


                          • #14
                            You need to read the manual a bit more. "," is a base call, it just means "the same as the reference, on the reverse orientation". The base quality scores are in the last column, they're phred encoded. "@", for example, is 31.
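
A minimal sketch of that decoding, assuming the usual offset of 33 (each quality character is the Phred score plus 33, written as an ASCII character). The Phred score itself is the logarithmic number Q = -10 * log10(P), where P is the estimated error probability of the base call. The function names are made up for illustration.

```python
def decode_phred(quals, offset=33):
    """Turn a pileup quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quals]


def error_probability(q):
    """Error probability implied by a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


print(decode_phred("@"))       # [31]
print(decode_phred("?=@==="))  # [30, 28, 31, 28, 28, 28]
```

So the last column of "chrM 137 A 6 ,,,,,, ?=@===" holds six quality scores, one for each of the six base calls in the column before it.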

                            Regarding mpileup vs. pileup, the output is the same. The only difference is that mpileup can deal with multiple files at once (it just tacks on an extra 3 columns per file).
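
The "3 columns per file" layout can be sketched as a small parser. This assumes the default mpileup columns (no extra per-read columns requested) and splits on any whitespace so that lines pasted from this thread work too; real output is tab-separated. The two-file line in the example is made up for illustration.

```python
def parse_pileup_line(line):
    """Split one pileup line into position info plus one record per input file."""
    fields = line.split()
    chrom, pos, ref = fields[0], int(fields[1]), fields[2]
    rest = fields[3:]
    if len(rest) % 3 != 0:
        raise ValueError("expected 3 columns (depth, bases, quals) per file")
    samples = [
        {"depth": int(rest[i]), "bases": rest[i + 1], "quals": rest[i + 2]}
        for i in range(0, len(rest), 3)
    ]
    return chrom, pos, ref, samples


# One input file: a line copied from this thread.
one = parse_pileup_line("chrM 136 A 6 ,,,,,, 896774")
# Two input files: a hypothetical line with a second block of 3 columns.
two = parse_pileup_line("chrM 136 A 6 ,,,,,, 896774 2 ., ?=")
print(len(one[3]), len(two[3]))
```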

                            Perhaps it would help if you mentioned what your actual goal is.


                            • #15
                              "," is a base call, it just means "the same as the reference, on the reverse orientation". The base quality scores are in the last column, they're phred encoded. "@", for example, is 31.
                              OK.. but I still have some doubts: I know "," is a base call, and for this reason I don't understand why I have more than one "," for only one base at one position...

                              ..and what is the Phred code? I know the ASCII code, but for me Phred is only a number derived from a logarithmic operation...

                              My goal is to measure the positions of editing sites, splicing, and post-transcriptional modifications in general in the mRNA from my mutant sample. And I want to do statistical work on these variations.
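
On the doubt about several "," at a single position: the read bases column holds one symbol per read covering that position, so six "," with a depth of 6 means six reads, all matching the reference on the reverse strand. A rough sketch of tallying those symbols per position, which is the kind of count a per-site statistic could start from, is below. It handles only the classic symbols (".", ",", mismatching letters, "*", "^" plus a mapping-quality character, "$", and +/- indels) and ignores anything else; the function name is made up for illustration.

```python
import re


def count_read_bases(bases):
    """Tally matches, mismatches and deletions in a pileup read-bases string."""
    counts = {"fwd_match": 0, "rev_match": 0, "mismatch": 0, "deletion": 0}
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            i += 2  # read-start marker plus its mapping-quality character
            continue
        if ch in "+-":
            m = re.match(r"\d+", bases[i + 1:])
            if m:
                # skip the sign, the length digits and the indel sequence
                i += 1 + len(m.group()) + int(m.group())
                continue
        if ch == ".":
            counts["fwd_match"] += 1
        elif ch == ",":
            counts["rev_match"] += 1
        elif ch == "*":
            counts["deletion"] += 1
        elif ch in "ACGTNacgtn":
            counts["mismatch"] += 1
        i += 1  # "$" (read end) and unknown symbols are ignored
    return counts


print(count_read_bases(",,,,,,"))
```

For the lines posted above, every position gives six reverse-strand matches and no mismatches, so none of them would show up as a candidate editing site.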
