  • Varscan header doesn't match the number of columns in the data

    I'm using VarScan 2.3.3 with the somatic subcommand and noticed that the header of the output file has 19 column names, but each data row below it has 23 columns. The documentation also lists only 19 columns. Can anyone shed some light on what the 4 additional columns at the end of each row are?

    $ head -2 1.snp
    chrom position ref var normal_reads1 normal_reads2 normal_var_freq normal_gt tumor_reads1 tumor_reads2 tumor_var_freq tumor_gt somatic_status variant_p_value somatic_p_value tumor_reads1_plus tumor_reads1_minus tumor_reads2_plus tumor_reads2_minus
    chr1 10469 C G 43 2 4.44% C 36 11 23.4% S Somatic 1.0 0.008838714792838208 20 16 3 8 32 11 0 2

    $ awk -F '\t' '{print NF}' 1.snp | less
    19
    23
    23
    23
    23
    23
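The same check can be sketched in Python, using only the header and the one data row pasted above. The helper name `split_row` is made up for illustration, and the text is split on whitespace because the paste is space-separated (the real file is tab-delimited). The last part tests a guess: that the 4 unnamed values are per-strand counts for the normal sample, mirroring the named tumor strand columns. That guess is only an inference from this single row and is not confirmed anywhere in this thread.

```python
# Header and first data row copied from the `head -2 1.snp` output above.
# Split on whitespace because the pasted text is space-separated; the actual
# VarScan output file is tab-delimited.
HEADER = (
    "chrom position ref var normal_reads1 normal_reads2 normal_var_freq "
    "normal_gt tumor_reads1 tumor_reads2 tumor_var_freq tumor_gt "
    "somatic_status variant_p_value somatic_p_value tumor_reads1_plus "
    "tumor_reads1_minus tumor_reads2_plus tumor_reads2_minus"
).split()
ROW = (
    "chr1 10469 C G 43 2 4.44% C 36 11 23.4% S Somatic 1.0 "
    "0.008838714792838208 20 16 3 8 32 11 0 2"
).split()


def split_row(header, row):
    """Hypothetical helper: pair values with header names, return the leftovers."""
    named = dict(zip(header, row))   # zip stops at the shorter sequence
    extras = row[len(header):]       # values that have no column name
    return named, extras


named, extras = split_row(HEADER, ROW)
print(f"{len(HEADER)} header names, {len(ROW)} values, unnamed: {extras}")

# Sanity check on the named columns: tumor strand counts add up to tumor reads.
tumor_ok = (
    int(named["tumor_reads1_plus"]) + int(named["tumor_reads1_minus"])
    == int(named["tumor_reads1"])
    and int(named["tumor_reads2_plus"]) + int(named["tumor_reads2_minus"])
    == int(named["tumor_reads2"])
)

# ASSUMPTION (not confirmed in this thread): the 4 unnamed values are
# normal_reads1 plus/minus followed by normal_reads2 plus/minus.
e = [int(x) for x in extras]
normal_match = (
    e[0] + e[1] == int(named["normal_reads1"])
    and e[2] + e[3] == int(named["normal_reads2"])
)
print("tumor strand counts consistent:", tumor_ok)
print("extras consistent with normal strand counts:", normal_match)
```

For this row both checks pass, which is consistent with the guess but does not prove it; running the same check over every row of the file would be a stronger test.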

  • #2
    I think this thread concerns the same question
    //A

    • #3
      Thanks, I didn't catch that thread during my search. Much appreciated!

      • #4
        Thanks for the question, rdeboja, and to adrianl for the answer. I will make sure the output is properly headered in the next release of VarScan!
