  • absolute k-mer coverage explained (Abyss)

    The (very limited) ABySS manual explains the FASTA headers in the contig assembly output file like this:


    >n iii jjjj
    Where n is the numeric contig ID, iii is the contig length in nucleotides, and jjjj is the absolute k-mer coverage.
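As a rough sketch, a header in this format could be read in Python as shown below. The names `ContigHeader`, `parse_abyss_header` and `mean_kmer_coverage`, and the example values, are made up for illustration. The averaging step assumes that the absolute k-mer coverage is the sum of the coverage of every k-mer in the contig, so that dividing by the number of k-mers (iii - k + 1) gives a mean; that is an assumption to check against the ABySS documentation, not something the manual excerpt above states.

```python
from typing import NamedTuple


class ContigHeader(NamedTuple):
    contig_id: str
    length: int          # iii: contig length in nucleotides
    kmer_coverage: int   # jjjj: absolute k-mer coverage


def parse_abyss_header(line: str) -> ContigHeader:
    """Parse a header of the form '>n iii jjjj' (hypothetical helper)."""
    fields = line.lstrip(">").split()
    return ContigHeader(fields[0], int(fields[1]), int(fields[2]))


def mean_kmer_coverage(header: ContigHeader, k: int) -> float:
    """Assumption: jjjj is summed over all (iii - k + 1) k-mers of the contig."""
    n_kmers = header.length - k + 1
    if n_kmers <= 0:
        raise ValueError("contig is shorter than k")
    return header.kmer_coverage / n_kmers


# Made-up example header; k must be the value used for the assembly.
header = parse_abyss_header(">12 1000 19420")
print(header.contig_id, header.length, header.kmer_coverage)
print(mean_kmer_coverage(header, k=30))
```

Note that this gives an average over the whole contig, not a per-base read count.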

    I don't understand what "absolute k-mer coverage" means. Can someone explain? What I would really like to know is how many reads cover each base pair in a contig. Is there a way to get this from the k-mer coverage?

    THANKS

  • #2
    Hi harrb,

    Have a look at this older thread about the header:
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


    About the coverage (per position on the contig, I think you mean?): you can probably run ABySS with the option

    --coverage-hist=FILE

    This option (as stated at http://seqanswers.com/wiki/ABySS) records the k-mer coverage histogram in FILE. I'm not 100% sure, though, since I cannot test it at the moment.
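If you do get such a histogram file, a minimal Python sketch for summarizing it might look like this. It assumes the file has two whitespace-separated columns per line (k-mer coverage, then the number of k-mers with that coverage); I have not verified that layout, and the function names and numbers are invented for the example.

```python
from typing import Dict


def read_coverage_hist(text: str) -> Dict[int, int]:
    """Map k-mer coverage -> number of k-mers (assumed two-column layout)."""
    hist: Dict[int, int] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        coverage, count = line.split()[:2]
        hist[int(coverage)] = int(count)
    return hist


def weighted_mean_coverage(hist: Dict[int, int]) -> float:
    """Mean k-mer coverage, weighting each coverage value by its k-mer count."""
    total = sum(hist.values())
    return sum(cov * n for cov, n in hist.items()) / total


# Made-up histogram contents standing in for FILE.
example_hist = "1\t500\n2\t100\n20\t300\n21\t100\n"
hist = read_coverage_hist(example_hist)
print(weighted_mean_coverage(hist))
```

As far as I can tell, a histogram like this summarizes the whole data set, so on its own it would not say how many reads cover a particular position in a contig.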

    Otherwise, you can map your reads to the contigs (Bowtie, BWA), producing a .sam file. Convert the .sam file to a sorted .bam file and run:

    samtools pileup -f your_ref.fa your.bam
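Newer samtools releases replaced `pileup` with `mpileup`, and there is also a `samtools depth` subcommand that prints a tab-separated table of reference name, 1-based position and read depth. The sketch below is not part of the command above; it assumes you have saved that three-column output as text, and the function names and sample values are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def read_depth_table(text: str) -> Dict[str, List[Tuple[int, int]]]:
    """Group (position, depth) pairs by contig from 3-column depth output."""
    per_contig: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        contig, pos, depth = line.split("\t")[:3]
        per_contig[contig].append((int(pos), int(depth)))
    return dict(per_contig)


def mean_depth(rows: List[Tuple[int, int]]) -> float:
    """Mean depth over the listed positions only.

    Positions with zero depth may be missing from the table unless they
    were explicitly requested, so this can overestimate the contig mean.
    """
    return sum(depth for _, depth in rows) / len(rows)


# Made-up depth output: contig ID, position, number of reads covering it.
example = "12\t1\t3\n12\t2\t5\n12\t3\t4\n13\t1\t10\n"
table = read_depth_table(example)
for contig, rows in table.items():
    print(contig, mean_depth(rows))
```

Each row of the table answers the original question directly: it gives the number of mapped reads covering one position of one contig.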

    Have a look at this thread:

    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


    Good luck,
    Boetsie
