Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Find SNPs in related strains

    Hello everyone,

    I am extremely new at bioinformatics, genome sequencing and working with the output data, so please excuse any naive questions (I also just leanred working in Linux for samtools/bcftools).
    Our lab has recently sequenced the genome of a laboratory strain from which the type strain genome is known. The genome was sequenced using illumina and output was already processed for us using the DRAGEN pipeline.
    I have received all output from the sequencing, including .bam and .vcf files. I am starting to figure out what these files are, what kind of information they contain and how to work with them (yes, I am still at this level, sorry )

    Our end goal here is to first of all have a complete consensus sequence of the genome of our lab strain. Secondly, we would like to identify SNPs and identify their position compared to the annotated genome of our reference strain.

    I have already been able to use IGV, input the genome of our reference strain and import the vcf file to find the SNPs. I know there are 60 SNPs/indels. Is there some "easy" automated way to get a list of all variations without me having to scroll through the IGV and going over them one by one?
    I also tried using bcftools to get a consensus sequence using the a reference .fasta and the .bam file from the sequencing, but I get a sequence that is much smaller than my genome. I followed this guide: http://samtools.github.io/bcftools/h...-sequence.html

    Is there an easy basic guide that could first of all explain the file formats, where they come from and how they are connected to eachother? I think understanding this would get me started using samtools/bcftools more easily, since its tutorials assume knowledge about these things. Other nice information sources concerning my problems and goals are always welcome.

Latest Articles

Collapse

  • seqadmin
    Current Approaches to Protein Sequencing
    by seqadmin


    Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
    04-04-2024, 04:25 PM
  • seqadmin
    Strategies for Sequencing Challenging Samples
    by seqadmin


    Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
    03-22-2024, 06:39 AM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, 04-11-2024, 12:08 PM
0 responses
18 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 10:19 PM
0 responses
22 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 09:21 AM
0 responses
16 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-04-2024, 09:00 AM
0 responses
47 views
0 likes
Last Post seqadmin  
Working...
X