10-09-2019, 09:45 AM   #1
Junior Member
Location: Belgium

Join Date: Oct 2019
Posts: 1
Find SNPs in related strains

Hello everyone,

I am extremely new to bioinformatics, genome sequencing and working with the output data, so please excuse any naive questions (I also only just learned to work in Linux for samtools/bcftools).
Our lab has recently sequenced the genome of a laboratory strain for which the type strain genome is known. The genome was sequenced using Illumina, and the output was already processed for us with the DRAGEN pipeline.
I have received all output from the sequencing, including .bam and .vcf files. I am starting to figure out what these files are, what kind of information they contain and how to work with them (yes, I am still at this level, sorry).

Our end goal here is to first of all have a complete consensus sequence of the genome of our lab strain. Secondly, we would like to identify SNPs and identify their position compared to the annotated genome of our reference strain.

I have already been able to use IGV: I loaded the genome of our reference strain and imported the vcf file to find the SNPs. I know there are 60 SNPs/indels. Is there some "easy" automated way to get a list of all variants without having to scroll through IGV and go over them one by one?
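To make the kind of list I mean concrete: since a .vcf is a tab-separated text file with one variant per line, something like the sketch below might already be enough. This is only a minimal sketch using the Python standard library, not a tested workflow. The function name list_variants and the two example records are made up for illustration, and it assumes an uncompressed VCF with the standard column order (CHROM, POS, ID, REF, ALT, ...).

```python
import io


def list_variants(vcf_lines):
    """Yield (chrom, pos, ref, alt, kind) for every record in VCF text lines."""
    for line in vcf_lines:
        # Header lines start with '#'; skip those and any blank lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # ALT may hold several comma-separated alleles; "." means no alternate allele.
        for alt in alts.split(","):
            if alt == ".":
                continue
            is_snp = len(ref) == 1 and len(alt) == 1 and alt in "ACGT"
            yield chrom, pos, ref, alt, "SNP" if is_snp else "indel/other"


# Hypothetical two-record VCF, only here so the sketch runs on its own.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t101\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t250\t.\tCT\tC\t60\tPASS\t.\n"
)

if __name__ == "__main__":
    for record in list_variants(io.StringIO(EXAMPLE_VCF)):
        print(*record, sep="\t")
```

For a real file, the same function could be fed an open file handle instead of the example text (for a gzipped .vcf.gz, gzip.open(path, "rt") from the standard library gives a text handle).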
I also tried using bcftools to get a consensus sequence from a reference .fasta and the .bam file from the sequencing, but the sequence I get is much shorter than my genome. I followed this guide:

Is there an easy, basic guide that explains the file formats, where they come from and how they are connected to each other? I think understanding this would help me get started with samtools/bcftools, since their tutorials assume knowledge of these things. Other good sources of information on my problems and goals are always welcome.
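On the consensus question above: as far as I understand it, bcftools consensus builds the sequence by applying the variants from a VCF to the reference FASTA, rather than reading the .bam directly. The sketch below shows that idea in plain Python so the relationship between the files is clearer. It is an illustration under assumptions, not what bcftools does internally: apply_variants is a made-up name, the reference is a single sequence held in memory, variants do not overlap, and only one ALT allele per position is applied.

```python
def apply_variants(ref_seq, variants):
    """Return ref_seq with each (pos, ref, alt) applied; pos is 1-based as in VCF."""
    seq = ref_seq
    # Work from the highest position down so that indels applied earlier
    # do not shift the coordinates of variants still to be applied.
    for pos, ref, alt in sorted(variants, reverse=True):
        start = pos - 1
        # The REF allele in the VCF should match the reference at that position.
        if seq[start:start + len(ref)] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        seq = seq[:start] + alt + seq[start + len(ref):]
    return seq


if __name__ == "__main__":
    # Toy reference with one SNP (C>T at position 2) and one deletion (AC>A at position 5).
    print(apply_variants("ACGTACGT", [(2, "C", "T"), (5, "AC", "A")]))
```

With only a few dozen SNPs/indels, a consensus built this way should be almost exactly as long as the reference, so a much shorter result would suggest the reference or variant input is not what the tool expects.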
dsybers
All times are GMT -8.