#1
Junior Member
Location: Belgium
Join Date: Oct 2019
Posts: 1
Hello everyone,
I am extremely new to bioinformatics, genome sequencing and working with the output data, so please excuse any naive questions (I also just learned to work in Linux for samtools/bcftools).

Our lab has recently sequenced the genome of a laboratory strain for which the type strain genome is known. The genome was sequenced using Illumina, and the output was already processed for us with the DRAGEN pipeline. I have received all output from the sequencing, including .bam and .vcf files. I am starting to figure out what these files are, what kind of information they contain and how to work with them (yes, I am still at this level, sorry).

Our end goal is, first of all, to have a complete consensus sequence of the genome of our lab strain. Secondly, we would like to identify SNPs and determine their positions relative to the annotated genome of our reference strain.

I have already been able to use IGV: I loaded the genome of our reference strain and imported the .vcf file to find the SNPs. I know there are 60 SNPs/indels. Is there some "easy" automated way to get a list of all variants without having to scroll through IGV and go over them one by one?

I also tried using bcftools to get a consensus sequence from a reference .fasta and the .bam file from the sequencing, but I get a sequence that is much smaller than my genome. I followed this guide: http://samtools.github.io/bcftools/h...-sequence.html

Is there an easy, basic guide that explains the file formats, where they come from and how they are connected to each other? I think understanding this would help me get started with samtools/bcftools, since their tutorials assume knowledge of these things. Other good sources of information on my problems and goals are always welcome.
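To make the "list of all variants" question concrete, here is a minimal sketch in Python of what such a listing could look like. It assumes a plain-text, tab-separated VCF with the standard column order (CHROM, POS, ID, REF, ALT, ...). The function name `list_variants` and the example records are hypothetical and are not taken from the DRAGEN output.

```python
def list_variants(vcf_lines):
    """Return (chrom, pos, ref, alt, kind) tuples from VCF text lines.

    Assumes the standard VCF column order: CHROM, POS, ID, REF, ALT, ...
    """
    variants = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta lines ("##...") and the header ("#CHROM...").
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, _id, ref, alt_field = fields[:5]
        # A record may list several alternate alleles, separated by commas.
        for alt in alt_field.split(","):
            # Skip missing, spanning-deletion and symbolic alleles.
            if alt in (".", "*") or alt.startswith("<"):
                continue
            if len(ref) == 1 and len(alt) == 1:
                kind = "SNP"
            elif len(ref) == len(alt):
                kind = "MNP"
            else:
                kind = "indel"
            variants.append((chrom, int(pos), ref, alt, kind))
    return variants


# Hypothetical example records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t101\t.\tA\tG\t50\tPASS\t.\n",
    "chr1\t250\t.\tAT\tA\t60\tPASS\t.\n",
]

for chrom, pos, ref, alt, kind in list_variants(example):
    print(chrom, pos, ref, alt, kind, sep="\t")
```

To run this on a real file, the lines could be read with `open(path)`, or with `gzip.open(path, "rt")` if the file is a compressed `.vcf.gz`; `path` here is a placeholder for your own file name.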