**gringer** · 01-12-2014, 07:57 PM

I've filed a bug for this in github:

Samtools mpileup doesn't report all bases when reference fasta file is specified · Issue #112 · samtools/samtools

https://github.com/samtools/samtools/issues/112

Reporting on behalf of seqanswers user xmubingo, using VCF utils to generate a consensus assembly doesn't work properly because mpileup is not reporting all bases. The generated sequence from mpile...

In the meantime, you'll either have to use my script with the reference fasta file (in which case INDELs will change the length of the generated sequence), or add a dummy SAM line for each sequence in the reference fasta file that covers the entire range of the sequence.

**xmubingo** · 01-13-2014, 11:16 PM

Hi gringer, Thanks a lot!! I will try your code.

**xmubingo** · 01-15-2014, 11:53 PM

Hi gringer, I think I've found a way to solve this problem. The consensus sequence (cns.fa -> cns.fa) generated by vcf2fq doesn't have the same length as the reference sequence, but I found that the sequence in cns.fa is just missing some 'n' characters at its tail. So we can append 'n' characters to each sequence to make its length equal to the original length. For example:

cns.fa

Code:

>seq1
nnnnnnnnnnnnnnnAAAATTTTTCCCCGGGGgggccccGGTTTg

cns.fa.fixed

Code:

>seq1
nnnnnnnnnnnnnnnAAAATTTTTCCCCGGGGgggccccGGTTTgnnnnnnnnnnnnn
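The padding step above could be sketched in Python roughly as follows. This is only an illustration, not a tested pipeline step: the function name `pad_to_reference_length` and the reference length used in the example are made up, and it assumes the missing bases are all at the tail (which would not hold if indels were applied to the consensus).

```python
def pad_to_reference_length(consensus, reference_length, pad_char="n"):
    """Append pad_char to the consensus until it is reference_length long.

    Hypothetical helper: assumes the consensus only lacks bases at its tail.
    """
    missing = reference_length - len(consensus)
    if missing < 0:
        # A longer consensus suggests insertions, so tail padding is not the fix.
        raise ValueError("consensus is longer than the reference")
    return consensus + pad_char * missing


# Sequence taken from the cns.fa example; the reference length is an assumption.
cns = "nnnnnnnnnnnnnnnAAAATTTTTCCCCGGGGgggccccGGTTTg"
assumed_reference_length = len(cns) + 13
fixed = pad_to_reference_length(cns, assumed_reference_length)
print(len(cns), len(fixed))
```

In practice the reference length for each sequence would have to come from the reference fasta itself (for example from its .fai index), which is left out here.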

**jiewencai** · 12-09-2014, 07:03 PM

It's amazing

Originally posted by gringer View Post

I have a slightly different processing script, but the command you have seems to work for me on my mitochondrial data (i.e. it produces a fastq file at the end of it).

I've attached my modified vcf2fq script to this post [also the fairly trivial fastq2fasta], maybe it will help to diagnose the problem.

The original vcf2fq can't produce sequences containing indels, which has really bothered me, but this one can. That's amazing.

**Yuhuan Meng** · 11-17-2015, 05:43 PM

I have a problem: if the new genome.fa contains indels, the positions will shift. If I then call genes from the new genome using the GTF, the called genes will not cover the same regions as the original genes in the original genome.

**gringer** · 11-18-2015, 01:03 AM

... so don't use a non-reference genome for defining the location of variants. Always link back to the reference genome when defining positions.
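To illustrate why positions drift, here is a minimal Python sketch of how indels upstream of a position change its coordinate in the new sequence. The function name and the `(position, length_change)` tuple format are hypothetical, invented for this example; they are not taken from vcf2fq or any samtools tool.

```python
def reference_to_new_position(ref_pos, indels):
    """Map a 1-based reference position onto the indel-containing sequence.

    indels: list of (ref_pos, length_change) tuples (hypothetical format),
    positive for inserted bases, negative for deleted bases, anchored at
    the reference position immediately before the change.
    Positions that fall inside a deletion are not handled here.
    """
    # Only indels strictly upstream of the position shift its coordinate.
    shift = sum(change for pos, change in indels if pos < ref_pos)
    return ref_pos + shift


# Assumed example: a 3 bp insertion after position 100, a 2 bp deletion after 250.
example_indels = [(100, 3), (250, -2)]
for position in (50, 200, 300):
    print(position, reference_to_new_position(position, example_indels))
```

Every feature downstream of an indel moves by the accumulated length change, which is why a GTF written against the reference no longer lines up with the new genome.fa.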
