Hi,
I'm currently trying to get a consensus sequence that is specific to individual humans, from the 1000 genomes project data available in BAM format at ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/
I want to do multiple sequence alignments of certain loci between individuals, so I need the individual sequence in FASTA format. Currently, I've tried to do samtools pileup -c, followed by conversion to fastq (samtools) and then to fasta (maq tools):
samtools pileup -c -f /gcm/opt/genomes/hg18-chromosomes.fa ${file}.bam > ${file}.pileup
/opt/samtools/samtools-0.1.7a/misc/samtools.pl pileup2fq ${file}.pileup > ${file}.fastq
/opt/maq/maq-0.7.1/bin/fq_all2std.pl fq2fa ${file}.fastq > ${file}.fasta
The FASTQ generated from consensus pileup seems ok, although the whole sequence is on a single line. Using MAQ's fq2fa, however, this is converted into a much smaller FASTA file, with quality score data instead of sequence in there.
So, any ideas what I may be doing wrong or alternative tool suggestions to accomplish this task, in case the converters are buggy, would be much appreciated.
Thanks!
I'm currently trying to get a consensus sequence that is specific to individual humans, from the 1000 genomes project data available in BAM format at ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/
I want to do multiple sequence alignments of certain loci between individuals, so I need the individual sequence in FASTA format. Currently, I've tried to do samtools pileup -c, followed by conversion to fastq (samtools) and then to fasta (maq tools):
samtools pileup -c -f /gcm/opt/genomes/hg18-chromosomes.fa ${file}.bam > ${file}.pileup
/opt/samtools/samtools-0.1.7a/misc/samtools.pl pileup2fq ${file}.pileup > ${file}.fastq
/opt/maq/maq-0.7.1/bin/fq_all2std.pl fq2fa ${file}.fastq > ${file}.fasta
The FASTQ generated from consensus pileup seems ok, although the whole sequence is on a single line. Using MAQ's fq2fa, however, this is converted into a much smaller FASTA file, with quality score data instead of sequence in there.
So, any ideas what I may be doing wrong or alternative tool suggestions to accomplish this task, in case the converters are buggy, would be much appreciated.
Thanks!
Comment