10-03-2015, 10:04 AM   #5
gauravdube
Junior Member
 
Location: India

Join Date: Feb 2014
Posts: 7

Hi av_d,

My consensus FASTQ file contains a lot of non-ATGC characters (you are getting them in your file too; see the 'W' in your FASTQ?). What are these characters, and how should I handle them?

Commands used:
bwa index ref.fa
bwa aln -t 9 cocsa_ref.fa D2_R2.fastq -f D2_R2.sai && bwa aln -t 9 cocsa_ref.fa D2_R1.fastq -f D2_R1.sai
bwa sampe ref.fa D2_R1.sai D2_R2.sai D2_R1.fq D2_R2.fq > D2-aln-pe2.sam
samtools faidx cocsa_ref.fa
samtools view -bt ref.fa.fai D2-aln-pe2.sam > D2-aln-pe2.bam
samtools sort D2-aln-pe2.bam D2-aln-pe2.bam.srt
samtools index D2-aln-pe2.bam.srt.bam
samtools mpileup -uf ref.fa D2-aln-pe2.bam.srt.bam | bcftools view -cg - | vcfutils.pl vcf2fq > CONSENSUS.fq


The CONSENSUS.fq file looks like this:
@scaffold_1
nnngtttggtggtagtattggtatttcaaacacgctaggtgtttgttggttttgagtagg
tgtagctggagtagactctatctccatttctctatcagtttgggcctctggccctaggct
ctcctgtctgttttcttgagtatttactacaatagtatcactgtctggcggcattttatt
actaagctcttttcttagtaagcaactagatggtctgtgtgtttttgttttcgtgagtga
gacgtgttcagattagctactttaccagcttctagctctatagcgcgtgggctgcacgag
ttggcactagttgtaatcgatttcttgggatggatttgtatataattcgctaaaattaca
cctattctgaaaaactcgnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnTAATGTTACAAGTAAYAAGAAGGATYCTYTCCTTRACAAATRACGAGATGGC
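
To show exactly which characters are involved, here is a rough sketch that tallies every character in the sequence lines of a multi-line FASTQ like the one above, keeping upper and lower case separate. The function name count_bases is made up for illustration and is not part of bwa, samtools or bcftools. It assumes each record is an "@name" line, sequence lines, a "+" line, and then quality lines of the same total length as the sequence.

```shell
# count_bases: hypothetical helper, not a samtools/bcftools command.
# Assumes multi-line FASTQ records: "@name", sequence lines, "+",
# then quality lines totalling the same length as the sequence.
count_bases() {
  awk '
    state == 0 && /^@/ { state = 1; seqlen = 0; next }   # record header
    state == 1 && /^\+/ {                                 # separator line
      qlen = 0
      state = (seqlen > 0) ? 2 : 0
      next
    }
    state == 1 {                                          # sequence line
      seqlen += length($0)
      for (i = 1; i <= length($0); i++) count[substr($0, i, 1)]++
      next
    }
    state == 2 {                                          # quality line
      qlen += length($0)
      if (qlen >= seqlen) state = 0                       # record finished
    }
    END { for (ch in count) print ch, count[ch] }
  ' "$1" | LC_ALL=C sort
}

# Usage: count_bases CONSENSUS.fq
```

Each output line is a character followed by its count, so the Y, R, W, lowercase and n/N totals can be read off directly. Quality lines are skipped by length rather than by their first character, because a quality string can itself start with '@' or '+'.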

Please also advise how to handle the lowercase characters and the 'N's.

Thanks in advance.