SEQanswers

10-03-2015, 09:11 AM   #1
gauravdube
Junior Member
 
Location: India

Join Date: Feb 2014
Posts: 7
Non-ATGC characters, lowercase characters and lots of 'N's in a FASTQ file's sequence

Hi All,

My FASTQ file contains a lot of non-ATGC characters. What are these characters, and how should I handle them?

Commands used:
bwa index ref.fa
bwa aln -t 9 ref.fa D2_R2.fastq -f D2_R2.sai && bwa aln -t 9 ref.fa D2_R1.fastq -f D2_R1.sai
bwa sampe ref.fa D2_R1.sai D2_R2.sai D2_R1.fq D2_R2.fq > D2-aln-pe2.sam
samtools faidx ref.fa
samtools view -bt ref.fa.fai D2-aln-pe2.sam > D2-aln-pe2.bam
samtools sort D2-aln-pe2.bam D2-aln-pe2.bam.srt
samtools index D2-aln-pe2.bam.srt.bam
samtools mpileup -uf ref.fa D2-aln-pe2.bam.srt.bam | bcftools view -cg - | vcfutils.pl vcf2fq > CONSENSUS.fq


The CONSENSUS.fq file looks like this:
@scaffold_1
nnngtttggtggtagtattggtatttcaaacacgctaggtgtttgttggttttgagtagg
tgtagctggagtagactctatctccatttctctatcagtttgggcctctggccctaggct
ctcctgtctgttttcttgagtatttactacaatagtatcactgtctggcggcattttatt
actaagctcttttcttagtaagcaactagatggtctgtgtgtttttgttttcgtgagtga
gacgtgttcagattagctactttaccagcttctagctctatagcgcgtgggctgcacgag
ttggcactagttgtaatcgatttcttgggatggatttgtatataattcgctaaaattaca
cctattctgaaaaactcgnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnTAATGTTACAAGTAAYAAGAAGGATYCTYTCCTTRACAAATRACGAGATGGC

Also, how should I handle the lowercase characters and the 'N's?

Thanks in advance.
10-03-2015, 09:53 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Cross-posted
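
For anyone landing on this thread: the Y and R letters in the sample output are standard IUPAC ambiguity codes (Y = C or T, R = A or G), and N means any base. My understanding, which is an assumption worth checking against the vcfutils.pl source, is that vcf2fq writes such codes at heterozygous calls, lowercase letters at positions that fail its depth or quality filters, and n where there is no usable coverage. A minimal Python sketch for tallying these categories in a consensus sequence follows; the function name classify_bases and the category labels are made up for illustration.

```python
from collections import Counter

# Standard IUPAC nucleotide ambiguity codes (N is handled separately).
IUPAC_AMBIGUOUS = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}


def classify_bases(seq):
    """Count N, IUPAC-ambiguous, lowercase and uppercase ACGT characters."""
    counts = Counter()
    for ch in seq:
        up = ch.upper()
        if up == "N":
            counts["N"] += 1
        elif up in IUPAC_AMBIGUOUS:
            counts["ambiguous"] += 1
        elif up in "ACGT":
            key = "lowercase_acgt" if ch.islower() else "uppercase_acgt"
            counts[key] += 1
        else:
            counts["other"] += 1
    return counts


# Last sequence line of the sample posted above (8 n's, then 52 bases).
sample = "n" * 8 + "TAATGTTACAAGTAAYAAGAAGGATYCTYTCCTTRACAAATRACGAGATGGC"
counts = classify_bases(sample)
for category, n in sorted(counts.items()):
    print(category, n)
```

Whether to keep, mask, or uppercase these positions depends on the downstream tool; some tools reject IUPAC codes, so it is worth checking before feeding the consensus onward.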
Tags
'n's, fastq-file, non-atgc, small-case
