**TiborNagy** · 10-25-2012, 01:24 AM

awk '{if(NR%4==2 || NR%4==0){print substr($0,5)}else{print}}' file.fastq
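
For anyone who prefers Python, the same line-counting idea might look roughly like this. It is only a sketch, and it assumes strict four-line FASTQ records (no wrapped sequence lines). The function name `trim_fastq_lines` and the sample read are made up for illustration. It trims both the sequence line and the quality line so the two stay the same length.

```python
def trim_fastq_lines(lines, n=4):
    """Yield FASTQ lines with the first n characters removed from the
    sequence and quality lines of each four-line record.

    Assumes every record is exactly four lines (title, sequence, '+', quality).
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        # Within each record, index 1 is the sequence and index 3 its qualities.
        if i % 4 in (1, 3):
            line = line[n:]
        yield line + "\n"


# Hypothetical single-read example
record = ["@read1\n", "ACGTACGTAC\n", "+\n", "IIIIHHHHGG\n"]
trimmed = list(trim_fastq_lines(record))
```

To use it on real files, pass an open input handle as `lines` and hand the result to `writelines()` on the output handle.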

**maasha** · 10-25-2012, 01:43 AM

Using Biopieces:

Code:

read_fastq -i in.fq | extract_seq -b 4 | write_fastq -o out.fq -x

**maubp** · 10-25-2012, 02:30 AM

You should probably learn to program (e.g. Perl, Python, or Ruby). Whatever your local gurus use would be a sensible choice, as you'd have someone nearby to help.

Here's a high-level Biopython solution:

Code:

from Bio import SeqIO

# Slicing a SeqRecord trims the sequence and its quality scores together.
records = (rec[4:] for rec in SeqIO.parse("input.fastq", "fastq"))
count = SeqIO.write(records, "output.fastq", "fastq")
print("Trimmed %i FASTQ records" % count)

That uses lots of objects and would be a bit slow on large files, but it is quite simple and works with many other supported file formats. See http://news.open-bio.org/news/2009/0...on-fast-fastq/, which suggests something like this using plain Python strings (much faster, but FASTQ-specific):

Code:

from Bio.SeqIO.QualityIO import FastqGeneralIterator

# Work on plain strings; trim the sequence and quality lines by the same amount.
with open("input.fastq") as in_handle, open("output.fastq", "w") as out_handle:
    for title, seq, qual in FastqGeneralIterator(in_handle):
        out_handle.write("@%s\n%s\n+\n%s\n" % (title, seq[4:], qual[4:]))

Similarly, if you want to learn Perl, Ruby or Java, there are FASTQ modules in BioPerl, BioRuby and BioJava. See http://dx.doi.org/10.1093/nar/gkp1137

**lisann_5** · 10-25-2012, 04:24 AM

Thanks!

Thank you all for the replies. maasha's solution solved my problem!

manipulate sequences in Fastq files
