**maubp** · 12-31-2012, 02:48 PM

Well, what was line 5 of the file? Perhaps you could show us the output of the command 'head -n 10 example.fastq' or similar? Use the [ code ] and [ /code ] tags to ensure the forum displays the output nicely (available as a button in the advanced view editor).

**tonybert** · 12-31-2012, 04:00 PM

Below is your requested output (maubp):
head -n 10 shuffled.fastq
@HWI-ST700693:263:C0K6DACXX:3:1101:2281:2077
GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAACACG
@HWI-ST700693:263:C0K6DACXX:3:1101:2281:2077
GGTTTCGAAAAGAGGGGGGGGGGGGGAGAGGGGGGGAAACCGGTGGGGCCCCCCCCCAANAAAAAAAAAAAAAAAA
+
@@@DDDDDHFFDHGIID<C?<BHG<GGGDCBBHB;?FGHI9??BBFH@>GCF;A.-==@C@C36@;@D########
+
############################################################################
@HWI-ST700693:263:C0K6DACXX:3:1101:2339:2112
CATGTAGTGAACCATATGCTCCAGTAATACCTTGAACAATGACTCCTTTATTTTCATAATCAGAATCCTCTGGTTT

**maubp** · 12-31-2012, 04:26 PM

That FASTQ file is certainly messed up. My guess is you used a FASTA interleaving script which assumed 2 lines per record... while FASTQ files usually have 4 lines per record.

Which script exactly did you use? There is no shuffleSequences.pl script on Nick's blog post - it just mentions using Velvet’s bundled Perl script of that name.
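To make the 2-versus-4 line point concrete, here is a minimal Python sketch of a sanity check for this kind of damage. It assumes unwrapped FASTQ with exactly 4 lines per record; the function name `check_fastq_records` is made up for this post and is not part of FASTX or Velvet.

```python
import io


def check_fastq_records(handle):
    """Return (line_number, message) problems, assuming 4-line FASTQ records."""
    lines = [line.rstrip("\n") for line in handle]
    problems = []
    if len(lines) % 4 != 0:
        problems.append((len(lines), "line count is not a multiple of 4"))
    # Walk the file in blocks of four lines: header, sequence, '+', quality.
    for start in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[start:start + 4]
        if not header.startswith("@"):
            problems.append((start + 1, "expected '@' header"))
        if not plus.startswith("+"):
            problems.append((start + 3, "expected '+' separator"))
        if len(seq) != len(qual):
            problems.append((start + 2, "sequence and quality lengths differ"))
    return problems


# A toy file with the same layout as the mangled output in this thread:
# header, sequence, header, sequence, '+', quality, '+', quality.
mangled = "@r1/1\nACGT\n@r1/2\nTTTT\n+\nIIII\n+\n####\n"
for line_number, message in check_fastq_records(io.StringIO(mangled)):
    print(f"line {line_number}: {message}")
```

On a file shuffled two lines at a time, the third line of the first block is a second `@` header rather than `+`, and the next block starts with `+` instead of a header.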

**tonybert** · 12-31-2012, 04:32 PM

Thanks for the prompt reply! Below is the script I used:

$ cat shuffleSequences.pl
#!/usr/bin/perl

$filenameA = $ARGV[0];
$filenameB = $ARGV[1];
$filenameOut = $ARGV[2];

open $FILEA, "< $filenameA";
open $FILEB, "< $filenameB";

open $OUTFILE, "> $filenameOut";

while(<$FILEA>) {
print $OUTFILE $_;
$_ = <$FILEA>;
print $OUTFILE $_;

$_ = <$FILEB>;
print $OUTFILE $_;
$_ = <$FILEB>;
print $OUTFILE $_;
}

**tonybert** · 12-31-2012, 04:33 PM

Also, this script was not actually on Nick Loman's blog, although it was mentioned in the text. I copied it from the following website:

http://code.google.com/p/velvet-research/source/browse/trunk/shuffleSequences.pl

**maubp** · 12-31-2012, 04:42 PM

I really can't recommend running random Perl scripts found online like that - it doesn't even have a comment at the start telling you what it should be doing. However, from my limited Perl knowledge, I think it is doing a very simple interleaving process assuming 2 lines per record, which would be OK for short read FASTA files with no line wrapping, but it does absolutely no error checking - thus it mangled your data without warning.
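As a rough illustration of that reading of the script, the following Python sketch mimics the same two-lines-at-a-time loop (this is my paraphrase of the logic, not a translation anyone has verified line by line; `naive_interleave` and the toy reads are invented for the example). Feeding it one 4-line FASTQ record per file reproduces the layout seen in the mangled output above.

```python
import io


def naive_interleave(handle_a, handle_b, out):
    """Copy 2 lines from A, then 2 lines from B, until A runs out (no checks)."""
    while True:
        first = handle_a.readline()
        if not first:
            break  # end of file A; file B is never checked
        out.write(first)
        out.write(handle_a.readline())
        out.write(handle_b.readline())
        out.write(handle_b.readline())


# One toy FASTQ record per file (4 lines each), both with the same read name.
forward = io.StringIO("@read1\nACGT\n+\nIIII\n")
reverse = io.StringIO("@read1\nTTTT\n+\n####\n")
mangled = io.StringIO()
naive_interleave(forward, reverse, mangled)
print(mangled.getvalue(), end="")
```

The output is header, sequence, header, sequence, `+`, quality, `+`, quality, which is not valid FASTQ even though no error was raised.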

If you look at the actual Velvet repository, it has some more clearly labelled Perl scripts, with a version for FASTA and another for FASTQ:

https://github.com/dzerbino/velvet/tree/master/contrib/shuffleSequences_fasta

(They still need a bit of documentation and, in my personal view, error handling.)
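For what the error handling might look like, here is a hedged Python sketch of a FASTQ interleaver that refuses to continue on malformed input. It is not the Velvet contrib script; `read_fastq` and `interleave_fastq` are hypothetical names, and it assumes unwrapped 4-line records. It does not compare read names between the two files.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples, assuming unwrapped 4-line records."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:
            return  # no header line left: treat as end of file
        if not record[0].startswith("@") or not record[2].startswith("+"):
            raise ValueError(f"malformed FASTQ record starting {record[0]!r}")
        yield tuple(record)


def interleave_fastq(handle_a, handle_b, out):
    """Write records alternately from A and B; return the number of pairs."""
    pairs = 0
    for rec_a, rec_b in zip_longest(read_fastq(handle_a), read_fastq(handle_b)):
        if rec_a is None or rec_b is None:
            raise ValueError("input files have different numbers of records")
        for rec in (rec_a, rec_b):
            out.write("\n".join(rec) + "\n")
        pairs += 1
    return pairs


forward = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nGGGG\n+\nHHHH\n")
reverse = io.StringIO("@r1/2\nTTTT\n+\n####\n@r2/2\nCCCC\n+\nJJJJ\n")
shuffled = io.StringIO()
print(interleave_fastq(forward, reverse, shuffled), "pairs written")
```

Given a FASTA file by mistake, `read_fastq` raises a `ValueError` on the first record instead of silently writing a mangled file.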

fastx_quality_stats error with paired end sequences
