SEQanswers

Old 06-19-2012, 06:16 AM   #1
turnersd
Senior Member
 
Location: Charlottesville, VA

Join Date: May 2011
Posts: 112
FASTQC problem on read with no sequence data

I've had a problem with FASTQC. I get the following error:

Code:
Failed to process file 0618107SM.fastq
uk.ac.bbsrc.babraham.FastQC.Sequence.SequenceFormatException: Midline 'CAAACATACAGCTTAAAAC
AACAGACATTTATTATCTTATGGT' didn't start with '+'
When I look at the context around CAAACATACA..., I see that there is a sequence identifier but no sequence information following it.

Code:
@HWUSI-EAS1758R:33:64PA7AAXX:4:1:6103:1039 1:N:0:
CTCGATCCACAAACCGCCCTTGGGGTAAACATTCGG
+
IIIIIIIIIIIIIIIIHIIIIIIIIEHIGHHHHIII
@:701;5677@<5@######################
@HWUSI-EAS1758R:33:64PA7AAXX:4:1:6397:1046 1:N:0:
CCTTAGGTTATTTCATGCCTAGAAATGTATCCTACA
+
HHGHHHGHHDHHHHHH@HHHHHHHHHHGHHHHGHHG
@HWUSI-EAS1758R:33:64PA7AAXX:4:1:6580:1044 1:N:0:
GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAA
+
GGGGEGDDGBHGHHHEGDGEGGD@GBB???GBGDBF
It looks like the identifier that starts with "@:701;5677" causes a failure at the next read. How would I get rid of these "empty" reads?
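To see exactly which lines are out of place, a minimal Python sketch along these lines may help. It walks the file, accepts any four consecutive lines that look like a complete FASTQ record, and sets aside everything else. The names `salvage_fastq` and `SEQ_RE` are hypothetical, not part of FastQC, and the sketch assumes unwrapped four-line records with a plain DNA alphabet. Note that '@' is also a legal quality character, so a stray line like the one above may be an orphaned quality string rather than an identifier.

```python
import re

# Assumption: sequences use only the plain DNA alphabet plus N.
SEQ_RE = re.compile(r"^[ACGTNacgtn]+$")


def salvage_fastq(lines):
    """Split lines into (well-formed 4-line records, orphaned lines).

    Hypothetical helper. Orphans are returned as (1-based line number,
    text) pairs so they can be inspected by hand.
    """
    lines = [ln.rstrip("\r\n") for ln in lines]
    records, orphans = [], []
    i = 0
    while i < len(lines):
        chunk = lines[i:i + 4]
        looks_complete = (
            len(chunk) == 4
            and chunk[0].startswith("@")           # identifier line
            and SEQ_RE.match(chunk[1]) is not None  # sequence line
            and chunk[2].startswith("+")           # separator line
            and len(chunk[3]) == len(chunk[1])     # one quality per base
        )
        if looks_complete:
            records.append(tuple(chunk))
            i += 4
        else:
            # Not the start of a complete record: set it aside and
            # try again from the next line.
            orphans.append((i + 1, lines[i]))
            i += 1
    return records, orphans
```

Passing an open file handle to `salvage_fastq` returns the intact records and the leftover lines. This only hides the damage: reads that were lost upstream are not recovered, so the orphan list is best treated as evidence of how the file was broken.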
Old 06-19-2012, 06:32 AM   #2
dariober
Senior Member
 
Location: Cambridge, UK

Join Date: May 2010
Posts: 311

Quote:
Originally Posted by turnersd View Post
How would I get rid of these "empty" reads?
I think the problem is how you got these empty reads in the first place! It looks like something went wrong with some processing upstream. Are you looking at a fastq file straight from the sequencing facility?

Dario
Old 06-19-2012, 10:00 AM   #3
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Quote:
Originally Posted by turnersd View Post
I've had a problem with FASTQC.
No, you've got a problem in your FASTQ file.

As Dario says, you'll need to track back and work out what went wrong in the creation of this file. It could be as simple as data corruption on disk or when copying the file.
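One way to rule out corruption during copying is to compare checksums of the copy and the original. The following is a minimal sketch using Python's standard `hashlib`; the function names `file_sha256` and `same_file` are hypothetical, and it assumes both files are still accessible.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """True if both files have identical content (by SHA-256)."""
    return file_sha256(path_a) == file_sha256(path_b)
```

If the sequencing facility published a checksum alongside the data, comparing against that value works the same way, provided the same hash algorithm is used.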
Old 06-20-2012, 03:44 PM   #4
turnersd
Senior Member
 
Location: Charlottesville, VA

Join Date: May 2011
Posts: 112

Thanks, yes, it's definitely a problem with the FASTQ file. It looks like whoever filtered this data before I got my hands on it was using grep to look for flags in the identifier and pull out 4 lines of context around each match. But obviously something went very wrong. Now I just need to hunt down the raw data. Thanks.
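Line-based tools like grep know nothing about record boundaries, and quality lines may themselves begin with '@', so filtering by record is generally safer. Below is a minimal Python sketch that reads four lines at a time and keeps records based on the identifier. It assumes the flag of interest is the filter field (Y or N) in Casava 1.8-style identifiers such as `1:N:0:`, which may not be what the original filtering targeted. The names `read_fastq`, `passed_filter`, and `filter_fastq` are hypothetical.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (identifier, sequence, separator, quality), four lines at a time."""
    handle = iter(handle)
    while True:
        chunk = [line.rstrip("\r\n") for line in islice(handle, 4)]
        if not chunk:
            return
        if (
            len(chunk) < 4
            or not chunk[0].startswith("@")
            or not chunk[2].startswith("+")
        ):
            raise ValueError(f"malformed record starting at: {chunk[0]!r}")
        yield tuple(chunk)


def passed_filter(header):
    """True if the identifier's second part marks the read as not filtered.

    Assumption: Casava 1.8-style identifiers, e.g. '@... 1:N:0:'.
    """
    parts = header.split(" ", 1)
    if len(parts) < 2:
        return False
    fields = parts[1].split(":")
    return len(fields) >= 2 and fields[1] == "N"


def filter_fastq(in_handle, out_handle, keep=passed_filter):
    """Write records whose identifier satisfies keep(); return the count kept."""
    kept = 0
    for record in read_fastq(in_handle):
        if keep(record[0]):
            out_handle.write("\n".join(record) + "\n")
            kept += 1
    return kept
```

Because whole records are read and written together, a match can never drag in lines from a neighbouring read, and a malformed input stops with an error instead of silently producing a broken file.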

Tags
fastq, fastqc
