SEQanswers

04-18-2017, 12:49 PM   #1
jetjr
Junior Member

Location: United States

Join Date: Apr 2017
Posts: 2
Two sizes in sequence length

I'm working with Illumina MiSeq data. After BAM-to-FASTQ extraction I run FastQC to generate reports. In many of my samples I see two different sequence lengths, as shown in the picture below:



I also want to add that when I see this in the sequence length distribution, the sequence composition is skewed as well:



Any idea what is causing this?
Single End Data
Adapters are removed off the sequencer

Thanks!

04-19-2017, 05:46 AM   #2
jhalpin
Member

Location: Atlanta, GA

Join Date: Jan 2015
Posts: 15

... you have single end Illumina data? Is that a thing?

We use Ion Torrent technology, but when I see that, it's usually a size-selection problem: something went wrong with my bead ratios.

04-19-2017, 11:19 AM   #3
Brian Bushnell
Super Moderator

Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,668

This is odd. Illumina does not support read lengths greater than 300bp as far as I know (and even 300bp lengths are not well-supported), so I'm not sure what you are doing.

Can you explain the library-prep and sequencing methodology in more detail?

04-19-2017, 11:37 AM   #4
GenoMax
Senior Member

Location: East Coast USA

Join Date: Feb 2008
Posts: 6,435

It is possible to run strange lengths, e.g. a 2 x 300 kit can be run as 600 single-end. Not sure why anyone would want to, but one can.

@jetjr: Can you attach a complete fastqc analysis file?

Last edited by GenoMax; 04-19-2017 at 11:41 AM.

04-19-2017, 08:00 PM   #5
nucacidhunter
Senior Member

Location: Iran

Join Date: Jan 2013
Posts: 1,031

In addition to the questions in the posts above, I wonder if these plots are from the same sample or file. The sequence length distribution plot shows a flat 0 count on the Y axis for sequence lengths over ~275, while the per base sequence plot indicates that read lengths are up to 370 bases.
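
One way to check whether both plots can come from the same file is to tally read lengths straight from the FASTQ, independently of FastQC. The following is only a minimal sketch: the function names are made up for illustration, and it assumes a standard 4-line-per-record FASTQ (no line-wrapped sequences), plain or gzip-compressed.

```python
from collections import Counter
import gzip


def read_length_counts(fastq_path):
    """Tally read lengths in a 4-line-per-record FASTQ file (plain or .gz).

    Assumption: sequences are not wrapped across several lines.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every 4-line record.
            if line_number % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    return counts


def summarize(counts, top=2):
    """Return the longest read length and the `top` most common lengths."""
    longest = max(counts) if counts else 0
    return longest, counts.most_common(top)
```

Calling `summarize(read_length_counts("sample.fastq"))` (the file name is a placeholder) gives the longest read and the two most frequent lengths. If the longest read is near 370 while the length plot shows nothing above ~275, the plots are unlikely to describe the same file.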

04-20-2017, 04:51 AM   #6
pmiguel
Senior Member

Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,218

Probably merged reads from a 500-cycle run. (The plots are from fastqc, not SAV.) The shorter peak is from reads that didn't merge.

BTW, can I add that fastqc's variable width bins make me feel uneasy?
--
Phillip

04-20-2017, 05:59 AM   #7
GenoMax
Senior Member

Location: East Coast USA

Join Date: Feb 2008
Posts: 6,435

Quote:
Originally Posted by pmiguel
BTW, can I add that fastqc's variable width bins make me feel uneasy?
--
Phillip

Easily fixed by adding --nogroup to the command.
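
For anyone scripting their QC, here is a small sketch of how that flag could be passed from Python. The helper name `build_fastqc_command` is hypothetical. `--nogroup` and `--outdir` are FastQC command-line options, and the sketch assumes the `fastqc` executable is on the PATH.

```python
def build_fastqc_command(fastq_paths, outdir=None, nogroup=True):
    """Build a FastQC argument list; --nogroup turns off base grouping in the plots.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    cmd = ["fastqc"]
    if nogroup:
        cmd.append("--nogroup")
    if outdir is not None:
        cmd += ["--outdir", str(outdir)]
    cmd += [str(path) for path in fastq_paths]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Ungrouped plots get wide for long reads, so this is probably best kept to read lengths like the few hundred bases discussed here.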

04-21-2017, 04:15 PM   #8
jetjr
Junior Member

Location: United States

Join Date: Apr 2017
Posts: 2

Thanks all for the replies! I stand corrected: it is Ion Torrent data >.< We are looking into some of the things mentioned in this thread. I will add full QC reports soon...
All times are GMT -8.

