SEQanswers

02-24-2015, 11:46 AM   #1
mlodato
Junior Member
 
Location: Boston, MA

Join Date: Aug 2013
Posts: 8
Convert paired-end, longer reads to single-end, shorter reads

Hi,
Possibly a very dumb question: I have a dataset of paired-end 150 bp reads, but a computational tool that is designed to analyze single-end 50 bp reads. Is it practically feasible to consider only one end of each pair, clip that read to 50 bp, and use it? And is it logical to do so? In other words, am I missing some reason why this is a bad idea?

Thanks for any help!
02-24-2015, 11:50 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

This depends on what you want to use the data for. The approach you suggest is certainly feasible to force an old tool to handle new data, but at the expense of discarding a huge amount of useful data and basically undoing 5 years of advancement in sequencing technology. The best approach would usually be to find or make a tool that is flexible enough to accept and utilize modern data, if you want the most accurate possible answer. What does the tool do?
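To put a rough number on "a huge amount of useful data", here is a minimal sketch of the arithmetic. It assumes every fragment yields two full-length 150 bp reads and that exactly 50 bp of one read is kept; the function name is hypothetical and not part of any tool mentioned in this thread.

```python
def fraction_of_bases_kept(read_length: int, kept_length: int, paired: bool = True) -> float:
    """Fraction of sequenced bases that remain when only one read per
    fragment is kept and that read is clipped to kept_length bases."""
    if kept_length > read_length:
        raise ValueError("kept_length cannot exceed read_length")
    # A paired-end fragment contributes two reads; a single-end fragment one.
    reads_per_fragment = 2 if paired else 1
    return kept_length / (read_length * reads_per_fragment)


kept = fraction_of_bases_kept(read_length=150, kept_length=50)
print(f"kept {kept:.1%} of bases, discarded {1 - kept:.1%}")
# prints: kept 16.7% of bases, discarded 83.3%
```

Under those assumptions about five sixths of the sequenced bases are thrown away, on top of losing the pairing information.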
02-24-2015, 12:10 PM   #3
mlodato
Junior Member
 
Location: Boston, MA

Join Date: Aug 2013
Posts: 8

Thanks, I've thought about the issues you've raised and agree with you. I want to use this tool (http://www.nature.com/nprot/journal/....2012.039.html), which measures copy number. I will be using this analysis as a QC on low-coverage sequencing to pick out bad samples before high-coverage sequencing, so throwing out lots of data would not be the end of the world.
01-29-2016, 08:45 AM   #4
proux
Junior Member
 
Location: London

Join Date: Nov 2009
Posts: 6

This is my first post, and I have the same question, although the tool I have in mind is the newer QDNAseq. It also seems to want single-end 50 bp reads. https://www.bioconductor.org/package...l/QDNAseq.html
01-29-2016, 09:03 AM   #5
proux
Junior Member
 
Location: London

Join Date: Nov 2009
Posts: 6

I guess this should at least allow splitting the data into single-end reads:
http://bedtools.readthedocs.org/en/l...amtofastq.html
01-29-2016, 12:42 PM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,049

Quote:
Originally Posted by proux
This is my first post, and I have the same question, although the tool I have in mind is the newer QDNAseq. It also seems to want single-end 50 bp reads. https://www.bioconductor.org/package...l/QDNAseq.html
Where does it say that it requires 50 bp reads? I looked through the reference manual. Perhaps I missed that part.
01-29-2016, 12:44 PM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,049

Quote:
Originally Posted by proux
I guess this should at least allow splitting the data into single-end reads:
http://bedtools.readthedocs.org/en/l...amtofastq.html
If the data you are looking at are paired-end, then split them and use just one end (if that is all you need). Generally the R1 (first) read would be the one to use.
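Once the R1 reads are in their own FASTQ file, clipping them to 50 bp could look roughly like the sketch below. It uses only the Python standard library and assumes well-formed four-line FASTQ records with no blank lines; the function names and file names are hypothetical, and reads already shorter than the target length are passed through unchanged.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, separator, qualities


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records; stops at end of file or at an empty header line."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def truncate_reads(records: Iterator[FastqRecord], length: int = 50) -> Iterator[FastqRecord]:
    """Keep only the first `length` bases and their matching quality values."""
    for header, seq, plus, qual in records:
        yield header, seq[:length], plus, qual[:length]


def truncate_fastq(in_path: str, out_path: str, length: int = 50) -> int:
    """Write a truncated copy of in_path (plain or .gz) and return the record count."""
    opener = gzip.open if in_path.endswith(".gz") else open
    count = 0
    with opener(in_path, "rt") as src, open(out_path, "w") as dst:
        for record in truncate_reads(read_fastq(src), length):
            dst.write("\n".join(record) + "\n")
            count += 1
    return count


# Small in-memory demonstration with one made-up 80 bp read.
demo = io.StringIO("@read1/1\n" + "ACGT" * 20 + "\n+\n" + "I" * 80 + "\n")
for header, seq, _, qual in truncate_reads(read_fastq(demo), 50):
    print(header, len(seq), len(qual))
```

On real data this would be called as, for example, `truncate_fastq("sample_R1.fastq.gz", "sample_R1.50bp.fastq")`, where both paths are placeholders. Dedicated trimming tools do the same job faster; the sketch is only meant to show how little is involved.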
01-30-2016, 05:47 AM   #8
proux
Junior Member
 
Location: London

Join Date: Nov 2009
Posts: 6

Thanks. 50 bp reads are the only input accepted in the Galaxy version. This is not explicitly stated in the R package, but it is what they used in the paper for their own data, and they truncated the 1000 Genomes data to the first 50 bp.

Tags
fasta file, next gen sequencing data, paired end reads, sequencing