SEQanswers

Old 07-12-2010, 02:22 AM   #1
m_elena_bioinfo
Member
 
Location: Ospedali Riuniti di Bergamo, ITALY

Join Date: Oct 2009
Posts: 99
PE SOLiD reads alignment by bwa

Dear users,
I have PE reads from SOLiD to align to the human genome.
I have these files:

- solid_data_F3.csfasta
- solid_data_F3_QV.qual
- solid_data_F5-P2.csfasta
- solid_data_F5-P2_QV.qual

I want to convert these files to fastq using bwa0.5.7/solid2fastq.pl.
The script runs only for F3; with F5-P2 it fails (it says: Fail to open solid_data_F5-P2_F3.csfasta).

So, if I use:
> solid2fastq.pl solid_data_ solid_data_total
I get only one fastq file for F3 and F5-P2. Does it include all the paired-end reads?

This fastq is in colorspace, but the colors are represented as ACGT.
So, to index the genome and run the bwa alignment, do I have to use the -c option?

Thanks a lot,
ME
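
For readers unsure what colors written as letters look like, here is a minimal Python sketch of the relabelling idea. It assumes the commonly described mapping 0→A, 1→C, 2→G, 3→T and '.'→N. The function name `double_encode` and the `trim_primer` option are hypothetical, and this is not a copy of solid2fastq.pl. The letters are only labels for colors, not real bases.

```python
# Hypothetical helper: relabel SOLiD colors (0-3, '.') as letters so that
# tools expecting ACGT-style strings can read them. The letters are NOT bases.
COLOR_TO_LETTER = str.maketrans("0123.", "ACGTN")


def double_encode(csfasta_read: str, trim_primer: bool = True) -> str:
    """Relabel a csfasta read such as 'T0120.3' as letters.

    trim_primer=True drops the primer base and the first color. This is an
    assumption about what a converter might do; check your own tool.
    trim_primer=False drops only the primer base.
    """
    colors = csfasta_read[2:] if trim_primer else csfasta_read[1:]
    return colors.translate(COLOR_TO_LETTER)


print(double_encode("T0120.3"))
print(double_encode("T0120.3", trim_primer=False))
```

Because the output is still color space, it has to be aligned against a color-space index rather than a base-space one.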
Old 07-12-2010, 03:58 PM   #2
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

It looks like the script doesn't support the paired end protocol. Bug the BWA mailing list (bio-bwa-help@lists.sourceforge.net) or the author (username:lh3).
Old 07-13-2010, 08:26 AM   #3
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

If you want to use the script with the PE data, make this change in the script (around line 98):

#if (/^>(\d+)_(\d+)_(\d+)_[FR]3/) {
if (/^>(\d+)_(\d+)_(\d+)_(?:F3|R3|F5-P2)/) {
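
A note on the tag part of this pattern, sketched in Python (whose rules agree with Perl's for this case): a bracketed form such as `[F3|R3|F5-P2]` is a character class and matches one single character from that set, so it accepts the real tags only because each begins with F or R. A grouped alternation such as `(?:F3|R3|F5-P2)` matches the whole tags. The read names below are made up for illustration.

```python
import re

# Character class: matches ONE character out of F, 3, |, R, 2
# or anything in the range '5' through 'P'.
loose = re.compile(r"^>(\d+)_(\d+)_(\d+)_[F3|R3|F5-P2]")

# Grouped alternation: matches one of the three whole tags.
strict = re.compile(r"^>(\d+)_(\d+)_(\d+)_(?:F3|R3|F5-P2)")

names = (">1_20_300_F3", ">1_20_300_R3", ">1_20_300_F5-P2", ">1_20_300_Fxyz")
for name in names:
    # Print whether each pattern accepts the read name.
    print(name, bool(loose.match(name)), bool(strict.match(name)))
```

Both forms accept the three real tags; only the character class also accepts the bogus `Fxyz` name.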

And also rename the F5-P2 to R3:

solid_data_F5-P2.csfasta -> solid_data_R3.csfasta
solid_data_F5-P2_QV.qual -> solid_data_R3_QV.qual
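
The two renames can also be scripted. This is a minimal Python sketch, assuming the files sit in one directory and share the `solid_data_` prefix from this thread. The function name `rename_f5_to_r3` is hypothetical, and the demo runs in a throwaway temporary directory so it touches no real data.

```python
import tempfile
from pathlib import Path


def rename_f5_to_r3(directory: str, prefix: str = "solid_data_") -> list:
    """Rename <prefix>F5-P2* files to <prefix>R3*; return (old, new) name pairs."""
    renamed = []
    for old in sorted(Path(directory).glob(f"{prefix}F5-P2*")):
        new = old.with_name(old.name.replace("F5-P2", "R3", 1))
        if new.exists():
            # Never silently clobber an existing R3 file.
            raise FileExistsError(f"refusing to overwrite {new}")
        old.rename(new)
        renamed.append((old.name, new.name))
    return renamed


# Demo on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("solid_data_F5-P2.csfasta", "solid_data_F5-P2_QV.qual"):
        (Path(demo_dir) / name).touch()
    for old_name, new_name in rename_f5_to_r3(demo_dir):
        print(old_name, "->", new_name)
```

Only the file names change; the read names inside the files keep their F5-P2 tag, which is why the pattern change in the script is still needed.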

Also, bfast has a solid2fastq (in the git repo) that now supports bwa output and
handles PE data. You can use that too.
__________________
-drd
Old 07-14-2010, 05:30 AM   #4
m_elena_bioinfo
Member
 
Location: Ospedali Riuniti di Bergamo, ITALY

Join Date: Oct 2009
Posts: 99

Thanks very much for your help, drio!
I'll try it and let you know if the program runs!
Old 07-14-2010, 06:27 AM   #5
SoftGenetics
Registered Vendor
 
Location: pa

Join Date: Apr 2009
Posts: 32

You will lose a lot of information by converting the color space files to fasta. You would be better off aligning the SOLiD reads to a color space reference.

John
Old 07-14-2010, 06:56 AM   #6
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

There is information lost because of the dinucleotide 'color' encoding but the alignments are performed in CS (http://seqanswers.com/forums/showthread.php?t=5245). BWA will do a good job aligning those reads.
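
To make the dinucleotide 'color' encoding concrete, here is a minimal Python sketch of how a base sequence maps to colors. It assumes the standard two-bit scheme (A=0, C=1, G=2, T=3, with each color the XOR of two adjacent bases). The function names are illustrative only. Decoding needs a known first base, and a single wrong color changes every base after it, which is one reason to align in color space rather than decode first.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(bases: str) -> str:
    """Return one color per adjacent base pair (XOR of the two-bit codes)."""
    codes = [BASE_TO_BITS[b] for b in bases]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


def decode_colors(first_base: str, colors: str) -> str:
    """Rebuild the bases from a known first base plus the color string."""
    current = BASE_TO_BITS[first_base]
    out = [first_base]
    for color in colors:
        current ^= int(color)  # XOR with the color gives the next base code
        out.append(BITS_TO_BASE[current])
    return "".join(out)


colors = encode_colors("ATGGC")
print(colors)
print(decode_colors("A", colors))
# One miscalled color (second position) corrupts all downstream bases.
print(decode_colors("A", "3003"))
```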
__________________
-drd
Old 07-14-2010, 07:34 AM   #7
SoftGenetics
Registered Vendor
 
Location: pa

Join Date: Apr 2009
Posts: 32

We use a modified BWA in our NextGENe software, which adds a couple of additional steps to the BWA alignment and creates a much more robust alignment. Additionally, we use a fully annotated color space reference, so no information is lost. If you would like to try it, we can supply a trial.
John
Old 07-14-2010, 07:39 AM   #8
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

Cool, any plans to integrate that into the main bwa repo?
__________________
-drd
Old 02-21-2011, 08:09 AM   #9
Agent47
Junior Member
 
Location: Philadelphia

Join Date: Jan 2009
Posts: 3

Thanks, Elena and drio!

This was useful. I am trying to run the SOLiD PE barcoded analysis.
I have just submitted it to run.
I hope this works.