SEQanswers

Old 04-06-2016, 10:30 PM   #1
znasim09
Member
 
Location: Seoul, Korea

Join Date: Sep 2015
Posts: 23
Any video tutorial for alignment of csfasta and qual files, or for their conversion to fastq?

Hello everyone,

For the last few days I have been searching different forums to understand how to convert csfasta and qual files to fastq and/or how to align them. But because I have limited (or, I should say, no) knowledge of scripting, I have been unable to do so.

That's why I would like to watch a video tutorial that shows these steps. Is there one? I want to convert/align the Arabidopsis 1001 Genomes data.
Old 04-07-2016, 12:20 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

Make your life easier and just use an aligner (e.g., bowtie) that can directly handle these types of files.
Old 04-07-2016, 02:40 AM   #3
znasim09
Member
 
Location: Seoul, Korea

Join Date: Sep 2015
Posts: 23

@dpryan
Thanks for the reply.
I indexed the Arabidopsis genome using the script "make_a_thaliana_tair.sh".
But there are plenty of options that I couldn't understand (as I am a beginner). Can you please adapt this command

bowtie [options]* <ebwt> {-1 <m1> -2 <m2> | --12 <r> | <s>} [<hit>]

for some example data, e.g. Sample_F3.csfasta and Sample_F3_QV.qual?
Old 04-07-2016, 03:01 AM   #4
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

Do you have single-end or paired-end reads?

Have you seen the section of the Bowtie manual about aligning colorspace reads?

http://bowtie-bio.sourceforge.net/ma...pace-alignment

If you have paired-end reads, _F3.csfasta (forward reads) corresponds to -1, the first read of each pair, and _R3.csfasta (reverse reads) corresponds to -2, the second read of each pair.
Old 04-07-2016, 03:15 AM   #5
znasim09
Member
 
Location: Seoul, Korea

Join Date: Sep 2015
Posts: 23

@mastal
I read the colorspace alignment section quite a few times but couldn't understand it properly. The data I want to analyze is single-end.
I downloaded the SRA file and converted it to csfasta and qual using the SRA Toolkit (abi-dump). That's why I asked whether anyone can refer me to a video tutorial or adapt the command.
Old 04-07-2016, 04:17 AM   #6
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

See the list of options in the Bowtie manual and the 'Getting Started' section of the Bowtie webpages. Because you have colorspace reads, you need to build a colorspace Bowtie index and use the options -C and -Q, which indicate that you have colorspace reads and that your base quality values are in a separate file.

There are 2 steps to running bowtie:
1. Make an index of your genome.
2. Run the alignment.

To run the bowtie alignment, the basics are:
1. Specify the path to the files with the bowtie index.
2. Specify the path to the files with the reads (and the base qualities) you want to align.

Last edited by mastal; 04-07-2016 at 04:24 AM.
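
The two steps described in this post can be sketched as a short shell script. This is only a sketch under assumptions: the index basename a_thaliana_c and the output name are made up for illustration, the other file names are borrowed from elsewhere in this thread, and the flags (-C, -f, -Q, -S) are used as I understand them from the Bowtie 1 manual, so check them against the version you have installed. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```sh
# Hypothetical file names; adjust to your own data.
REF=Ath_reference.fa        # reference genome in FASTA format
INDEX=a_thaliana_c          # basename for the colorspace index (made-up name)
READS=test_F3.csfasta       # colorspace reads
QUALS=test_F3_QV.qual       # matching quality values
SAM=test_F3.sam             # output alignments

# Print each command; run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Step 1: build a colorspace index (-C) from the reference.
run bowtie-build -C "$REF" "$INDEX"

# Step 2: align the reads against that index.
#   -C       reads and index are in colorspace
#   -f       reads are in (CS)FASTA format
#   -Q FILE  quality values for the reads (FILE must follow -Q directly)
#   -S       write SAM output
# The index basename comes before the reads file; the last argument is the output file.
run bowtie -C -f -S -Q "$QUALS" "$INDEX" "$READS" "$SAM"
```

The index basename given to bowtie in step 2 has to be the same one given to bowtie-build in step 1, which is why both commands share the INDEX variable here.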
Old 04-07-2016, 11:54 PM   #7
znasim09
Member
 
Location: Seoul, Korea

Join Date: Sep 2015
Posts: 23

Hey mastal,

I color indexed the reference genome with the command

perl bowtie-build --wrapper basic-0 -C /Users/znasim09/Documents/Perl_packages/bowtie-1.1.2/genomes/Ath_reference.fa output.ebwt

It generated 6 files (as I was expecting).
Then I aligned the csfasta file using this command

bowtie -f -C -S a_thaliana test_F3.csfasta > test_F3.sam

It runs, and gives this info

# reads processed: 6221913
# reads with at least one reported alignment: 823401 (13.23%)
# reads that failed to align: 5398512 (86.77%)
Reported 823401 alignments to 1 output stream(s)

What could be the reasons for such a high percentage of unaligned reads?
Also, out.sam has a size of zero kB, and I don't know why.
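
The percentages in the summary follow directly from the read counts, so they can be recomputed when comparing runs. A minimal sketch, assuming the summary text has been captured in a shell variable (here it is simply pasted from the output above; the variable names are illustrative):

```sh
# Summary lines as printed by bowtie in this post.
summary='# reads processed: 6221913
# reads with at least one reported alignment: 823401 (13.23%)
# reads that failed to align: 5398512 (86.77%)'

# Pull out the two counts and compute the aligned fraction as a percentage.
rate=$(printf '%s\n' "$summary" | awk -F': ' '
    /reads processed/                 { total   = $2 + 0 }
    /at least one reported alignment/ { aligned = $2 + 0 }
    END { if (total > 0) printf "%.2f", 100 * aligned / total }')

echo "aligned: $rate%"
```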
Old 04-08-2016, 02:25 AM   #8
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

When you do the alignment you also need to use the base qualities file, so add -Q Sample_F3_QV.qual to your command.
Old 04-08-2016, 02:39 AM   #9
znasim09
Member
 
Location: Seoul, Korea

Join Date: Sep 2015
Posts: 23

I tried to add the qual file by using this command:
perl bowtie -f -C -Q test_F3.csfasta a_thaliana test_F3_QV.qual > out_F3.sam

But it gave the following info:

Warning: could not parse quality line:
111>SRR309164.1 1:1086:19215
T013213212300222033031011....12...10...............................................
>libc++abi.dylib: terminating with uncaught exception of type int
Abort trap: 6
Old 04-08-2016, 10:15 AM   #10
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

Put the -Q immediately before the name of the F3_QV.qual file.
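
The "could not parse quality line" warning in the previous post is what one would expect when the file that follows -Q is the csfasta file rather than the qual file. A small check like the one below can catch that before running the aligner. It is a sketch, not part of bowtie: the function name looks_like_qual is made up, and it assumes the qual file holds space-separated integers under ">" header lines.

```sh
# Succeed if the first data line of FILE contains only integers
# (quality values); colour strings such as T0132... will not match.
looks_like_qual() {
    grep -v '^[#>]' "$1" | head -n 1 |
        grep -Eq '^ *-?[0-9]+( +-?[0-9]+)* *$'
}

QUALS=test_F3_QV.qual   # the file you intend to pass right after -Q
if [ -f "$QUALS" ]; then
    if looks_like_qual "$QUALS"; then
        echo "$QUALS looks like a quality file"
    else
        echo "$QUALS does not look like a quality file" >&2
    fi
fi
```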
Old 04-08-2016, 09:56 PM   #11
znasim09
Member
 
Location: Seoul, Korea

Join Date: Sep 2015
Posts: 23

Hey mastal,
I used this command:
bowtie -f -C a_thaliana test_F3.csfasta -Q test_F3_QV.qual > out_F3.sam

But I am still getting much the same result (only about 13% mapping):

# reads processed: 6221913
# reads with at least one reported alignment: 855943 (13.76%)
# reads that failed to align: 5365970 (86.24%)
Reported 855943 alignments to 1 output stream(s)
Old 04-09-2016, 01:58 AM   #12
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

Maybe there is something wrong with the way abi-dump converted the data.

You could try aligning with BFAST, which was specifically written to deal with SOLiD data.
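
One quick way to look for a conversion problem is to confirm that the csfasta and qual files describe the same number of reads. The sketch below only counts ">" header lines in each file; the function name check_pair is made up for illustration, and a matching count does not prove the conversion was correct.

```sh
# Compare the number of records (">" header lines) in a csfasta/qual pair.
# Returns success only when the two counts agree.
check_pair() {
    n_reads=$(grep -c '^>' "$1")
    n_quals=$(grep -c '^>' "$2")
    echo "reads: $n_reads, quality records: $n_quals"
    [ "$n_reads" -eq "$n_quals" ]
}

# File names as used earlier in this thread.
if [ -f test_F3.csfasta ] && [ -f test_F3_QV.qual ]; then
    check_pair test_F3.csfasta test_F3_QV.qual ||
        echo "record counts differ; the conversion may be incomplete" >&2
fi
```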
Old 04-09-2016, 05:36 AM   #13
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324

SOLiD reads had a high raw error rate. Bowtie with default settings is not ideal for 75 bp reads, so 13% seems about right. BFAST will do a better job, but start with trimmed reads and change the parameters v and n.
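
The suggestion to start with trimmed reads can be tried without extra software. Below is a rough awk sketch that keeps the first N colours of each read and the first N quality values of each record. It assumes one sequence line per read, with the leading primer base followed by the colours, and space-separated quality values. The function names and the length 50 are illustrative choices, not values recommended in this thread.

```sh
# Keep the primer base plus the first N colours of every read.
trim_csfasta() {   # usage: trim_csfasta N < in.csfasta > out.csfasta
    awk -v n="$1" '/^[#>]/ { print; next }
                   { print substr($0, 1, n + 1) }'
}

# Keep the first N quality values of every record.
trim_qual() {      # usage: trim_qual N < in.qual > out.qual
    awk -v n="$1" '/^[#>]/ { print; next }
        { out = $1
          for (i = 2; i <= n && i <= NF; i++) out = out " " $i
          print out }'
}

# Example: trim both files to 50 colours if they are present.
if [ -f test_F3.csfasta ] && [ -f test_F3_QV.qual ]; then
    trim_csfasta 50 < test_F3.csfasta > test_F3.trim50.csfasta
    trim_qual 50 < test_F3_QV.qual > test_F3_QV.trim50.qual
fi
```

Both files have to be trimmed to the same length, otherwise the number of quality values no longer matches the number of colours in each read.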
Old 04-09-2016, 12:21 PM   #14
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 823

Never convert colour-space sequences to standard base-space sequences prior to alignment. While you will definitely prefer the base-space representation of the alignment, converting first is going to introduce a lot of errors that aren't present in a colour-space alignment.

To see why this is a bad idea, see my previous verbose rants.
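
A small illustration of the problem: in the SOLiD two-base code each colour describes the transition from one base to the next, so a naive translation has to walk along the read from the primer base. The sketch below does exactly that naive translation (the function name decode_colours is made up, and this is not how an aligner decodes reads). A single wrong colour changes every base after it.

```sh
# Naively translate a colour-space read (primer base + colours) to bases.
decode_colours() {   # usage: decode_colours T0132
    printf '%s\n' "$1" | awk '
        BEGIN {
            # t[previous base, colour] gives the next base.
            t["A0"]="A"; t["A1"]="C"; t["A2"]="G"; t["A3"]="T"
            t["C0"]="C"; t["C1"]="A"; t["C2"]="T"; t["C3"]="G"
            t["G0"]="G"; t["G1"]="T"; t["G2"]="A"; t["G3"]="C"
            t["T0"]="T"; t["T1"]="G"; t["T2"]="C"; t["T3"]="A"
        }
        {
            base = substr($0, 1, 1)   # primer base, not part of the read
            out = ""
            for (i = 2; i <= length($0); i++) {
                key = base substr($0, i, 1)
                # A missing call (".") makes this and all later bases unknown.
                base = (key in t) ? t[key] : "N"
                out = out base
            }
            print out
        }'
}

decode_colours T0132   # colours as read
decode_colours T0232   # same read with one colour miscalled
```

The first call prints TGCT and the second prints TCGA: one colour error has altered three of the four bases. Aligned in colour space, the same error is a single mismatch.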
Tags
alignment, csfasta, solid data analysis
