SEQanswers

Old 02-16-2012, 08:37 AM   #1
idonaldson
Member
 
Location: Manchester, UK

Join Date: Oct 2009
Posts: 37
SOLiD paired-end read analysis with BFAST 0.7.0a

I am trying to get going with paired-end mapping using BFAST 0.7.0a, but stalling so far.

I have 50bp F3 reads and 35bp F5-BC reads from a SOLiD4.

This is my current workflow:
- filter the reads to remove poor-quality reads

- combine the F3 and F5-BC read files into an interleaved FASTQ file using solid2fastq.

- remove reads that do not have a corresponding partner (due to quality filtering step)
so only paired 50bp and 35bp reads in FASTQ, e.g.:
Code:
@2_58_1022
T11030012.13.031.13..220020201100120232000.0010..00
+
8B>@7<A9!A=!A?9!BA!!;B?B:@[email protected]=?7/B??=/!&?.4!!><
@2_58_1022
G03012022110321000220002310222320030
+
.386;BA749?>50<@[email protected]:=7+6'1)3+&4%/
@2_59_549
T32212131.1203201231302132200022200213130200102.020
+
[email protected]@@[email protected][email protected]@@[email protected]>[email protected]@=<>>6?(//;=*;6?60'1+<1!0'?
@2_59_549
G31302031000200031220202301033020330
+
93<A=B/>7+'525=;[email protected]/=4/?>:@9/=B6B
- run bfast match
Code:
bfast match -f genome.fa -A 1 -n 1 -t -r interleaved.fastq > interleaved.bmf
- run bfast localalign
Code:
bfast localalign -f genome.fa -m interleaved.bmf -A 1 -n 1 -t > interleaved.baf
- run bfast postprocess
Code:
bfast postprocess -f genome.fa -i interleaved.baf -A 1 -O 1 -n 1 -t -a 2 -Y 0 > interleaved.sam
- convert SAM to BAM and run flagstat
Code:
31104184 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
8402049 + 0 mapped (27.01%:-nan%)
31104184 + 0 paired in sequencing
15552092 + 0 read1
15552092 + 0 read2
0 + 0 properly paired (0.00%:-nan%)
0 + 0 with itself and mate mapped
8402049 + 0 singletons (27.01%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
No valid pairs!
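
The orphan-removal step in the workflow above (dropping reads whose partner was lost to quality filtering) could be sketched roughly as follows. This is only an illustration, not the script actually used in this thread: the function names `read_fastq` and `keep_adjacent_pairs` are made up, the tiny records in the demo are invented, and it assumes that mates share the same read name and sit next to each other in the interleaved file.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not qual:  # end of file (or a truncated last record)
            return
        yield header[1:], seq, qual  # drop the leading '@'


def keep_adjacent_pairs(records):
    """Keep only records whose immediate neighbour has the same name.

    Assumption: mates are adjacent in the interleaved file.
    """
    prev = None
    for rec in records:
        if prev is not None and prev[0] == rec[0]:
            yield prev
            yield rec
            prev = None
        else:
            prev = rec  # the earlier record had no partner and is dropped


# Invented toy input: read "b" has lost its mate.
demo = io.StringIO(
    "@a\nT01\n+\nAB\n@a\nG10\n+\nCD\n"
    "@b\nT22\n+\nEF\n"
    "@c\nT33\n+\nGH\n@c\nG03\n+\nIJ\n"
)
paired = list(keep_adjacent_pairs(read_fastq(demo)))
print([name for name, _, _ in paired])
```

Because it only compares neighbouring records, this streams through the file without holding all reads in memory, which matters at tens of millions of reads.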

If I map the F3 and F5_BC reads individually, I get about 55% mapping for F3 but 0% for F5_BC.
The flags in the SAM file also indicate that the reads are indeed in pairs, but the downstream read does not map (flag 89 for F3 and 165 for F5_BC).
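
For anyone wanting to check what those two flag values mean, here is a minimal sketch that splits a SAM FLAG into its bits. The bit meanings are the standard ones from the SAM format; the short labels and the helper name `decode_flag` are just chosen for this example.

```python
# Standard SAM FLAG bits; the label strings are shorthand for this example.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


print(89, decode_flag(89))
# 89 -> ['paired', 'mate_unmapped', 'reverse', 'first_in_pair']
print(165, decode_flag(165))
# 165 -> ['paired', 'unmapped', 'mate_reverse', 'second_in_pair']
```

So 89 is a mapped first read whose mate is unmapped, and 165 is an unmapped second read, which fits the picture of F3 mapping and F5_BC not mapping.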

So the F5_BC is not mapping at all for some reason. Am i doing something silly here?!

Thanks for any guidance!

Ian
Old 02-16-2012, 09:21 AM   #2
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324

Maybe your index is longer than 35 bp? I don't see how else you could get 0% mapping... Is the -n 1 flag specifying which index is used?
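
To illustrate the point about index length: a seed can only be placed in a read if it is no wider than the read. The sketch below counts how many offsets a mask fits at. The masks are hypothetical and are not BFAST's defaults, and the usable read length is passed in explicitly because the thread does not say how many leading symbols the aligner discards.

```python
def seed_placements(mask, read_length):
    """Number of offsets at which a seed mask of this width fits in a read."""
    width = len(mask)
    return max(0, read_length - width + 1)


# Hypothetical masks, for illustration only.
short_mask = "1" * 22
long_mask = "1" * 40

for name, mask in (("short", short_mask), ("long", long_mask)):
    for read_length in (50, 35):
        n = seed_placements(mask, read_length)
        print(f"{name} mask (width {len(mask)}), read {read_length}: {n} placements")
```

With a 40-wide mask a 35-long read gets zero placements, so it could never produce a candidate hit, while a 50-long read still would.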
Old 02-17-2012, 01:03 AM   #3
idonaldson
Member
 
Location: Manchester, UK

Join Date: Oct 2009
Posts: 37

@Chipper - The index was created with 'bfast index' using the default seeds. '-n 1' just specifies the number of threads used by the program.

Thanks for your reply.
Old 03-09-2012, 10:39 AM   #4
Patidar
Member
 
Location: NIH

Join Date: Feb 2012
Posts: 10

I am running BFAST on SOLiD PE data with the default settings. After I got the BAM files I loaded them into IGV and saw indels and lots of SNPs (more than I was getting using BioScope). Nearly every read has one or more indels in it.
I'd appreciate it if someone could help me out with the parameters I should choose to get a better alignment.

Thanks
Old 03-20-2012, 09:31 PM   #5
kenietz
Member
 
Location: Singapore

Join Date: Nov 2011
Posts: 85

Hi guys,
I got the same strange result when working with SOLiD PE data:

----- FLAGSTAT-----

151746110 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
60525771 + 0 mapped (39.89%:-nan%)
151746110 + 0 paired in sequencing
75873055 + 0 read1
75873055 + 0 read2
0 + 0 properly paired (0.00%:-nan%)
0 + 0 with itself and mate mapped
60525771 + 0 singletons (39.89%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

--------------------------------
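
A quick way to confirm from flagstat output like this that no pair has both ends mapped is to compare the "mapped" and "singletons" counts. The following is a rough sketch; the helper names `parse_flagstat`, `count_for` and `no_pair_fully_mapped` are invented for this example, and the parser only assumes the `N + M label` line layout shown above.

```python
import re


def parse_flagstat(text):
    """Map each flagstat label to its QC-passed count."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+) \+ \d+ (.+?)\s*$", line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


def count_for(counts, prefix):
    """Return the count of the first label starting with the given prefix."""
    return next(v for k, v in counts.items() if k.startswith(prefix))


def no_pair_fully_mapped(counts):
    """True if every mapped read is a singleton (its mate is unmapped)."""
    return (
        count_for(counts, "with itself and mate mapped") == 0
        and count_for(counts, "mapped") == count_for(counts, "singletons")
    )


# Lines copied from the flagstat output above.
sample = """\
151746110 + 0 in total (QC-passed reads + QC-failed reads)
60525771 + 0 mapped (39.89%:-nan%)
0 + 0 with itself and mate mapped
60525771 + 0 singletons (39.89%:-nan%)
"""
counts = parse_flagstat(sample)
print(no_pair_fully_mapped(counts))
```

If this reports that every mapped read is a singleton, there are no pairs with both mates placed, so there are no mate-to-mate distances to measure.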

But maybe it is connected to this warning I got from 'bfast postprocess':
bfast postprocess -f hg19.fa -i reads.baf -A 1 -O 1 -n 4 -Y 0 -a 3 -z > reads.sam

------ warning ------

Estimating paired end distance...
Found only 0 distances to infer the insert size distribution
************************************************************
In function "GetPEDBins": Warning[OutOfRange]. Variable/Value: b->numDistances.
Message: Not enough distances to infer insert size distribution.
***** Warning *****
************************************************************
Reads processed: 100000
************************************************************
************************************************************

------------------------

What does that warning actually mean?

Thank you for your help!
Tags
bfast, paired-end, solid

All times are GMT -8.