Hi everyone.
Here, I am trying to get Bowtie 2 to open a single gap in a few reads.
To test this, I use 1 × 10⁶ reads from Illumina sequencing.
I modified my reference genome by inserting 2 exogenous sequences that simulate fake gaps (1 kb and 2 kb).
As a first step, I validated with TopHat2 that my reads can span both sides of a fake gap. To make this work with that tool, I added the common splice-site dinucleotides to both ends of each fake gap to simulate an exon/exon junction.
My reference genome now looks like this:
Code:
<reference_genom_seq>'GT'<fake_gap>'AG'<reference_genom_seq>
And it works great. So I know that a few reads from my data set can span the fake gap (one gap per read).
As a second step, I tried to get the same result by tuning the bowtie2 parameters (i.e. the scoring options).
Here are the command lines I used for this experiment:
Code:
bowtie2 -p 4 --no-unal --ignore-quals --mp 60000 --rdg 1,1 --score-min L,-100000,-100000 -x my_ref_genom my_fastq -S my_aln.sam
--ignore-quals (because I don't care about my Phred quality values here)
--mp 60000 (I don't want mismatches, since TopHat2 aligned my reads without mismatches)
--rdg 1,1 (I don't know why, but I can't set either of these 2 values below 1. In a dream world I would set <int2>=0 in --rdg <int1>,<int2> and find a good value for <int1> to get just one gap per read)
--score-min L,-1000000,-1000000 (an extreme threshold, to give a gap a better chance of being opened)
result -> no gap opened + clean alignment (no mismatch)
Code:
bowtie2 -p 4 --no-unal --rdg 1,1 -x my_ref_genom my_fastq -S my_aln.sam
result -> no gap opened + mismatches
Code:
bowtie2 -p 4 --no-unal --ignore-quals --gbar 25 --rdg 1,1 -x my_ref_genom my_fastq -S my_aln.sam
--gbar 25 (so that a significant number of read bases overlap each side of the fake gap, and to work around an issue with gap penalties in the first seeding step (seed = 22))
result -> no gap opened + mismatches
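For reference, this is roughly how the "no gap opened" results can be checked: a minimal Python sketch that scans the CIGAR column of the SAM output for D (deletion from the reference) or N (skipped region) operations. The helper names and the two demo records are made up for illustration; on real output the lines would come from my_aln.sam.

```python
import re
from collections import Counter

# One CIGAR element is a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def gap_ops(cigar):
    """Return (length, op) pairs for the D and N operations in a CIGAR string."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op in "DN"]


def summarize_sam(lines, min_gap=1):
    """Count aligned reads and reads carrying a D/N gap of at least min_gap bases."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        flag, cigar = int(fields[1]), fields[5]
        if flag & 4 or cigar == "*":      # 0x4 = read unmapped
            counts["unaligned"] += 1
            continue
        counts["aligned"] += 1
        if any(n >= min_gap for n, _ in gap_ops(cigar)):
            counts["with_gap"] += 1
    return counts


# Two hand-written records (not real data): one spanning a 1 kb gap, one ungapped.
SEQ, QUAL = "A" * 100, "I" * 100
demo = [
    "@HD\tVN:1.0\tSO:unsorted",
    f"r1\t0\tchr1\t100\t42\t40M1000D60M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"r2\t0\tchr1\t500\t42\t100M\t*\t0\t0\t{SEQ}\t{QUAL}",
]
demo_counts = summarize_sam(demo, min_gap=1000)
print(dict(demo_counts))
```

On a real run, something like `with open("my_aln.sam") as fh: print(summarize_sam(fh, min_gap=1000))` would replace the demo list.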
So, I have 3 questions:
1_ According to the manual, the gap penalty is calculated as <int1> + N * <int2>. With my 1 kb gap (N=1000) and --rdg 1,1, the gap penalty should be about 1000 (1 + 1000 * 1 = 1001).
With the threshold set at -1000000 (my read length is 100, which is the x in the manual's example threshold function f(x) = 0 + -0.6 * x), why is no gap opened in the reads overlapping the fake gap?
Which parameters can I set to open a gap in a few reads from my data set?
2_ If TopHat can deal with my fake gap, can I set up TopHat to handle gaps other than exon/exon junctions?
3_ Do you know of any other tool that can deal with my problem?
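To make the arithmetic behind question 1 concrete, here is a small Python sketch of the two formulas as I read them in the manual. The function names are my own, and I assume an otherwise perfect end-to-end alignment (best possible score 0) whose only penalty is the gap. It only checks the score against the threshold; it says nothing about whether the aligner's search ever considers such a long gap.

```python
def read_gap_penalty(n, open_pen, extend_pen):
    """Penalty for one read gap of length n under --rdg <int1>,<int2>: <int1> + N * <int2>."""
    return open_pen + n * extend_pen


def min_score_linear(read_len, const, coef):
    """Linear --score-min function L,<const>,<coef>: f(x) = const + coef * x."""
    return const + coef * read_len


READ_LEN = 100    # read length from the post
GAP_LEN = 1000    # the 1 kb fake gap

penalty = read_gap_penalty(GAP_LEN, 1, 1)                   # --rdg 1,1
threshold = min_score_linear(READ_LEN, -1000000, -1000000)  # --score-min L,-1000000,-1000000
score = 0 - penalty   # assumption: no mismatches, so the gap is the only penalty
passes = score >= threshold

# The manual's own example, L,0,-0.6, for comparison.
manual_example = min_score_linear(READ_LEN, 0, -0.6)

print("gap penalty:", penalty)
print("minimum score:", threshold)
print("alignment score:", score)
print("passes threshold:", passes)
```

So on paper the gapped alignment is far above the minimum score, which is why I don't understand the result.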
Thanks a lot for reading (sorry for the long post), and thanks in advance for any reply.
Rémi