znasim09 08-10-2016 03:12 AM

How to find the transgene integration position in the genome?
I was wondering how I can find where my gene of interest is integrated in the Arabidopsis genome. I have the genome sequence data for the transgenic line expressing my gene of interest.
Any suggestions? :):)

tristan dubos 08-10-2016 04:06 AM

You could try assembling your mutant's genome and then mapping the reference sequence of your gene against the assembly you created (this could be complicated :) ). I know there are some assemblers which start from a reference (so you could use the sequences of your genes of interest).
Another solution is to map your sequencing reads onto the reference sequences of your genes of interest, look at the read sequences that extend past the borders of those references, and then map these border sequences against the reference genome to find the location.
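The second approach can be sketched in Python. This is only a minimal sketch, assuming the reads were already mapped to the transgene sequence by an aligner that soft-clips the unaligned read ends (CIGAR operation `S`) and writes SAM text. The function name `clipped_ends` and the `min_clip` threshold are made up for illustration. The clipped sequences it returns are the candidate genome-side borders, which you would then map against the Arabidopsis reference.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def clipped_ends(sam_line, min_clip=20):
    """Return soft-clipped read ends of at least min_clip bases.

    Each result is (side, ref_pos, clipped_seq), where ref_pos is the
    1-based transgene position next to the clip.
    """
    if sam_line.startswith("@"):          # SAM header line
        return []
    fields = sam_line.rstrip("\n").split("\t")
    flag, pos, cigar, seq = int(fields[1]), int(fields[3]), fields[5], fields[9]
    if flag & 4 or cigar == "*":          # unmapped read
        return []
    # Hard clips (H) are not present in SEQ, so drop them first.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    if len(ops) < 2:
        return []
    found = []
    first_len, first_op = ops[0]
    last_len, last_op = ops[-1]
    if first_op == "S" and first_len >= min_clip:
        found.append(("left", pos, seq[:first_len]))
    if last_op == "S" and last_len >= min_clip:
        # Reference bases consumed by the aligned part of the read.
        ref_len = sum(n for n, op in ops if op in "MDN=X")
        found.append(("right", pos + ref_len - 1, seq[-last_len:]))
    return found
```

Running this over every line of the SAM file and writing the clipped sequences to a FASTA file gives a small set of border sequences to align to the genome. Several clips stacking up at the same transgene position are a hint that the position is a real junction.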

znasim09 08-10-2016 04:21 PM

Thanks Tristan for your reply. But as I am not a bioinformatician, could you be a bit simpler? I didn't understand much :(

tristan dubos 08-10-2016 11:09 PM

I'm sorry, but I don't really know a simple solution for that; I mean, there is no software I know of for this specific question. The solution has to be a combination of software, which is a bit complicated if you are not a bioinformatician. Maybe someone else has a simpler solution ...
I can list all the software you could use, but it will take a lot of time to install and configure if you are not familiar with bioinformatics ....

Which kind of sequencing did you do?

znasim09 08-11-2016 05:10 PM

Thanks for your time. I said I am not a bioinformatician, but I am not that bad at installing and running programs. Can you mention the programs/steps that I need? I am pretty sure that I can do it.
And we did whole genome sequencing (Illumina, paired-end).

tristan dubos 08-17-2016 11:04 PM

You can do it with bwa (it's an aligner), which will map your Illumina reads against your references. You need to modify your gene reference sequences by adding N before and after (N is treated as any nucleotide), and in bwa you need to set a parameter to permit mismatches on these N. You have to take care with the minimum number of matching bases you require at the border of your sequences, to be sure of the identity of your sequences. That is how you decide the number of N, given the length of your sequencing reads.
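As a rough illustration of the padding step, here is a small Python sketch. The rule `read_len - min_anchor` for the number of N per side is my reading of the paragraph above (a read may hang over the end of the gene as long as `min_anchor` bases still sit on it), and the function names are hypothetical. How the aligner scores N in the reference depends on the aligner and its settings, so check that in its manual.

```python
def flank_length(read_len, min_anchor):
    """Number of N to add on each side of the gene sequence.

    A read may overhang the gene end by up to read_len - min_anchor
    bases while still keeping min_anchor bases on the gene itself.
    """
    if not 0 < min_anchor < read_len:
        raise ValueError("min_anchor must be between 1 and read_len - 1")
    return read_len - min_anchor

def pad_with_n(seq, flank):
    """Return the gene sequence with `flank` N on both sides."""
    return "N" * flank + seq + "N" * flank

def write_fasta(name, seq, path, width=60):
    """Write a single-record FASTA file, wrapping lines at `width`."""
    with open(path, "w") as handle:
        handle.write(">" + name + "\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")
```

For example, with 100 bp reads and at least 30 matching bases required on the gene, `flank_length(100, 30)` gives 70 N per side. The padded FASTA file is then what you would index and map against.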
Then you look at the sequences that mapped onto the N. I don't really know a tool to extract these sequences; it is more a matter of bioinformatics code ...
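The extraction itself can be done with a few lines of Python on the SAM output. This is a sketch under assumptions: the reference is a gene of length `insert_len` padded with `flank` N on each side, the alignments are in SAM text format, and `spans_junction` is a made-up helper name. It keeps reads whose aligned interval covers both gene bases and N bases.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def ref_span(pos, cigar):
    """Return the 1-based inclusive (start, end) covered on the reference."""
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")
    return pos, pos + ref_len - 1

def spans_junction(sam_line, flank, insert_len):
    """True if the read covers both N flank and gene bases.

    The gene occupies positions flank+1 .. flank+insert_len of the
    padded reference; everything outside that range is N.
    """
    if sam_line.startswith("@"):          # SAM header line
        return False
    fields = sam_line.rstrip("\n").split("\t")
    flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
    if flag & 4 or cigar == "*":          # unmapped read
        return False
    start, end = ref_span(pos, cigar)
    gene_start, gene_end = flank + 1, flank + insert_len
    crosses_left = start < gene_start <= end
    crosses_right = start <= gene_end < end
    return crosses_left or crosses_right
```

The part of each selected read that lies over the N is the sequence next to the insertion, so those bases are what you would map back to the genome.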
I will look into how to make a consensus for this part; it may be an easier way.


