Hi, I want to create a file using Biopython, plain Python, or Java. For each range in a BED file, the corresponding ATGC sequence needs to be retrieved from a FASTA file, and all the results should be written to this output file.
For example, the BED file contains lines like this:
chr1 14362 14829 chr1:14363-14829:WASH5P 467 +
chr1 14969 15038 chr1:14970-15038:WASH5P 69 +
chr1 15795 15947 chr1:15796-15947:WASH5P 152 +
chr2 14362 14829 chr2:14363-14829:WASHP 467 +
chr3 14969 15038 chr3:14970-15038:WASH 69 +
chr10 15795 15947 chr10:15796-15947:WASHOP 152 +
..........................................................................................
........................................................................................
and the FASTA file contains records like this:
>chr1 dna:chromosome chromosome:GRCh37:1:1:249250621:1
NNNNNGCCAAGTnggggctaaNNNNGGGGCCCCCCCCCCCCCcCCC
>chr2 dna:chromosome chromosome:GRCh37:1:1:249250621:1
NNNNNGCCAAGNNNNGCCAAGT
nggggctaaNNNNGCCAAGT
nggggctaaNNNNGCCAAGT
nggggctaa
>chr3 dna:chromosome chromosome:GRCh37:1:1:249250621:1
AGTACNNNNGCCAAGT
nggggctaaNNNNGCCAAGT
nggggctaa
................................................
...................................
Now I want the result file to look like the following:
chr1 14362 14829 chr1:14363-14829:WASH5P 467 +
ATTTGCCCCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
ATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGGGGGGGGGGGGGGGGGGGGG
chr1 14969 15038 chr1:14970-15038:WASH5P 69 +
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATTTTTTTTTT
AAAAAAAAAAAATTTTTTTTTTTTTTTTTTTTTt
This means that for each range I have to retrieve the corresponding FASTA sequence and write it to the file.
I did it in Java, but it takes a very long time. Maybe I am making a mistake somewhere. Please help me find a library function so that I can run the program easily.
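To make the request concrete, here is a minimal sketch in plain Python of what I am trying to do. It is not my Java code. The function names `read_fasta` and `extract_bed_ranges` and the `width` parameter are made up for illustration. It assumes the BED start is 0-based and the end is exclusive (which matches names like chr1:14363-14829 for the range 14362 to 14829), that the strand column can be ignored, and that the whole FASTA file fits in memory.

```python
import io


def read_fasta(handle):
    """Read FASTA text into a dict: first word of each header -> full sequence."""
    sequences = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                sequences[name] = "".join(chunks)
            name = line[1:].split()[0]  # ">chr1 dna:chromosome ..." -> "chr1"
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def extract_bed_ranges(bed_handle, sequences, out_handle, width=60):
    """Write each BED line followed by its sequence, wrapped at `width` letters.

    Assumption: BED start is 0-based and end is exclusive, so the range
    maps directly onto a Python slice. Strand is ignored here.
    """
    for line in bed_handle:
        line = line.rstrip("\n")
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom not in sequences:
            continue  # chromosome missing from the FASTA file
        seq = sequences[chrom][start:end]
        out_handle.write(line + "\n")
        for i in range(0, len(seq), width):
            out_handle.write(seq[i:i + width] + "\n")


if __name__ == "__main__":
    # Toy data only, not real genome sequence.
    fasta = io.StringIO(">chr1 toy record\nAACCGGTT\nNNNNACGT\n")
    bed = io.StringIO("chr1 2 10 chr1:3-10:TOY 8 +\n")
    out = io.StringIO()
    extract_bed_ranges(bed, read_fasta(fasta), out, width=4)
    print(out.getvalue(), end="")
```

With real files, the `io.StringIO` objects would be replaced by `open(...)` handles. Each chromosome is joined into one string only once, so every BED line becomes a single slice instead of another pass over the FASTA file.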