Hi, I have downloaded a dataset of approximately 16 Gb of reads, each 17 bases long, from the SRA, and I'd like to find the reads that come from a specific gene, i.e. from its exons.
My current strategy is to format a local BLAST database from the reads and search it with the exon sequences. Is that a reasonable approach? I would have to do this locally, because the SRA BLAST database doesn't contain the dataset in question.
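To make the idea concrete, here is a minimal sketch of the kind of search I have in mind. Note that it is not BLAST: it swaps in plain exact 17-mer matching in Python, which only finds reads that match an exon perfectly on either strand. The function names (`build_exon_index`, `matching_reads`) and the read length constant are my own placeholders for illustration, not part of any existing tool.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

READ_LENGTH = 17  # read length of the dataset described above

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_exon_index(exons: Dict[str, str], k: int = READ_LENGTH) -> Dict[str, Set[str]]:
    """Map every k-mer of every exon (both strands) to the exon names containing it."""
    index: Dict[str, Set[str]] = {}
    for name, seq in exons.items():
        seq = seq.upper()
        for strand in (seq, reverse_complement(seq)):
            for i in range(len(strand) - k + 1):
                index.setdefault(strand[i:i + k], set()).add(name)
    return index


def matching_reads(
    reads: Iterable[str], index: Dict[str, Set[str]]
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (read, exon names) for each read that exactly matches an indexed k-mer."""
    for read in reads:
        hits = index.get(read.upper())
        if hits:
            yield read, sorted(hits)
```

Since the index only depends on the exon sequences, it stays small, and the reads could be streamed one at a time from the FASTQ/FASTA file, so memory should not be an issue. The obvious limitation is that reads with a mismatch, or reads spanning an exon-exon junction, would be missed, which is where BLAST or a short-read aligner would presumably do better.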
I also thought about assembling the transcriptome so that I would have sequences covering all the exons in the data. From what I have seen, though, this is not easy, and may even be impossible. I also don't have access to a computer with more than 32 GB of RAM.
Basically, I just want to extract the short reads that match the exons in question to help with resequencing the gene. I also already know that this gene is expressed at a very low level in the tissue the SRA data was sequenced from.