Hi everyone,
I hope my question is not a total repeat of previous ones, but I did have a read through some previous threads and couldn't find a clear answer.
I have some RNA-seq data of a non-model species (fish), from a single sample that was sent for ~5GB sequencing and bioinformatics analysis at BGI.
The data was mapped and assembled into a transcriptome using SOAPdenovo (a very limited one, I know, because of the lack of replicates and coverage, but it was a preliminary trial to get a basic idea of what we're looking at).
That was roughly a year ago. Since then, a draft genome of a closely related species has been published (contigs and scaffolds, not annotated).
My question is: how can I use this reference genome to improve my transcriptome assembly, or to re-assemble it using that genome as a reference and compare the results with the de novo assembly I already have?
I would also like to use the genome as a reference to detect splice junctions and focus on alternative splicing of specific genes.
I have tried mapping the reads to the published reference genome using STAR, but I'm not sure how to analyse the resulting BAM files or how to use them downstream for assembly or for comparison with the de novo assembly that I already have.
Lastly, I would like to re-annotate the new assembly that I get. What would be the best tool for that?
I would prefer to use open-source programs, and I have access to a powerful Linux server (32 CPUs, 1 TB RAM).
I am pretty new to NGS bioinformatics, but I'm eager to learn and would appreciate your feedback.
Thanks, Ido