I would like to obtain short-read Illumina HiSeq 2000 transcriptomic data (RNA-Seq) for individuals representing closely related diploid and polyploid plant species. Our aim is to identify and comparatively analyze the putative orthologues, paralogues and homeologues involved in parallel allopolyploidization events. We also want to develop new phylogenetic markers for the genus from the transcriptomic data.
I am new to transcriptomics (this is my first post) and I want to make sure that we are doing the most appropriate sample prep for our study system and questions. Apologies if my questions are naive!
Our species of interest are diploid/polyploid plants with large-ish genomes and NO reference genome currently available. Based on forum posts and discussions with colleagues, I had thought 2x100 paired-end reads would be the best approach. However, one sequencing facility ("facility A") that I asked for a quote advised me to do 1x100 single-end reads instead. They said the fragmentation step in their cDNA library prep produces average fragment sizes of 125-150 bp, so paired-end reads would not provide any additional information and would not be worth the extra cost. Another sequencing facility ("facility B") that sent me a quote said they would generate fragments ranging from 125-450 bp for paired-end sequencing.
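To make sure I understand the arithmetic behind facility A's argument, here is a rough sketch of how much of a fragment 2x100 reads would actually cover at different fragment sizes. The function name `paired_end_coverage` is just something I made up for illustration, and the model is simplified: it assumes every fragment is exactly the stated length and ignores adapter read-through and quality trimming.

```python
def paired_end_coverage(fragment_len, read_len=100):
    """Return (bases covered, mate overlap, unsequenced inner gap) for one fragment.

    Simplified model: both mates are read inward from opposite ends of the
    fragment, and a read cannot be longer than the fragment itself.
    """
    effective_read = min(read_len, fragment_len)
    overlap = max(0, 2 * effective_read - fragment_len)   # bases sequenced twice
    gap = max(0, fragment_len - 2 * effective_read)       # bases between the mates
    covered = min(fragment_len, 2 * effective_read)       # distinct bases sequenced
    return covered, overlap, gap


# Fragment sizes taken from the two quotes (125-150 bp vs. up to 450 bp).
for frag in (125, 150, 300, 450):
    covered, overlap, gap = paired_end_coverage(frag)
    single_end = min(100, frag)
    print(f"{frag} bp fragment: single-end covers {single_end} bp, "
          f"paired-end covers {covered} bp (overlap {overlap} bp, gap {gap} bp)")
```

If I have this right, with 125-150 bp fragments the second read mostly re-sequences bases the first read already covered (50-75 bp of overlap), whereas with longer fragments the two mates do not overlap and span a longer stretch of the transcript.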
So, which is the best approach? My two main questions at this stage are: 1) should we do single-end 1x100 or paired-end 2x100 reads, and 2) what is the ideal fragment size (or fragment size range) we should be using? In theory, wouldn't it provide more information (and make the downstream bioinformatics easier) to generate paired-end reads from a larger range of fragment sizes (i.e. facility B's quote), rather than single-end reads from a smaller range of fragment sizes (i.e. facility A's advice)? Or are there tradeoffs with respect to using larger fragment sizes or a larger RANGE of fragment sizes?
I look forward to any suggestions people may have. Thanks in advance.