I'm currently working on the assembly of a diploid eukaryotic genome using a combination of PacBio Sequel subreads and Illumina NextSeq paired-end reads from a single individual. Prior to this, we had generated some paired-end NextSeq data from a separate individual of the same species for some short-read assembly work, and we also have some RNA-Seq data (again from a different individual) that we were using for other purposes.
My main question is whether I can 'safely' use any of this older data to scaffold my current NextSeq+Sequel assembly, using tools such as Rascaf in the case of the RNA-Seq reads or Multi-CAR in the case of the old NextSeq reads. If so, I'd probably use only one of the two datasets: a second individual is one thing, but three feels like a bit much.
I know that people do this (these tools exist, and I've seen them used before), but I was wondering whether anyone has experience with this that they'd be willing to share. My main concern is that factors such as transposable elements, individual-specific inversions/insertions/deletions, etc., could result in erroneous scaffolding.