Hello everybody,
I've received an "old project" for analysis in which I am supposed to filter out a background signal from the sequencing data. The background is thought to come from contamination during the pre-sequencing stages (a technically complex isolation procedure, as far as I was told). The DNA to be analyzed comes from some kind of vacuolar vesicles that are expected to carry DNA and RNA. Apart from the vesicle DNA, which was supposed to be sequenced exclusively, the isolation procedure also yielded nuclear DNA (apparently from a male, since Y chromosome sequences also appear in the female sample).
After mapping the FASTQ reads to the reference genome, I displayed them in SeqMonk, and the problem became visually obvious. A background signal of mapped reads covers the whole reference genome, although there are some places where the signal increases (I'd like to think this is due to the DNA from the vesicles, but it seems it could just be a sequencing artifact).
My intention was to filter out the background generated by the nuclear DNA contaminant somehow, and then display the regions of interest (the vesicle DNA).
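To make the idea concrete, here is a minimal sketch of the kind of filtering I have in mind. It is not a validated method. It assumes the mapped reads have already been counted in fixed-width bins along the genome, and that most bins carry only the contaminant signal, so the median bin count can stand in for the background. The function name `enriched_bins`, the fold threshold of 3, and the example counts are all made up for illustration.

```python
from statistics import median


def enriched_bins(counts, min_fold=3.0, pseudocount=1.0):
    """Return indices of bins whose read count exceeds the background.

    counts      -- read counts per fixed-width genomic bin (hypothetical input)
    min_fold    -- minimum fold change over background to keep a bin (assumed value)
    pseudocount -- added to both terms so empty bins do not divide by zero
    """
    # Assumption: the contaminant covers the genome roughly evenly,
    # so the median bin count approximates the background level.
    background = median(counts)
    hits = []
    for index, count in enumerate(counts):
        fold = (count + pseudocount) / (background + pseudocount)
        if fold >= min_fold:
            hits.append(index)
    return hits


# Invented example: two bins stand out against a flat background of about 10 reads.
example_counts = [10, 12, 9, 11, 80, 10, 13, 95, 11, 10]
print(enriched_bins(example_counts))
```

A real analysis would probably need a proper statistical model and, ideally, a control sample of the contaminant alone, which I don't have.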
The question is: do you think this filtering is possible, or even worthwhile, or should I tell the researchers that there's no chance to "save" their sequence data for an accurate analysis?
If any of you has an idea of how to filter that "contamination" out of the sequence data, please enlighten me.
With best wishes
JL