Hi
I have a tumor/normal pair of exome datasets (Illumina, paired-end) and would like to call somatic variants such as point mutations, indels, etc.
However, read 2 of the dataset is of poor quality, which leads to:
1. A mapping rate for read 2 that is only half that of read 1.
2. More artifactual interchromosomal read pairs.
What bothers us most is the dramatic reduction in "effective coverage" (leaving it insufficient), since many variant callers seem to selectively ignore some read pairs.
We've tried several programs and, judging from the results, our guess is that:
1. CASAVA considers only good read pairs.
2. VarScan considers only good read pairs.
3. GATK considers both good read pairs and singletons, but not interchromosomal read pairs.
So far, GATK seems to be the program that makes the best use of all the reads.
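To make the read categories above concrete, here is a minimal sketch of how they could be tallied from the FLAG, RNAME and RNEXT fields of a SAM file. It only illustrates the categories; it is not what any of the callers do internally, and the category names and the functions `classify` and `tally` are made up for this example.

```python
from collections import Counter

# SAM FLAG bits, as defined in the SAM specification
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify(sam_line):
    """Assign one SAM alignment line to a (hypothetical) read category."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, rnext = fields[2], fields[6]

    if flag & (SECONDARY | SUPPLEMENTARY):
        return "secondary_or_supplementary"
    if flag & UNMAPPED:
        return "unmapped"
    if not flag & PAIRED:
        return "single_end"
    if flag & MATE_UNMAPPED:
        # This read mapped but its mate did not
        return "singleton"
    # RNEXT is "=" when the mate is on the same reference, "*" when unknown
    if rnext not in ("=", rname, "*"):
        return "interchromosomal_pair"
    if flag & PROPER_PAIR:
        return "proper_pair"
    return "improper_pair_same_chromosome"


def tally(sam_lines):
    """Count reads per category, skipping SAM header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        counts[classify(line)] += 1
    return counts
```

Feeding it the lines of an uncompressed SAM file (for example `tally(open(path))`, with `path` pointing at your own file) gives a rough idea of how much coverage sits in each category before choosing a caller.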
Does anyone have advice on how we could further rescue the coverage, or what tools/methods (e.g. treating the data as single-end reads?) we could try with this dataset?
Any advice would be greatly appreciated.