Hi all,
First let me briefly describe what I am trying to do. I have Illumina reads (ezRAD and Illumina MiSeq), from which I want to identify putative neutral and adaptive SNPs. Here's what I have done so far:
- trimmed the reads for adaptors and quality with FASTQC toolkit
- imported the reads into Galaxy
- groomed the reads so that they were converted from phred+64 to phred+33
- mapped these reads onto my reference genome (L. gigantea) with BWA-MEM with default parameters
- cleaned the output BAM files with CleanSAM in SAMtools using the strict option
- filtered the BAM files to remove alignments with MAPQ < 15, using Filter SAM tool under SAMtools in Galaxy
- merged my BAM files so that all of my samples were combined, using the Merge BAM tool under SAMtools in Galaxy, with default parameters
- performed an Mpileup on the merged BAM file using SAMtools, where I did not perform genotype likelihood computation, with the same reference genome and basic parameters
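For readers who want to follow the last three steps outside Galaxy, here is a rough sketch of how the MAPQ filter, the merge, and the mpileup might look as plain samtools command lines. It only builds and prints the commands, it does not run them. The file names (`sampleA.bam`, `reference.fa`, `merged.bam`, `merged.pileup`) and the helper name `build_samtools_commands` are hypothetical, and the Galaxy wrappers may pass different or additional options than the ones shown.

```python
import os
import shlex


def build_samtools_commands(sample_bams, reference_fasta, min_mapq=15):
    """Return samtools command lines (as argument lists) for filter, merge, mpileup.

    All output file names are hypothetical placeholders chosen for this sketch.
    """
    commands = []
    filtered_bams = []
    for bam in sample_bams:
        stem, _ext = os.path.splitext(bam)
        filtered = f"{stem}.q{min_mapq}.bam"
        # samtools view -q N drops alignments with MAPQ below N; -b writes BAM.
        commands.append(
            ["samtools", "view", "-b", "-q", str(min_mapq), "-o", filtered, bam]
        )
        filtered_bams.append(filtered)
    # Combine all filtered per-sample BAM files into a single BAM.
    commands.append(["samtools", "merge", "merged.bam", *filtered_bams])
    # Plain pileup text output against the same reference used for mapping.
    commands.append(
        ["samtools", "mpileup", "-f", reference_fasta, "-o", "merged.pileup", "merged.bam"]
    )
    return commands


for cmd in build_samtools_commands(["sampleA.bam", "sampleB.bam"], "reference.fa"):
    print(shlex.join(cmd))
```

Note that merging before the pileup step means the pileup describes the pooled reads rather than each sample separately.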
After performing the Mpileup, I got a pileup output file that looks like this (only the first two lines are shown):
1 2 3 4 5 6 7 8
LOTGIsca_18122 322 T 0 185
LOTGIsca_18122 323 A 0 186
Now I am very confused, as I thought pileup output files had either 5 or 10 columns, and mine has 8. I also tried to filter the pileup output file with VarScan, using default parameters, and got an output file with 0 lines.
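One way to check the column count independently of how a viewer displays the file is to count fields per line directly. The sketch below is only a diagnostic aid, not part of the workflow above: the function name `count_fields` and the example lines are made up for illustration. It shows that splitting on tabs keeps empty fields, while splitting on any whitespace collapses them, so the same line can appear to have a different number of columns depending on how it is read.

```python
from collections import Counter


def count_fields(lines, tab_separated=True):
    """Tally how many fields each non-empty line contains.

    With tab_separated=True, empty fields between consecutive tabs are counted.
    With tab_separated=False, runs of whitespace are treated as one separator.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t") if tab_separated else line.split()
        counts[len(fields)] += 1
    return counts


# Made-up lines for illustration only; the empty fields are an assumption.
example = [
    "scaffold_X\t322\tT\t0\t\t\t\t185\n",
    "scaffold_X\t323\tA\t0\t\t\t\t186\n",
]
print(count_fields(example, tab_separated=True))
print(count_fields(example, tab_separated=False))
```

To run it on a real file, pass an open file handle instead of the list, for example `count_fields(open("merged.pileup"))`, where `merged.pileup` is a placeholder name.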
Any input on what I have done, or on what I should do instead, would be greatly appreciated.
Thanks!
Erica