I want to convert BAM to BigWig for the Ensembl reference genome (Homo_sapiens.GRCh38.81.fa, to be specific) using the following workflow.
Workflow:
# 1. Convert SAM to BAM
samtools view -S -b -o sample.bam sample.sam
# 2. Sort the BAM file
samtools sort sample.bam sample.sorted
# 3. Create BedGraph coverage file
genomeCoverageBed -bg -ibam sample.sorted.bam -g chromsizes.txt > sample.bedgraph
# 4. Convert the BedGraph file to BigWig
bedGraphToBigWig sample.bedgraph chromsizes.txt sample.bw
However, I am not sure how to get chromsizes.txt.
One suggestion, from a Biostars thread (https://www.biostars.org/p/97890/), is to run samtools faidx on the FASTA file and look at the resulting .fai file.
I used this command:
samtools faidx Homo_sapiens.GRCh38.81.fa
But this doesn't seem to work for me: I got a .fai file with the following output, and it doesn't make any sense to me.
0 1657 58 60 61
1 632 1837 60 61
2 1351 2634 60 61
3 68 4042 60 61
4 712 4170 60 61
5 535 4940 60 61
6 138 5518 60 61
7 1187 5717 60 61
8 590 6970 60 61
9 840 7604 60 61
10 940 8493 60 61
11 918 9484 60 61
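For reference, here is a minimal sketch of what I understand the Biostars suggestion to amount to. It assumes the .fai file is tab-separated, with the sequence name in column 1 and the sequence length in column 2 (the standard faidx layout). The file name example.fa.fai is hypothetical; it only recreates the first three lines of my output so the snippet runs on its own.

```shell
# Hypothetical demo input: the first three .fai lines from my output,
# written tab-separated (assumption: the real index uses tabs).
printf '0\t1657\t58\t60\t61\n1\t632\t1837\t60\t61\n2\t1351\t2634\t60\t61\n' > example.fa.fai

# .fai columns: name, length, byte offset, bases per line, bytes per line.
# Keeping the first two gives a "name<TAB>length" chrom sizes file.
cut -f1,2 example.fa.fai > chromsizes.txt

cat chromsizes.txt
```

On the real index, the same cut would read Homo_sapiens.GRCh38.81.fa.fai instead of example.fa.fai. Read this way, my output says the FASTA contains sequences named 0, 1, 2, ... that are only a few hundred to a couple of thousand bases long, which is not what I would expect for chromosomes.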