I downloaded the GTF (gene model annotation) file from:
ftp://ftp.ensembl.org/pub/current/gt...Ch37.59.gtf.gz
I run TopHat with the -G option. The Bowtie index is H. sapiens, NCBI v37, downloaded from ftp://ftp.cbcb.umd.edu/pub/data/bowt...7_asm.ebwt.zip
The chromosome names in the gene model annotations must match the names in the Bowtie index, so I use this sed script to convert the chromosome names in the GTF to the names used in the Bowtie index. Is it right? Thanks.
Code:
s/^1\t/gi|224589800|ref|NC_000001.10|\t/g;
s/^10\t/gi|224589801|ref|NC_000010.10|\t/g;
s/^11\t/gi|224589802|ref|NC_000011.9|\t/g;
s/^12\t/gi|224589803|ref|NC_000012.11|\t/g;
s/^13\t/gi|224589804|ref|NC_000013.10|\t/g;
s/^14\t/gi|224589805|ref|NC_000014.8|\t/g;
s/^15\t/gi|224589806|ref|NC_000015.9|\t/g;
s/^16\t/gi|224589807|ref|NC_000016.9|\t/g;
s/^17\t/gi|224589808|ref|NC_000017.10|\t/g;
s/^18\t/gi|224589809|ref|NC_000018.9|\t/g;
s/^19\t/gi|224589810|ref|NC_000019.9|\t/g;
s/^20\t/gi|224589812|ref|NC_000020.10|\t/g;
s/^21\t/gi|224589813|ref|NC_000021.8|\t/g;
s/^22\t/gi|224589814|ref|NC_000022.10|\t/g;
s/^2\t/gi|224589811|ref|NC_000002.11|\t/g;
s/^3\t/gi|224589815|ref|NC_000003.11|\t/g;
s/^4\t/gi|224589816|ref|NC_000004.11|\t/g;
s/^5\t/gi|224589817|ref|NC_000005.9|\t/g;
s/^6\t/gi|224589818|ref|NC_000006.11|\t/g;
s/^7\t/gi|224589819|ref|NC_000007.13|\t/g;
s/^8\t/gi|224589820|ref|NC_000008.10|\t/g;
s/^9\t/gi|224589821|ref|NC_000009.11|\t/g;
s/^X\t/gi|224589822|ref|NC_000023.10|\t/g;
s/^Y\t/gi|224589823|ref|NC_000024.9|\t/g;
s/^MT\t/gi|17981852|ref|NC_001807.4|\t/g;
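One way to check a conversion like this is to list every sequence name that appears in column 1 of the converted GTF but not among the reference names in the index. The following is only a sketch under assumptions: the function name `missing_seqnames` and both file names are hypothetical, and the names file is assumed to hold one reference name per line (it could probably be produced with `bowtie-inspect -n` on the index basename, though the exact output format should be checked against your Bowtie version).

```shell
# Print every seqname (GTF column 1) that is absent from a list of
# reference names. Empty output means every name in the GTF has a match.
missing_seqnames() {
    gtf="$1"      # converted GTF file (hypothetical path)
    names="$2"    # reference names, one per line (hypothetical path)
    # Drop comment lines, take the unique values of column 1, then keep
    # only those that do not match a whole line of the names file.
    # -F treats names as fixed strings, so the "|" characters are literal.
    grep -v '^#' "$gtf" | cut -f1 | sort -u | grep -v -x -F -f "$names"
    return 0
}
```

For example, `missing_seqnames converted.gtf index_names.txt` would print any leftover names, such as unplaced contigs or patches that the sed script does not rename; whether those matter depends on what the index actually contains.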