SEQanswers

Old 01-26-2017, 09:56 AM   #1
clarissaboschi
Member
 
Location: US

Join Date: Apr 2010
Posts: 62
CrossMap analysis

I am trying to convert the genome coordinates of SNPs in a VCF file using the CrossMap program (http://crossmap.sourceforge.net/).

I am having difficulties with the conversion. My VCF file does not contain all of the chromosomes present in the chain file (for example, the "random" chromosomes), so the conversion is not performed.

Do I need to remove all chromosomes not present in my vcf file from the chain file? (and maybe from the fasta file as well?)

My error message was:
KeyError: "sequence 'chr10_NT_461738v1_random' not present"

This sequence is present only in the chain file, but I am not sure if I should edit the chain file.

Also, does the chromosome naming format need to be the same in all 3 files (input, chain, and FASTA): Chr1, chr1, or 1?

thanks
Old 01-30-2017, 06:16 AM   #2
liguow
Member
 
Location: Houston

Join Date: Apr 2009
Posts: 12

The error message "sequence 'chr10_NT_461738v1_random' not present" was not issued by CrossMap itself; it was probably raised by one of its dependencies, such as pysam.

My guess is that "chr10_NT_461738v1_random" is present in your VCF file but absent from your reference FASTA file.
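
A quick way to test this guess is to compare the sequence names used in the two files. The sketch below uses only the Python standard library. The helper names (`fasta_contigs`, `vcf_contigs`, `missing_from_fasta`) and the tiny in-memory inputs are made up for illustration; they are not part of CrossMap or pysam. With real data you would pass open file handles instead.

```python
import io


def fasta_contigs(handle):
    """Return the set of sequence names found on FASTA header lines."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            # The name is the first whitespace-delimited token after '>'.
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def vcf_contigs(handle):
    """Return the set of CHROM values found on VCF data lines."""
    names = set()
    for line in handle:
        # Skip meta lines, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_fasta(vcf_handle, fasta_handle):
    """List names used in the VCF that have no record in the FASTA."""
    return sorted(vcf_contigs(vcf_handle) - fasta_contigs(fasta_handle))


# Toy inputs (hypothetical data, only to show the comparison).
example_fasta = ">chr1 some description\nACGT\n>chr10\nACGT\n"
example_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t2\t.\tC\tT\t.\t.\t.\n"
    "chr10_NT_461738v1_random\t5\t.\tA\tG\t.\t.\t.\n"
)

missing = missing_from_fasta(io.StringIO(example_vcf), io.StringIO(example_fasta))
print(missing)
```

Because the comparison is an exact string match, it also flags naming-style differences such as chr1 versus 1.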
Old 01-30-2017, 07:01 AM   #3
clarissaboschi
Member
 
Location: US

Join Date: Apr 2010
Posts: 62

OK, thanks, I will check it. I tried using the BED file format instead and it worked very well.
Old 01-30-2017, 10:54 AM   #4
liguow
Member
 
Location: Houston

Join Date: Apr 2009
Posts: 12

That the BED conversion worked further supports my hypothesis. BED conversion does not read the reference FASTA, but for VCF input pysam tries to retrieve the reference allele from the FASTA file, and it reported this error when it failed to find 'chr10_NT_461738v1_random'.
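
The failure mode can be mimicked without pysam. In the sketch below, `ToyReference` and `reference_allele` are hypothetical stand-ins written for this illustration, not pysam's or CrossMap's actual code; they only show how a reference-allele lookup keyed by sequence name fails with a KeyError when the name has no record in the reference.

```python
class ToyReference:
    """Hypothetical stand-in for an indexed FASTA file (not pysam's API)."""

    def __init__(self, sequences):
        self._sequences = dict(sequences)

    def fetch(self, name, start, end):
        # Coordinates here are 0-based and half-open.
        if name not in self._sequences:
            raise KeyError(f"sequence '{name}' not present")
        return self._sequences[name][start:end]


def reference_allele(reference, chrom, pos, length=1):
    """Look up the reference allele at a 1-based VCF position."""
    return reference.fetch(chrom, pos - 1, pos - 1 + length)


reference = ToyReference({"chr1": "ACGTACGT"})

# A name that exists in the reference resolves normally.
print(reference_allele(reference, "chr1", 2))

# A name with no record in the reference raises KeyError.
try:
    reference_allele(reference, "chr10_NT_461738v1_random", 5)
except KeyError as err:
    print("lookup failed:", err)
```

A BED-style conversion never performs this lookup, which would be consistent with it succeeding on the same coordinates.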

Tags
coordinates, crossmap, genome
