SEQanswers


clarissaboschi 06-01-2016 05:10 AM

QTL overlap (bedtools)
 
Dear all, I am trying to compare some QTL regions with specific genome regions that I have, and I am using bedtools for this.
I have used bedtools before with a list of genes and it worked well, but this time I get the error message: "***** ERROR: too many digits/characters for integer conversion in string . Exiting...".

My input files are:
QTL file (with about 5,000 regions):
Chr1 241009 241109 GROW
Chr1 241009 241109 PLASCO
Chr1 2095900 3104645 BW
Chr1 2329958 2400142 AF
Chr1 2329958 2400142 BW


and my genomic regions input (with about 700 regions):
Chr1 18340001 18360000
Chr1 18350001 18370000
Chr1 18570001 18590000
Chr1 18960001 18980000
Chr1 18970001 18990000
Chr1 18980001 19000000

I tried different command lines and got the same error every time (sometimes I got output up to chromosome 8, and sometimes no result at all):
Code:

intersectBed -c -a QTLfile.bed  -b regionFile.bed > outfile
I am not sure what the problem is. I checked my files and they seem to be OK. I also tried increasing the memory and splitting the files by chromosome, and I got the same error.

thanks

colindaven 06-02-2016 09:02 AM

Sounds like an error in your input files. Maybe the separators are wrong on one line.

Try reading both files into R / Calc etc. and doing some summary statistics. You'll probably find that there is a space instead of a tab on one line.
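
The same kind of check can also be done directly in the shell. The following is only a sketch: `check_bed` is a hypothetical helper name, the demo lines are made up for illustration, and it assumes a BED-like file whose first three tab-separated fields should be chromosome, integer start, integer end. It prints every line that breaks that assumption, which would catch spaces used instead of tabs, "." in place of a position, or a stray Windows line ending on a three-column file.

```shell
# Hypothetical helper: report lines whose first three fields are not
# "chrom<TAB>integer<TAB>integer".
check_bed() {
    awk -F'\t' '
        NF < 3           { print "line " NR ": fewer than 3 tab-separated fields"; next }
        $2 !~ /^[0-9]+$/ { print "line " NR ": start is not an integer: [" $2 "]"; next }
        $3 !~ /^[0-9]+$/ { print "line " NR ": end is not an integer: [" $3 "]"; next }
    ' "$1"
}

# Made-up demo file: line 2 uses spaces instead of tabs, line 3 has "." positions.
tmp=$(mktemp -d)
printf 'Chr1\t241009\t241109\tGROW\n' >  "$tmp/demo_qtl.bed"
printf 'Chr1 2095900 3104645 BW\n'    >> "$tmp/demo_qtl.bed"
printf 'Chr1\t.\t.\tJUICE\n'          >> "$tmp/demo_qtl.bed"

check_bed "$tmp/demo_qtl.bed"
# line 2: fewer than 3 tab-separated fields
# line 3: start is not an integer: [.]
```

Running `check_bed` on both the QTL file and the region file should point to the offending line numbers, if the cause is of this kind.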

clarissaboschi 06-02-2016 09:54 AM

Thanks Colindaven, yes, I was thinking that the issue is in my QTL file, which I got from a database.
I already found several issues in this file, such as start positions larger than end positions and lines with no positions, but I will do more checks! :)

pengchy 06-30-2016 02:10 AM

Quote:

Originally Posted by clarissaboschi (Post 195299)
Thanks Colindaven, yes, I was thinking that the issue is in my QTL file, which I got from a database.
I already found several issues in this file, such as start positions larger than end positions and lines with no positions, but I will do more checks! :)

Hi, the problem is caused by redundancy in the QTL file. There are many lines with the same start and end positions; if you use the abbreviation as the ID, they will all be identical, like the following lines:

Code:

1      7232667 7273886 DRIPL  +
1      7232667 7273886 DRIPL  +

You can use "sort -u" to remove the redundancy.
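
As a rough illustration of the "sort -u" step (the file names and demo lines here are made up, and the second sort by chromosome and start is an optional extra, not something required by the suggestion above):

```shell
# Made-up demo file: the GROW line appears twice, PLASCO shares its coordinates.
tmp=$(mktemp -d)
printf 'Chr1\t2095900\t3104645\tBW\n'   >  "$tmp/qtl.bed"
printf 'Chr1\t241009\t241109\tGROW\n'   >> "$tmp/qtl.bed"
printf 'Chr1\t241009\t241109\tGROW\n'   >> "$tmp/qtl.bed"
printf 'Chr1\t241009\t241109\tPLASCO\n' >> "$tmp/qtl.bed"

# sort -u drops lines that are identical in every column;
# the second sort orders the result by chromosome, then numeric start.
sort -u "$tmp/qtl.bed" | sort -k1,1 -k2,2n > "$tmp/qtl.dedup.bed"

cat "$tmp/qtl.dedup.bed"
```

Note that only fully identical lines are removed: the GROW and PLASCO records share coordinates but differ in the name column, so both are kept. Passing key options to the same `sort -u` call would compare on the keys only and could drop such records, which is why the two sorts are kept separate here.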

Another cause is NULL positions, as in the following line:
Code:

Chr.10  Animal QTLdb    Meat_and_Carcass_Association                    .      .      .      QTL_ID=65998;Name=Juiciness score;Abbrev=JUICE;PUBMED_ID=22297614;trait_ID=65;trait=Juiciness score;breed=yorkshire;FlankMarkers;PTO_name=meat juiciness;Map_Type=Linkage;Model=Mendelian;Test_Base=-;peak_cM=73.1;Significance=Significant;P-value=0.0433;gene_ID=100517235;gene_IDsrc=NCBIgene
Just remove these lines.
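
One possible way to do that removal, sketched with awk. This assumes the QTL records have already been converted to a tab-separated BED-like layout (chromosome, start, end, name); the file names and demo lines are made up. Lines with non-integer positions, or with a start greater than the end, are written to a separate file instead of being silently discarded, so they can be inspected afterwards.

```shell
# Made-up demo file: one record with "." positions, one with start > end.
tmp=$(mktemp -d)
printf 'Chr1\t241009\t241109\tGROW\n' >  "$tmp/qtl.bed"
printf 'Chr10\t.\t.\tJUICE\n'         >> "$tmp/qtl.bed"
printf 'Chr1\t2400142\t2329958\tAF\n' >> "$tmp/qtl.bed"
printf 'Chr1\t2095900\t3104645\tBW\n' >> "$tmp/qtl.bed"

# Keep lines with integer start/end and start <= end; send the rest to "bad".
awk -F'\t' -v bad="$tmp/qtl.rejected.bed" '
    $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $2 + 0 <= $3 + 0 { print; next }
    { print > bad }
' "$tmp/qtl.bed" > "$tmp/qtl.clean.bed"

cat "$tmp/qtl.clean.bed"
```

The cleaned file can then be passed to the intersectBed command in place of the original QTL file.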

