Old 06-20-2013, 07:42 PM   #3
sophiespo
Member
 
Location: australia

Join Date: Apr 2013
Posts: 15

Thanks for replying. I think I have narrowed down the problem a little - though I am still not sure what is causing it.

I looked at the file on the command line and in Excel, and compared it with my interval file. They all match and seem fine.

So I did:

head -1000 input.coverage > new.coverage

and then in R
read.coverage.gatk("new.coverage")

and it worked.

So then I did head -2000 and it gave a warning:
In matrix(as.integer(unlist(strsplit(chrpos[, 2], "-"))), ncol = 2, :
data length [3999] is not a sub-multiple or multiple of the number of rows [2000]

When I scroll up through the output I see this:

1998 chr1:13330308-13331001 chrchr1 13331001 13331242 242
1999 chr1:13331242-13331852 chrchr1 13331852 13351395 19544
2000 chr1:13351395-13351876 chrchr1 13351876 11943 -13339932

The value 11943 comes up in the probe_end column for any file longer than about 1900 lines, and of course that throws everything off. But I can't figure out WHY it always puts 11943 in that column.

(I am sorry, I don't know how to show this as code)

Please help if anyone has any ideas??

UPDATE:

THANK YOU bruce01 - you have helped me solve my problem.

I just found that one of my lines (the middle one below) is missing its end coordinate:

chr1:11918179-11918777 30252 50.50 30252 50.50 18 38 87 76.6
chr1:11918785 17 17.00 17 17.00 18 18 18 100.0
chr1:11918787-11918928 3060 21.55 3060 21.55 16 20 32 79.6

I will edit and see how it goes.
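
In case it helps anyone else: a line with no "-end" part gives one number instead of two when the interval is split on "-", which fits the odd data length [3999] in the warning. Below is a rough sketch for finding any such lines before loading the file into R. The function name find_bad_intervals and the demo header line are my own placeholders, and I am assuming the file is whitespace- or tab-delimited with a header on line 1 and the interval in the first column.

```shell
# Hypothetical helper (not part of GATK or any R package):
# print the line number and interval of every line whose first
# column does not look like chr:start-end.
# Assumes line 1 is a header and is skipped.
find_bad_intervals() {
    awk 'NR > 1 && $1 !~ /^[^:]+:[0-9]+-[0-9]+$/ { print NR ": " $1 }' "$1"
}

# Demo file: the three lines quoted above (first columns only),
# under a stand-in header line.
demo=$(mktemp)
printf '%s\n' \
    'Target total_coverage average_coverage' \
    'chr1:11918179-11918777 30252 50.50' \
    'chr1:11918785 17 17.00' \
    'chr1:11918787-11918928 3060 21.55' > "$demo"

find_bad_intervals "$demo"
rm -f "$demo"
```

On the demo file this prints "3: chr1:11918785". On the real file it would be find_bad_intervals input.coverage, and no output should mean every interval has both a start and an end.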

Last edited by sophiespo; 06-20-2013 at 08:04 PM.