12-19-2016, 05:30 AM   #1
visse226
Junior Member
 
Location: The Netherlands

Join Date: Nov 2016
Posts: 9
Add 'missing' lines of data by using python code

I am a beginner when it comes to programming and Python, but I think I have a very simple question.

I have large tab-delimited files that for example contain lines like this:

10000 7
20000 1
30000 2
60000 3

What I want to have, is a file that also contains the 'missing' lines, such as this:

10000 7
20000 1
30000 2
40000 0
50000 0
60000 3

The files are rather large, as I am working with whole-genome sequence data. The first column is a position in the genome and the second column is the number of SNPs I find within that 10 kb window. I don't think this background is really relevant, though: I just want to write a simple Python script that adds these lines to the file using if/else statements.

So if the position does not equal the position of the previous line + 10000, the 'missing' line is written with a count of 0; otherwise the line is written as it occurs.

I just foresee one problem in this, namely when several lines in a row are missing (as in my example). Does anyone have a smart solution for this simple problem?
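To make the several-lines-in-a-row case concrete, here is a rough sketch of one possible approach: replace the single if/else with a while loop that keeps writing zero-count lines until it catches up with the current position. The function name `fill_missing_windows` and the `step` parameter are made up for illustration, and the sketch assumes the positions are sorted in ascending order, are multiples of the window size, and belong to a single sequence.

```python
def fill_missing_windows(lines, step=10000):
    """Yield (position, count) pairs, adding a zero count for each skipped window.

    Hypothetical helper: assumes ascending positions spaced in multiples of `step`.
    """
    expected = None  # position the next line should have; unknown before the first line
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()  # splits on tabs or spaces
        position, count = int(fields[0]), int(fields[1])
        # A while loop instead of a single if covers several missing windows in a row.
        while expected is not None and expected < position:
            yield expected, 0
            expected += step
        yield position, count
        expected = position + step


if __name__ == "__main__":
    sample = ["10000\t7", "20000\t1", "30000\t2", "60000\t3"]
    for position, count in fill_missing_windows(sample):
        print(f"{position}\t{count}")
```

Because the function is a generator that reads one line at a time, an open file object can be passed in place of the list, so a large file would not need to be loaded into memory all at once.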

Many thanks!