SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
No coverage or very low coverage in the Complete Genomics data raman91 Bioinformatics 5 01-10-2018 11:42 PM
coverage per position - less positions than in reference boetsie Bioinformatics 0 05-26-2011 03:43 AM
low 454 coverage combined with high solexa coverage strob Bioinformatics 7 10-07-2010 11:14 AM
Quality trimmming / Mask low quality bases? bbimber Bioinformatics 9 03-25-2010 02:40 PM

Reply
 
Thread Tools
Old 12-10-2018, 06:03 AM   #1
chelie
Junior Member
 
Location: Germany

Join Date: Sep 2015
Posts: 3
Default how to mask low coverage positions

Hello,

I am looking for a way to mask positions below a certain coverage, i.e., convert all positions below x-fold coverage to N.

The problem is not calculating the coverage for each position (I can do that with BEDtools genomecov or BBmap pileup.sh), but the actual conversion of the input file. I was looking into BEDtools maskfasta for example, but couldn't find a way to set a specific coverage-threshold.

Thanks for any suggestion!
chelie is offline   Reply With Quote
Old 01-02-2019, 01:55 AM   #2
ashishbansal
Junior Member
 
Location: Pune

Join Date: Dec 2018
Posts: 3
Default

Try this code to mask coverage positions

layout(local_size_x = 1, local_size_y = 1) in;
writeonly uniform uimage2D img_output;

struct LineSegment
{
vec2 mP1;
vec2 mP2;
};

layout(std140) uniform GlyphData
{
int mTextureOffsetX; // Target X location in atlas
int mTextureOffsetY; // Target Y location in atlas
int mWidth; // Width of glyph
int mHeight; // Height of glyph
vec2 mBorderUnits; // Amount of border to preserve
int mScanlineStartIndices[100]; // Start into the sorted line segment array

}glyphData;

layout(std140) uniform OutlineData
{
vec2 mSize; // Size of glyph in units
vec2 mMinimum; // Minimum of glyph in units
int mNumSortedOutlines; // Amount of outlines
vec2 mSortedOutlinesP1[1000]; // Top point of outlines
vec2 mSortedOutlinesP2[1000]; // Bottom point of outlines

}outlineData;

void main()
{
uint result = 0;

ivec2 pixelCoords = ivec2(gl_GlobalInvocationID.xy);
int startIndex = glyphData.mScanlineStartIndices[pixelCoords.y];

if (startIndex != -1)
{
float yUnits = float(pixelCoords.y) / float(glyphData.mHeight);
float nextYUnits = float(pixelCoords.y + 1) / float(glyphData.mHeight);

vec2 sizeWithBorder = outlineData.mSize + glyphData.mBorderUnits * 2.0f;
vec2 minimum = outlineData.mMinimum - glyphData.mBorderUnits;

for (int index = startIndex; index != outlineData.mNumSortedOutlines; ++index)
{
LineSegment line;
line.mP1 = (outlineData.mSortedOutlinesP1[index] - minimum) / sizeWithBorder;
line.mP2 = (outlineData.mSortedOutlinesP2[index] - minimum) / sizeWithBorder;

if (line.mP1.y >= nextYUnits)
break;

float dy = line.mP2.y - line.mP1.y;
if (dy == 0.0)
continue;

float dx = line.mP2.x - line.mP1.x;
float dx_dy = dx / dy;

for (int subY = 0; subY != 4; ++subY)
{
for (int subX = 0; subX != 4; ++subX)
{
float subPixelYOffset = float(3 - subX) * (1.0 / 16.0) + (1.0 / 32.0) + float(subY) * 0.25f;
float subPixelXOffset = float(subY) * (1.0 / 16.0) + (1.0 / 32.0) + float(subX) * 0.25f;

float sampleY = yUnits + subPixelYOffset / float(glyphData.mHeight);
if (sampleY >= line.mP1.y && sampleY < line.mP2.y)
{
float lineXUnits = line.mP1.x + (sampleY - line.mP1.y) * dx_dy;
float curX = float(pixelCoords.x + subPixelXOffset) / float(glyphData.mWidth);
if (curX > lineXUnits)
{
uint mask = 1 << ((3 - subX) + subY * 4);
result ^= mask;
}
}
}
}
}
}

ivec2 textureOffset = ivec2(pixelCoords.x + glyphData.mTextureOffsetX, pixelCoords.y + glyphData.mTextureOffsetY);
imageStore(img_output, textureOffset, uvec4(result, 0, 0, 0));
}
ashishbansal is offline   Reply With Quote
Old 01-02-2019, 05:59 AM   #3
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 502
Default

You're on the right track with BEDtools. Use the 'genomecov' command with the '-bg' flag (BEDGRAPH format) to calculate coverage, filter using 'awk' (awk '$4 < THRESHOLD_VALUE') to retain the low-coverage intervals, then back to BEDtools 'maskfasta' to convert those intervals to Ns.

Last edited by HESmith; 01-02-2019 at 06:41 AM.
HESmith is offline   Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 02:54 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO