
**annavilborg** · 06-17-2013, 09:12 AM

Hi again,

I have come up with a possible solution, but I would very much appreciate any input. What if I ran bedtools coverage to get the antisense coverage for each of my genes, sorted the resulting file on the coverage value, and then extracted the first x lines? Could that work?

I am unsure how to do the sorting and extracting. "sort" and "grep"? Any suggestions on the exact "phrasing"?

Thank you!

**jparsons** · 06-17-2013, 12:37 PM

Once you have a coverage file, you can sort it on the coverage value with "sort". The -n flag sorts numbers numerically (e.g. 10<20<100 instead of 10<100<11<20), and the -k flag lets you choose which key/column to sort on: -k 4 would choose the 4th (strictly, -k 4 starts the key at field 4 and runs to the end of the line; -k 4,4 restricts it to that one column). The -r flag reverses the output so you get the largest numbers first instead of last.

I would suggest "head -n X <filename>" to extract the first X lines, although I imagine grep could also do it relatively easily.
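As a minimal sketch of the sort-then-head approach described above: the file name coverage.txt, the sample rows, and the choice of column 4 as the coverage value are all assumptions for illustration, so check which column your own bedtools coverage output actually uses.

```shell
# Hypothetical coverage table: chrom, start, end, read count.
# Column 4 as the coverage value is an assumption; adjust to your file.
printf 'chr1\t100\t200\t10\nchr1\t300\t400\t100\nchr2\t50\t150\t20\nchr2\t500\t600\t11\n' > coverage.txt

# Sort on column 4 only (-k4,4), numerically (n), largest first (r),
# then keep the first 2 lines.
sort -k4,4nr coverage.txt | head -n 2 > top.txt

cat top.txt
```

Replace the 2 in head -n 2 with however many top-ranked genes you want to keep.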

**annavilborg** · 06-21-2013, 09:27 AM

This worked very well! Thank you!

**hanshart** · 06-24-2013, 08:09 PM

Originally posted by annavilborg:

Hi,
... provided that there are more than a certain number of reads mapping to that gene...
Thank you.

Hi annavilborg,
you can additionally pipe the coverage output through awk before sorting, to keep only those lines (genes) with a read count above your threshold. Figure out which column (field) stores the read counts and then, e.g. for field 4 and a threshold of 100, apply something like:

Code:

bedtools coverage ... | awk '{if ($4>100) print $0}' | sort -nr -k 4

$4 is the 4th field
100 the read count threshold
print $0 prints the content of the whole line
sort -nr -k 4 then sorts the remaining gene lines (those with a read count above the threshold) numerically, in reverse order, on the corresponding column (4).
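The same filter-then-sort pipeline can be tried on a small made-up file before running it on real data. This is only a sketch: coverage_demo.txt, the gene names, the counts, and the variable name min are hypothetical, and column 4 is again assumed to hold the read count.

```shell
# Hypothetical per-gene coverage table; column 4 is assumed to be the read count.
printf 'geneA\t0\t1000\t250\ngeneB\t0\t800\t40\ngeneC\t0\t1200\t101\ngeneD\t0\t500\t100\n' > coverage_demo.txt

# Keep lines whose 4th field is strictly greater than the threshold,
# then sort the survivors on column 4, numerically, largest first.
awk -v min=100 '$4 > min' coverage_demo.txt | sort -k4,4nr > filtered.txt

cat filtered.txt
```

Passing the threshold with -v keeps it out of the awk program itself, and a bare condition such as '$4 > min' prints every matching line, which is equivalent to the if/print form. Note that a gene with exactly 100 reads is dropped here; use >= if it should be kept.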

**annavilborg** · 06-26-2013, 05:23 PM

Neat! Thank you!

Extracting features above a certain coverage
