**annavilborg** · 06-17-2013, 09:12 AM

Hi again,

I have come up with a possible solution, but I would very much appreciate any input. What if I ran bedtools coverage to get the antisense coverage for each of my genes, sorted the resulting file on the coverage value, and then extracted the first x lines? Could that work?

I am unsure how to do the sorting and extracting. "sort" and "grep"? Any suggestions on the exact "phrasing"?

Thank you!

**jparsons** · 06-17-2013, 12:37 PM

Once you have a coverage file, you can sort on the result with "sort". The -n flag sorts numbers numerically (e.g. 10<20<100 instead of 10<100<11<20), and the -kX flag lets you choose which key/column to sort on; -k 4 would choose the 4th. -r reverses the output so that the largest numbers come first instead of last.

I would suggest "head -X <filename>" to extract the first X lines, although I imagine grep could also do it relatively easily.
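
Put together, the sort and head steps might look like the sketch below. It assumes a whitespace- or tab-separated coverage file; the helper name `top_by_column`, the file name `coverage.txt`, and the column and line counts in the usage comment are placeholders for illustration, not anything from bedtools itself.

```shell
# Hypothetical helper: print the N lines with the largest value in column COL.
# Usage: top_by_column FILE COL N
top_by_column() {
    # -k "COL,COL" restricts the sort key to that single column;
    # the trailing "nr" makes the key numeric and reversed (largest first).
    sort -k "$2,$2nr" "$1" | head -n "$3"
}

# Example (assumed file name and column): the 50 most covered genes,
# with the coverage value in column 4.
# top_by_column coverage.txt 4 50
```

Writing the key as `-k 4,4` rather than `-k 4` stops the key at the end of column 4 instead of running to the end of the line, and `head -n 50` is the portable spelling of `head -50`.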

**annavilborg** · 06-21-2013, 09:27 AM

This worked very well! Thank you!

**hanshart** · 06-24-2013, 08:09 PM

Originally posted by annavilborg

Hi,
... provided that there are more than a certain number of reads mapping to that gene...
Thank you.

Hi annavilborg,
you can additionally pipe the bedtools output through awk to keep only those lines (genes) with a read count above your threshold, and then sort what is left. Figure out which column (field) stores the read counts and then, e.g. for field 4 and a threshold of 100, apply something like:

Code:

bedtools coverage ... | awk '{if ($4>100) print $0}' | sort -nr -k 4

$4 is the 4th field
100 is the read count threshold
print $0 prints the content of the whole line
sort -nr -k 4 then sorts all the gene lines with a read count above the threshold numerically, in reverse order (largest first), based on the corresponding column (4).
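
The same filter-then-sort idea can be wrapped so that the column and threshold are parameters. This is only a sketch: the helper name `filter_and_sort` is made up for illustration, input is assumed to be whitespace- or tab-separated, and the bedtools arguments are left elided as in the command above.

```shell
# Hypothetical helper: read lines on stdin, keep those whose column COL is
# strictly greater than MIN, and print them largest first.
# Usage: ... | filter_and_sort COL MIN
filter_and_sort() {
    # "+ 0" forces a numeric comparison in awk; a pattern with no action
    # prints the whole matching line, same as print $0.
    awk -v col="$1" -v min="$2" '($col + 0) > (min + 0)' |
        sort -k "$1,$1nr"
}

# Example (column 4 and threshold 100 as in the post above):
# bedtools coverage ... | filter_and_sort 4 100
```

Filtering before sorting means sort only has to handle the lines that pass the threshold. Note that the comparison is strict, so a gene with exactly 100 reads is dropped.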

**annavilborg** · 06-26-2013, 05:23 PM

Neat! Thank you!

Extracting features above a certain coverage
