#1 |
Member
Location: Berlin Join Date: Jul 2013
Posts: 20
|
Hi everyone!
In the attached example, I would like to know how many times the elements "bta-miR-191" and "bta-miR-10", which are listed in the header of a SAM file, are repeated in the rest of the document (in this case, 6 and 5 times respectively). Could you give me an idea for a script based on this example? Thanks
#2 |
Member
Location: California Join Date: Dec 2010
Posts: 21
|
grep -v '^@' example.txt | awk '{if ($3=="bta-miR-191") print}' | wc -l
grep -v '^@' example.txt | awk '{if ($3=="bta-miR-10") print}' | wc -l
#3 |
Senior Member
Location: Cambridge, UK Join Date: May 2010
Posts: 311
|
Hello. To expand a bit on shoegame2001's answer: if your input file is big, you might want to read through it only once and count the two (or more) patterns at the same time:
Code:
grep -v '^@' example.txt \
| awk '{if ($3 == "bta-miR-191") mir191 += 1; else if ($3 == "bta-miR-10") mir10 += 1} END {printf "bta-miR-191: %d\nbta-miR-10: %d\n", mir191, mir10}'
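If the names to count are exactly the ones declared in the header, the same one-pass idea could be pushed a bit further by reading the names from the @SQ lines instead of typing them in. This is only a rough sketch: the tiny SAM-like file it builds, and the variable names `sam` and `counts`, are made up for illustration and are not the poster's attachment. It assumes tab-separated fields and an `SN:` tag on each @SQ line.

```shell
# Build a small, made-up SAM-like file so the sketch runs on its own.
sam=$(mktemp)
{
  printf '@HD\tVN:1.0\tSO:unsorted\n'
  printf '@SQ\tSN:bta-miR-191\tLN:23\n'
  printf '@SQ\tSN:bta-miR-10\tLN:22\n'
  printf '@SQ\tSN:bta-miR-21\tLN:22\n'
  printf 'read1\t0\tbta-miR-191\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'read2\t0\tbta-miR-10\t1\t255\t4M\t*\t0\t0\tACGT\tII@I\n'
  printf 'read3\t16\tbta-miR-191\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
} > "$sam"

# One pass: remember every SN: name from the @SQ lines (in header order),
# then count how often each of those names appears in column 3.
counts=$(awk -F'\t' '
  /^@SQ/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SN:/) { name = substr($i, 4); order[++n] = name; count[name] = 0 }
    next
  }
  /^@/ { next }                      # skip any other header line
  ($3 in count) { count[$3]++ }      # alignment line: tally its reference name
  END { for (i = 1; i <= n; i++) printf "%s\t%d\n", order[i], count[order[i]] }
' "$sam")

printf '%s\n' "$counts"
rm -f "$sam"
```

Names that are declared in the header but never hit are reported with a count of 0, which the grep/awk pipelines above would simply leave out.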
#4 |
Member
Location: Berlin Join Date: Jul 2013
Posts: 20
|
Thanks, the grep command was very useful for removing the headers.
After doing that, I counted the repeated elements in column 3:
cat myfile-noheaders.out | awk '{print $3}' | sort | uniq -c | sort -rnk1 > SAM-counts.out
I hope this can be helpful for somebody else!
#5 |
Member
Location: Berlin Join Date: Jul 2013
Posts: 20
|
If it is useful, here is the complete script to remove the header lines (those starting with '@') and then count the repeated terms in column 3 of the SAM file:
grep -v '^@' myfile.out | awk '{print $3}' | sort | uniq -c | sort -rnk1 > SAM-counts.out
Cheers
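For anyone adapting this, here is a small self-contained sketch of the same count done in a single awk pass, with two details that may matter on real data: header lines are matched only when '@' is at the start of the line (quality strings can contain '@' too), and unmapped reads, whose column 3 is "*", are left out. The sample file and the names `sam`, `table`, `kept_anchored` and `kept_unanchored` are made up for illustration.

```shell
# Made-up SAM-like input; two of the reads carry '@' in their quality string.
sam=$(mktemp)
{
  printf '@SQ\tSN:bta-miR-191\tLN:23\n'
  printf '@SQ\tSN:bta-miR-10\tLN:22\n'
  printf 'r1\t0\tbta-miR-191\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'r2\t0\tbta-miR-10\t1\t255\t4M\t*\t0\t0\tACGT\tII@I\n'
  printf 'r3\t0\tbta-miR-191\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
  printf 'r5\t0\tbta-miR-191\t1\t255\t4M\t*\t0\t0\tACGT\t@III\n'
  printf 'r6\t0\tbta-miR-10\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
} > "$sam"

# Count column 3 in one pass, skipping headers and unmapped reads,
# then sort by count (descending) and by name to break ties.
table=$(awk -F'\t' '
  /^@/ { next }
  $3 != "*" { count[$3]++ }
  END { for (name in count) printf "%d\t%s\n", count[name], name }
' "$sam" | sort -k1,1nr -k2,2)
printf '%s\n' "$table"

# How many lines survive each header filter on this sample.
kept_anchored=$(grep -vc '^@' "$sam")    # only real header lines removed
kept_unanchored=$(grep -vc '@' "$sam")   # also drops reads with '@' in the qualities
printf 'anchored: %s, unanchored: %s\n' "$kept_anchored" "$kept_unanchored"
rm -f "$sam"
```

On this sample the anchored filter keeps all six alignment lines, while the unanchored one keeps only four, so the counts would come out too low.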