  • How do I automate the graphing of these data?

    Hi, everybody,

    I have result files generated by blastn which then were sorted based on the second field. A typical file looks like:

    360 miR156a
    1 miR156a
    9 miR156a
    1 miR156a
    10 miR156a
    7 miR156a
    1 miR156a
    705 miR157a
    2 miR157a
    1 miR157a
    5 miR157a
    4 miR157a
    67 miR157a
    5 miR157a
    11 miR157a
    2 miR157a
    34 miR159
    3 miR162
    3 miR166a
    17 miR166a
    4 miR166a
    103 miR167a
    1 miR167a
    ... .....

    The first column is the deep-sequencing read count for each unique sequence. The second column is the miR ID that the sequence aligned to.
    I would like to:
    1)
    Sum the read counts for each miR ID (e.g. for miR156a, sum rows 1-7);
    Generate a bar graph showing the total read count for each miR ID.


    I have more than 20 files like this, so I would like to automate the process. R came to mind,
    but I have not used R before. Can you give me some tips or suggestions about which R packages or tools to use? (I can then learn those and figure it out.)


    2)
    If possible, generate a table that summarizes the total read counts from all 20 files.
    The table I would like to have looks like this:

    miRID sample1 sample2 sample3 ......... sample 20
    miR156 103 300 450 .......... 33
    miR157 205 300 ..........
    miR167 .....
    .... .......


    Thanks a lot!!

    Jian
    Last edited by yangjianhunt; 06-29-2012, 09:14 AM.

  • #2
    For 1), the bar plot part is easy in R; just use barplot()!

    Summing the counts can be done in a lot of different ways. Here is one that is maybe a bit cryptic but will teach you the table() command. Assume you have the table you pasted in a text file called mirna.txt. Try to run the following in R, with the mirna.txt file in the current working directory:

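    # V1 = read count, V2 = miR ID
    m <- read.table("mirna.txt")
    # Contingency table: rows = distinct count values, columns = miR IDs
    q <- table(m)
    # Weight each count value by how often it occurs, summed per miR ID
    totcounts <- as.numeric(rownames(q)) %*% q
    barplot(totcounts)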

    There are of course more transparent ways of summing the counts, but I'm too lazy to type them out :-)
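
One of the more transparent ways alluded to above, as a minimal sketch: tapply() groups the counts by miR ID and sums each group. The inline data frame stands in for mirna.txt (a few rows taken from the example in the question), and the column names count and mir are made up here for readability. The last lines check that the table() trick gives the same totals.

```r
# A few rows from the question, standing in for read.table("mirna.txt")
m <- data.frame(
  count = c(360, 1, 9, 705, 2, 34),
  mir   = c("miR156a", "miR156a", "miR156a", "miR157a", "miR157a", "miR159")
)

# Sum the read counts within each miR ID
totcounts <- tapply(m$count, m$mir, sum)
print(totcounts)

# Same totals via the table() trick: count values weighted by their frequency
q <- table(m)
via_table <- as.numeric(rownames(q)) %*% q
same_totals <- isTRUE(all.equal(as.numeric(via_table), as.numeric(totcounts)))

# A named vector in gives labelled bars out
barplot(totcounts, las = 2, ylab = "Total read count")
```

Because tapply() returns a named result, barplot() labels each bar with its miR ID without any extra work.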


    • #3
      Thanks a lot, kopi-o.

      This looks awesome. I will try it out.

      Jian


      • #4
        solved

        I eventually used:
        list.files() to get all the files
        lapply() to run the processing over every file
        read.table() to read each file into a data frame
        tapply(SeqCounts, miRNA, sum) to get the total count for each "class"
        write.table() with append=TRUE to write the data to a file
        paste() and cat() to write a name before each appended block
        barplot() to draw the plot
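
The steps above could be strung together roughly as follows. This is a sketch, not the script that was actually used: the file names (sample1.txt, sample2.txt), the output name summary.tsv, and the demo data are placeholders, and the two tiny input files are written to a temporary directory only so that the example runs on its own. Instead of appending each sample to the output with write.table(append = TRUE), it collects the per-file totals into one wide table, as asked for in part 2 of the question.

```r
# Demo input: two small files in a temporary directory (placeholder names and data)
indir <- file.path(tempdir(), "mir_demo")
dir.create(indir, showWarnings = FALSE)
writeLines(c("360 miR156a", "1 miR156a", "705 miR157a", "34 miR159"),
           file.path(indir, "sample1.txt"))
writeLines(c("10 miR156a", "5 miR157a", "2 miR157a", "3 miR162"),
           file.path(indir, "sample2.txt"))

# Read one file and return the total read count per miR ID
sum_counts <- function(path) {
  d <- read.table(path, col.names = c("SeqCounts", "miRNA"),
                  stringsAsFactors = FALSE)
  tapply(d$SeqCounts, d$miRNA, sum)
}

files <- list.files(indir, pattern = "\\.txt$", full.names = TRUE)
totals <- lapply(files, sum_counts)
names(totals) <- sub("\\.txt$", "", basename(files))

# One row per miR ID, one column per sample; 0 where a miR ID is absent
ids <- sort(unique(unlist(lapply(totals, names))))
summary_table <- matrix(0, nrow = length(ids), ncol = length(totals),
                        dimnames = list(ids, names(totals)))
for (s in names(totals)) {
  summary_table[names(totals[[s]]), s] <- totals[[s]]
}

# Tab-separated output; col.names = NA leaves a blank header above the miR IDs
out_file <- file.path(indir, "summary.tsv")
write.table(summary_table, out_file, sep = "\t", quote = FALSE, col.names = NA)

# One bar chart per sample
for (s in colnames(summary_table)) {
  barplot(summary_table[, s], main = s, las = 2, ylab = "Total read count")
}
```

For real data, point indir at the folder holding the 20 result files and drop the writeLines() calls; the rest should carry over as long as each file has the two-column layout shown in the question.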

        It took me a couple of days to learn the basics of R, but it was fun and I hope it will be useful in the future.

        Again, thanks to kopi-o for pointing the way. I haven't learned how to use the table() function yet, but I feel confident that I can learn it now.
