  • Microarray data (bedGraph) to metaplot.

    Hi all,

    I am running into trouble with the microarray data I am trying to analyze. Let me try to explain this clearly. I have two sets of genes for which I am trying to find differences in histone modifications. For most of the histone modifications, I was able to get pretty elaborate .bam files, and making the metaplots was easy-peasy (attach3).

    For some of the older published data, such as microarray data, the process is less straightforward. I converted the microarray data to .bed files and then to .bam files to analyze them with the R metagene package. For most .bam files this package works great. However, because there is no read count available but only intensity, the metagene package does not give nice outputs.

    Here is an example of the .bedGraph file used to make the .bam (for the .bam I added a mock ID):
    chromosome; start; end; normalized intensity
    Chr1 25 50 0.005
    Chr1 60 85 0.001
    Chr1 113 138 0.001
    Chr1 154 179 0.359
    Chr1 185 210 0.001
    Chr1 219 244 0.004
    Chr1 254 279 4.599
    Chr1 287 312 3.908

    And this is how the .bam files look:
    id-1 0 Chr1 26 255 25M * 0 0 * *
    id-2 0 Chr1 61 255 25M * 0 0 * *
    id-3 0 Chr1 114 255 25M * 0 0 * *
    id-4 0 Chr1 155 255 25M * 0 0 * *


    I added an attachment to view the output from the metagene package. When I view these .bedGraph files in IGV, for example, the curves for the histone marks look really nice compared to the .bam files generated from the .bedGraphs. This is most likely also the reason why the metagene package is not able to plot my data well.
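A minimal Python sketch of why the converted .bam looks flat compared to the .bedGraph, assuming the conversion writes one alignment per probe as in the records above. The function names are hypothetical and only illustrate the point: once each probe is a single read, every probe contributes a depth of 1 and the intensity column is gone.

```python
# First rows of the bedGraph from the post: chrom, start, end, normalized intensity
BEDGRAPH = """\
Chr1 25 50 0.005
Chr1 60 85 0.001
Chr1 113 138 0.001
Chr1 154 179 0.359
"""

def parse_bedgraph(text):
    """Parse whitespace-separated bedGraph rows into (chrom, start, end, value)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        chrom, start, end, value = line.split()
        rows.append((chrom, int(start), int(end), float(value)))
    return rows

def probe_signal(rows, as_reads=False):
    """Signal contributed by each probe.

    as_reads=True mimics the .bam conversion: one read per probe,
    so each probe counts as depth 1 regardless of its intensity.
    """
    return [1.0 if as_reads else value for (_, _, _, value) in rows]

rows = parse_bedgraph(BEDGRAPH)
print(probe_signal(rows))                 # intensities kept (bedGraph view)
print(probe_signal(rows, as_reads=True))  # intensities lost (.bam view)
```

Any tool that counts reads will only see the second list, which is why a tool that reads the score column directly is the more natural fit for this kind of data.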

    I tried plotting the .bedGraphs with another package called metagene-maker, but this program gives me IndexError: list index out of range. I think this error occurs because most of the reads do not fall within the gene regions in my designated .bed files. It would take quite some effort to fix this manually, and this is probably not the way to go. I was thinking about giving the .bam file a mock read count and using the intensity as the mapping quality, but most likely this is not a great idea from a bioinformatics point of view and could cause problems upon publication.

    I am just wondering what the way forward would be from a bioinformatician's point of view, as other complete .bam files give beautiful output.

    So, to sum up: I have .bed, .bam and .bedGraph files from microarray data (location and intensity of predesigned probes mapped to the genome). I want to know the best way to make metaplots of this data against self-defined regions (in .bed format).

    Help would be greatly appreciated!
    R

  • #2
    Tricky. Perhaps it's not possible.

    However, I believe you can do something similar with deepTools (see the installation on the usegalaxy.eu server, for example) if you convert from bedGraph to bigWig.

    That's a bit of a mission in itself, but java-genomics-toolkit can help you do that.

    Best of luck



    • #3
      Thank you for your reply. In the end I decided to use the .bed files to make score matrices with the genomation package in R (library("genomation")). These could be plotted with the plotMeta function, and statistics could be done with ks.test on the column means (colMeans) of the score matrices.

      deepTools works as well, but this would have been more effort. Good luck to anyone in the future struggling with this.
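For anyone who wants to see the idea behind the score-matrix approach outside of R, here is a rough Python sketch. It is not genomation's implementation, and the region coordinates and function names are hypothetical. Each region is split into a fixed number of bins, each bin gets the overlap-weighted mean probe intensity, the column means give the meta-profile, and a two-sample Kolmogorov-Smirnov statistic D compares two profiles (only D is computed here, not a p-value).

```python
def score_matrix(regions, probes, n_bins):
    """One row per region, one column per bin.

    Each cell is the overlap-weighted mean intensity of the probes that
    overlap the bin, or None when no probe overlaps it.
    """
    matrix = []
    for chrom, r_start, r_end in regions:
        width = (r_end - r_start) / n_bins
        row = []
        for b in range(n_bins):
            b_start = r_start + b * width
            b_end = b_start + width
            total = covered = 0.0
            for p_chrom, p_start, p_end, value in probes:
                if p_chrom != chrom:
                    continue
                overlap = min(b_end, p_end) - max(b_start, p_start)
                if overlap > 0:
                    total += value * overlap
                    covered += overlap
            row.append(total / covered if covered else None)
        matrix.append(row)
    return matrix

def column_means(matrix):
    """Mean of each column (the meta-profile), ignoring empty bins."""
    means = []
    for col in zip(*matrix):
        vals = [v for v in col if v is not None]
        means.append(sum(vals) / len(vals) if vals else None)
    return means

def ks_statistic(a, b):
    """Two-sample KS statistic D: largest gap between the empirical CDFs."""
    d = 0.0
    for x in list(a) + list(b):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Probe intensities from the bedGraph example in the first post.
probes = [
    ("Chr1", 25, 50, 0.005), ("Chr1", 60, 85, 0.001),
    ("Chr1", 113, 138, 0.001), ("Chr1", 154, 179, 0.359),
    ("Chr1", 185, 210, 0.001), ("Chr1", 219, 244, 0.004),
    ("Chr1", 254, 279, 4.599), ("Chr1", 287, 312, 3.908),
]
# Hypothetical gene sets; real ones would come from the .bed files.
set_a = [("Chr1", 0, 150)]
set_b = [("Chr1", 150, 320)]

profile_a = column_means(score_matrix(set_a, probes, n_bins=3))
profile_b = column_means(score_matrix(set_b, probes, n_bins=3))
print(profile_a)
print(profile_b)
# Drop bins without coverage before comparing the two profiles.
print(ks_statistic([v for v in profile_a if v is not None],
                   [v for v in profile_b if v is not None]))
```

Returning None for bins without probes, instead of 0, keeps uncovered bins from dragging the profile down, which matters for sparse array designs.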
