  • Visualize unionBedGraphs file in UCSC

    Dear all,

    I have generated bedgraphs that I can visualize in UCSC. For example:
    Code:
    head BIO_31-T1-ARN.bedGraph
    track type=bedGraph name=BIO_31-T1-ARN
    chr5	170000000	170000600	0.00
    chr5	170000600	170000700	2.00
    chr5	170000700	170000850	0.00
    chr5	170000850	170000950	2.00
    chr5	170000950	170001000	0.00
    chr5	170001000	170001100	2.00
    chr5	170001100	170001250	0.00
    chr5	170001250	170001400	2.00
    chr5	170001400	170002200	0.00
    I want to compare the BAM files of samples from two conditions by visualizing them in UCSC. One condition has 5 samples and the other has 16. I was told that it is possible to visualize a single "summarizing sample" per condition.
    I ran unionBedGraphs from bedtools on the 5 samples of condition 1:
    Code:
    unionBedGraphs -i BIO-12-T1-ARN/tophat_out/BIO-12-T1-ARN.bedGraph BIO_26-T1-ARN/tophat_out/BIO_26-T1-ARN.bedGraph BIO_28-T1-ARN/tophat_out/BIO_28-T1-ARN.bedGraph BIO_052_T1_ARN/tophat_out/BIO_052_T1_ARN.bedGraph BIO_057_T1_ARN/tophat_out/BIO_057_T1_ARN.bedGraph -header > Cond1.bedGraph
    I got a merged file of 1.4 GB:
    Code:
    head Cond1.bedGraph
    chrom	start	end
    chr11	60000000	60008500	0	0	0	0	0.00
    chr11	60008500	60008600	0	0	0	0	1.00
    chr11	60008600	60009850	0	0	0	0	0.00
    chr11	60009850	60010050	0	0	0	0	2.00
    chr11	60010050	60010400	0	0	0	0	0.00
    chr11	60010400	60010450	0	0	0	0	1.00
    chr11	60010450	60016500	0	0	0	0	0.00
    chr11	60016500	60016600	0	0	0	0	1.00
    chr11	60016600	60047950	0	0	0	0	0.00
    When I try to upload this file to UCSC, I get the following error:
    Error File 'Cond1.bedGraph' - Unrecognized format line 1 of file: chrom start end
    I replaced the header line with:
    track type=bedGraph name=Cond1
    but I got the following error:
    needMem: trying to allocate 1414098818 bytes (limit: 500000000)
    Then I kept only the first 500,000 lines of Cond1.bedGraph, with "track type=bedGraph name=Cond1" as the header, but again I got an error:
    Error File 'Cond1_500000.bedGraph' - Error line 2 of custom track: Expecting + or - in strand
    Finally, I tried another header:
    Code:
    head Cond1_500000_v3.bedGraph
    chrom	start	end	S1	S2	S3	S4	S5
    chr11	60000000	60008500	0	0	0	0	0.00
    chr11	60008500	60008600	0	0	0	0	1.00
    chr11	60008600	60009850	0	0	0	0	0.00
    chr11	60009850	60010050	0	0	0	0	2.00
    chr11	60010050	60010400	0	0	0	0	0.00
    chr11	60010400	60010450	0	0	0	0	1.00
    chr11	60010450	60016500	0	0	0	0	0.00
    chr11	60016500	60016600	0	0	0	0	1.00
    chr11	60016600	60047950	0	0	0	0	0.00
    which ended with:
    Error File 'Cond1_500000_v3.bedGraph' - Unrecognized format line 1 of file: chrom start end S1 S2 S3 S4 S5 (note: chrom names are case sensitive, e.g.: correct: 'chr1', incorrect: 'Chr1', incorrect: '1')
    Is it possible to visualize this merged file in UCSC at all? If so, should I modify something in the header?

    Thank you for your help,
    Jane
    Last edited by Jane M; 11-25-2016, 01:45 AM.
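
One possible way to get a single "summarizing" track per condition: the bedGraph format takes exactly four columns (chrom, start, end, value), whereas the unionBedGraphs output above has one value column per sample. The strand error is presumably UCSC reading the extra columns as BED fields. Below is a minimal sketch, assuming the mean across samples is an acceptable summary. The function name union_to_mean_bedgraph and the output file name are hypothetical, and the sketch does nothing about the upload size limit.

```shell
# Hypothetical helper (not part of bedtools): collapse unionBedGraphs output
# into a 4-column bedGraph whose value is the mean of all sample columns.
# Usage: union_to_mean_bedgraph TRACK_NAME < union.bedGraph > mean.bedGraph
union_to_mean_bedgraph() {
    awk -v name="$1" '
        # Emit the track line UCSC expects at the top of a custom track.
        BEGIN { print "track type=bedGraph name=" name }
        # Skip the unionBedGraphs header and any existing track line.
        $1 == "chrom" || $1 == "track" { next }
        # Skip lines that carry no sample values.
        NF < 4 { next }
        {
            sum = 0
            for (i = 4; i <= NF; i++) sum += $i   # columns 4..NF are samples
            printf "%s\t%s\t%s\t%.2f\n", $1, $2, $3, sum / (NF - 3)
        }
    '
}
```

For the file above, this would be called as `union_to_mean_bedgraph Cond1 < Cond1.bedGraph > Cond1_mean.bedGraph`. If the result is still larger than the upload limit, converting it to bigWig (for example with UCSC's bedGraphToBigWig) is the usual alternative for large tracks.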
