  • from excel file to genomic features

    Hi, everyone,

    I'm totally new to RNA-seq. I collaborated with another lab to sequence one ChIP-seq library. Now I have some half-analyzed tag data in an Excel file, about 1000 rows in total, with column titles like this:
    Chr CenterMeter SummedHeight ChrStart ChrEnd

    I'm wondering how to start from these data to generate the genomic features of these tags.

    1. How do I convert xls files to BED or other formats?
    2. Starting from the file of tag chromosomal locations, how can I get the distribution of all tags across genomic features such as transcription units, CpG islands, repeats, etc.?

    I can use a little bit of R, but I'm not an expert.

    Thanks a lot for your help!

  • #2
    1. The BED format is simple:

    You can write that with write.table in R. You can read an Excel sheet with e.g. read.xls from the gdata package.

    2. Could you re-write this question more clearly? If you mean that you want to know how to compare the genomic locations with known genomic features, you can use e.g. intersectBed from BEDtools, if you have the features in a GFF or BED file.
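For readers who would rather not use R, here is a minimal Python sketch of the same conversion. It assumes the Excel sheet has first been saved as CSV and that it uses the column names given in the first post; the function name `table_to_bed` and the example rows are made up for illustration. BED uses 0-based starts and end-exclusive ends, and the thread does not say which convention ChrStart and ChrEnd follow, so the sketch copies the values unchanged.

```python
import csv
import io

def table_to_bed(csv_text):
    """Turn CSV text with Chr/ChrStart/ChrEnd columns into BED lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bed_lines = []
    for row in reader:
        # Assumption: coordinates are copied as-is. Adjust the start here
        # if your table turns out to use 1-based starts.
        # int(float(...)) tolerates spreadsheet exports such as "1000.0".
        start = int(float(row["ChrStart"]))
        end = int(float(row["ChrEnd"]))
        # A minimal BED line is chrom, start, end separated by tabs.
        bed_lines.append(f"{row['Chr']}\t{start}\t{end}")
    return bed_lines

# Made-up example rows (not data from this thread).
example = """Chr,CenterMeter,SummedHeight,ChrStart,ChrEnd
chr1,1500,12,1000,2000
chr2,350,7,300,400
"""

bed = table_to_bed(example)
print("\n".join(bed))
```

To write a file instead of printing, the joined lines can be written to a path of your choice with a trailing newline.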

    • #3
      Hi Arvid, thanks a lot for your help.

      I only have the tags' genomic locations in Excel files, like chr 3, 1398765. I want to get the genomic features of these locations. For example, do these tags preferentially cluster at CpG islands, transcription units, or other regions?

      Thanks a lot!

      • #4
        You should be able to use the "table browser" from UCSC (http://genome.ucsc.edu) to get the genomic features info.

        See the help page here: http://genome.ucsc.edu/goldenPath/he...ablesHelp.html

        Tutorials available here: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=28

        • #5
          Hi, I guess I didn't give enough information. I have a file like this:

          chrom chromStart chromEnd
          chr13 38021582 38023245
          chr11 22920116 22921002
          chr15 91737068 91740707
          chr18 48206683 48207725
          chr9 78326184 78327354
          chr10 57693494 57697038
          chrX 71688149 71689220
          chrX 130684504 130685256
          chr4 149618312 149620001
          chr11 3031766 3033033
          chr1 193808626 193809612
          chr13 94710571 94711334
          chr9 64025568 64026410
          chr17 45705704 45706771
          ......
          ......

          My question is: how can I get the genomic features of these locations?

          I'm pretty new to bioinformatics. A little more detail would be much appreciated.

          Thanks a lot!

          PS: Thanks GenoMax, I tried your link, but I don't know where to input the CSV file in the genome browser. It looks like they need a table of identifiers (names/accessions).

          • #6
            Originally posted by ohsu
            Hi, I guess I didn't give enough information. I have a file like this:

            chrom chromStart chromEnd
            chr13 38021582 38023245
            chr11 22920116 22921002
            ......

            My question is: how can I get the genomic features of these locations?

            Re-format that into a BED file (a text file containing the same data, with the columns separated by tabs). Then get a GFF file with the genomic features you would like to compare against, making sure that the chromosome naming is the same in both files. Then get the overlap of the two files with intersectBed from BEDTools, using the following command:

            Code:
            intersectBed -a your_chr_loc.bed -b genomic_features.gff > overlapping_features.gff
            You can then load the GFF into a spreadsheet or into a genome browsing tool like IGV...
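To make the idea of "overlap" concrete, here is a small Python sketch of the interval comparison. It is not how BEDTools is implemented, only a naive illustration. The two regions come from the listing quoted above; the feature intervals, their names, and the function names are hypothetical. It treats intervals as end-exclusive, which is an assumption about the input, and it shows why the chromosome naming has to match in both files.

```python
def parse_intervals(text):
    """Parse whitespace-separated 'chrom start end' lines."""
    intervals = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and "......" placeholders.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals

def find_overlaps(regions, features):
    """Return (region, feature_name) pairs that share at least one base."""
    hits = []
    for chrom, start, end in regions:
        for f_chrom, f_start, f_end, name in features:
            # Names are compared literally, so "chr13" does not match "13".
            if chrom == f_chrom and start < f_end and f_start < end:
                hits.append(((chrom, start, end), name))
    return hits

listing = """chrom chromStart chromEnd
chr13 38021582 38023245
chr11 22920116 22921002
......
"""

# Hypothetical features, invented purely for this example.
features = [
    ("chr13", 38020000, 38022000, "feature_A"),
    ("13", 38020000, 38022000, "feature_B_unprefixed"),
    ("chr11", 100, 200, "feature_C"),
]

regions = parse_intervals(listing)
# Tab-separated lines, i.e. the minimal BED layout described above.
bed_lines = ["\t".join(str(x) for x in r) for r in regions]
hits = find_overlaps(regions, features)
print(hits)
```

The nested loop is fine for about 1000 regions against a small feature list; for whole-genome annotation files a dedicated tool such as intersectBed is the better choice.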
