  • Bowtie question about yeast SK1

    Hello all,
    As many of the posts on the site indicate, I am new to the world of bioinformatics. I want to do alignment of my ChIP-seq data to the SK1 genome in yeast.
    First, which genome file contains the SK1 sequence? I have searched this using Google, to no avail. Is this sequence contained in the folder that you can download from the bowtie website?
    Second, once I have this file, how do I load it as my reference genome?
    I'm certain that more questions are to come, and appreciate any assistance I can receive!

  • #2
    In reply to Feeling ChIPper:
    A version of the revised SK1 genome is available here: http://cbio.mskcc.org/public/SocciN/SK1_MvO/V1/ You will want to check the "Readme" file. The file you need is the "fasta" sequence file. There is an annotation file (GFF) that you will want to get for future use.

    Another version seems to be available from here: http://steinmetzlab.embl.de/SK1/

    Since this is not a commonly used strain, you will need to create the indexes for the aligner you choose to use. Instructions for creating the index files are available for:

    bwa: http://bio-bwa.sourceforge.net/bwa.shtml
    bowtie2: http://bowtie-bio.sourceforge.net/bo...-build-indexer
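
    The index-building step for the two aligners linked above can be sketched as follows. This is only a minimal sketch: the file name sk1.fa and the index basename SK1 are assumed names, and the commands run only if the corresponding tool is installed and on your PATH.

```shell
# Assumed names: sk1.fa is the downloaded genome FASTA, SK1 the index basename.
fasta="sk1.fa"
index_base="SK1"

if [ ! -r "$fasta" ]; then
    # Nothing to index if the FASTA cannot be read from this directory.
    echo "Cannot read $fasta from $(pwd)"
elif command -v bowtie2-build >/dev/null 2>&1; then
    # bowtie2-build takes the reference FASTA, then the index basename.
    bowtie2-build "$fasta" "$index_base"
elif command -v bwa >/dev/null 2>&1; then
    # bwa index takes the FASTA; -p sets the prefix of the output files.
    bwa index -p "$index_base" "$fasta"
else
    echo "No indexer found on PATH (looked for bowtie2-build and bwa)"
fi
```

    Only one of the two indexers is needed; which one depends on the aligner you plan to use for the ChIP-seq reads.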

    Here is a link to get comprehensive/basic information about NGS alignments: http://en.wikibooks.org/wiki/Next_Ge...S%29/Alignment
    ChIP-seq for ENCODE: http://www.ncbi.nlm.nih.gov/pubmed/22955991
    Last edited by GenoMax; 08-14-2013, 05:00 PM.

    • #3
      Thanks for the reply. I downloaded the sequence from the Steinmetz lab, but when I follow the instructions to build the index I get an error message. Here is the output, copied and pasted:

      ./bowtie-build sk1.fa SK1
      Settings:
      Output files: "SK1.*.ebwt"
      Line rate: 6 (line is 64 bytes)
      Lines per side: 1 (side is 64 bytes)
      Offset rate: 5 (one in 32)
      FTable chars: 10
      Strings: unpacked
      Max bucket size: default
      Max bucket size, sqrt multiplier: default
      Max bucket size, len divisor: 4
      Difference-cover sample period: 1024
      Endianness: little
      Actual local endianness: little
      Sanity checking: disabled
      Assertions: disabled
      Random seed: 0
      Sizeofs: void*:4, int:4, long:4, size_t:4
      Input files DNA, FASTA:
      sk1.fa
      Error: could not open sk1.fa
      Total time for call to driver() for forward index: 00:00:00
      Command: ./bowtie-build sk1.fa SK1

      I initially put the .fa file in the 'Indexes' folder, then moved it to 'Genomes'. In both cases bowtie-build failed to open the .fa file.
      I will try again with the other genome assemblies you suggested.
      Thanks!

      • #4
        It may be better to run the bowtie-build command from inside the directory that contains your sk1.fa file (which is also where you would want to keep the indexes for future access).

        A simplified version would be:

        Code:
        $ cd /to_directory_containing_fasta_file
        $ /full_directory_path_to/bowtie-build ./sk1.fa SK1
        Be patient and leave this process alone while it runs. Depending on the specs of the computer you are using, it may take an hour (or longer) for the indexes to build.
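
        The "could not open sk1.fa" error in the earlier output is what you would expect when the file is not in the directory the command is run from. A quick way to check this before building is sketched below. The helper name check_fasta is hypothetical, and the scratch directory with a tiny stand-in sk1.fa exists only to make the example self-contained.

```shell
# Hypothetical helper: report whether a FASTA file can be read using the
# given path, which is resolved relative to the current directory.
check_fasta() {
    fasta="$1"
    if [ -r "$fasta" ]; then
        echo "OK: $fasta is readable from $(pwd)"
    else
        echo "Missing: $fasta is not readable from $(pwd)"
        return 1
    fi
}

# Scratch directory holding a tiny stand-in for the real genome file.
workdir=$(mktemp -d)
printf '>chrI\nACGT\n' > "$workdir/sk1.fa"

# From inside the directory, the bare file name resolves.
(cd "$workdir" && check_fasta sk1.fa)

# From anywhere else, the same bare name does not, which mirrors the error above.
check_fasta sk1.fa || echo "cd into the FASTA directory or give its full path"
```

        With the real file, running ls -l sk1.fa in the directory where you launch bowtie-build gives the same information.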

        Are you a biologist learning how to use the command line? Getting a good intro to Unix will save you a lot of aches as you step into the world of the command line. One such guide is here: http://korflab.ucdavis.edu/Unix_and_...rl_v3.1.1.html

        • #5
          I am a biologist learning the command line. I learned a lot last night, and have been looking up commands on the websites. It seems that my file cannot be read into bowtie, maybe because input files need to be comma delimited and mine are not. I will try compiling separate files for each chromosome, which will all be output into a single SK1 index. Maybe this workaround will be successful...

          • #6
            In reply to Feeling ChIPper:
            Genome files are generally provided as multi-FASTA format files (see example here: http://en.wikipedia.org/wiki/FASTA_format). The downloads from the two genome sources should already be in that format. The comma-separated list mentioned in the bowtie-build usage refers to a list of FASTA file names, not to the contents of the files, so I am not sure why you are trying to separate the chromosomes or convert them into comma-delimited files.
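
            To see whether a download is already a multi-FASTA file, you can count and list its header lines, which start with ">". The sketch below uses a tiny made-up two-record file (the names chrI and chrII and the sequences are placeholders) so that it runs on its own.

```shell
# Build a tiny two-record multi-FASTA file as a stand-in for a real genome.
demo_fasta=$(mktemp)
printf '>chrI\nACGTACGT\n>chrII\nTTGGCCAA\n' > "$demo_fasta"

# Each record starts with a ">" header line, so counting those lines
# gives the number of sequences in the file.
n_records=$(grep -c '^>' "$demo_fasta")
echo "Sequence records: $n_records"

# List the record names to confirm they look like chromosomes or contigs.
grep '^>' "$demo_fasta"
```

            On the real file the same check is grep -c '^>' sk1.fa. A count greater than one means the chromosomes or contigs are already together in one file, and that single file can be given to bowtie-build directly.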
