
  • How to get hg19.fa?

    I am trying to download the hg19 reference genome from the UCSC site.
    I tried to convert hg19.2bit to hg19.fa with twoBitToFa from the UCSC tools.
    It said "cannot execute binary file".
    Then I tried
    "cat chr1.fa chr2.fa chr3.fa chr4.fa chr5.fa chr6.fa chr7.fa chr8.fa chr9.fa chr10.fa chr11.fa chr12.fa chr13.fa chr14.fa chr15.fa chr16.fa chr17.fa chr18.fa chr19.fa chr20.fa chr21.fa chr22.fa chrX.fa chrY.fa >hg19.fa"
    But when I used this hg19.fa in bwa, it said
    "[bwa_index] fail to open file 'hg19.fa'. Abort!
    Aborted"

    I still cannot get a reference sequence built.

    Note: I am using Linux and am a beginner.
    Last edited by ninad; 08-18-2011, 09:51 PM.

  • #2
    I actually used the exact same command to generate my hg19.fa file and it worked perfectly...
    Are you sure you gave bwa the correct path to the file?

    • #3
      Yes, in fact I am in the same directory. Here is the command:
      "bwa index -a bwtsw -p hg19_bwa hg19.fa"

      Otherwise, have you used UCSC's twoBitToFa?
      Is there any other source for this file?

      • #4
        Just a recommendation: I would not use the "alternative name prefix" option -p. It helps to keep it simple. Also, in the following commands you will typically have to specify the base file (hg19.fa), which then points to all the other aligner-specific index files (in the same directory).

        So this should be sufficient:

        Code:
        bwa index -a bwtsw hg19.fa

        • #5
          Thanks sdvie for the suggestion.

          Someone please help ASAP. Can anyone upload a genome to RapidShare and share the link? I know it sounds ridiculous, but is there a more sophisticated way?

          Or please help with twoBitToFa usage.

          • #6
            You can get the utility program twoBitToFa from here:

            http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/

            Once you have downloaded it, you must first change its permissions so that it can be executed as a program.
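The permission change mentioned above could look like the following minimal sketch. The path `./twoBitToFa`, the `TWOBIT_TOOL` variable, and the `make_executable` helper are assumptions for illustration; substitute the location where you saved the download.

```shell
# Hypothetical location of the downloaded binary; adjust to your own path.
tool="${TWOBIT_TOOL:-./twoBitToFa}"

# Add the execute bit for the file's owner.
make_executable() {
    chmod u+x "$1"
}

if [ -f "$tool" ]; then
    make_executable "$tool"
    ls -l "$tool"    # the owner permissions should now include "x"
fi
```

If the file is owned by another user, the change may need to be made by that user or by an administrator.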

            Then you execute it from a terminal:

            without arguments to see the options:

            Code:
            $ /path/to/twoBitToFa
            
            twoBitToFa - Convert all or part of .2bit file to fasta
            usage:
               twoBitToFa input.2bit output.fa
            options:
               -seq=name - restrict this to just one sequence
               -start=X  - start at given position in sequence (zero-based)
               -end=X - end at given position in sequence (non-inclusive)
               -seqList=file - file containing list of the desired sequence names 
                                in the format seqSpec[:start-end], e.g. chr1 or chr1:0-189
                                where coordinates are half-open zero-based, i.e. [start,end)
               -noMask - convert sequence to all upper case
               -bpt=index.bpt - use bpt index instead of built in one
               -bed=input.bed - grab sequences specified by input.bed. Will exclude introns
            
            Sequence and range may also be specified as part of the input
            file name using the syntax:
                  /path/input.2bit:name
               or
                  /path/input.2bit:name
               or
                  /path/input.2bit:name:start-end

            You will only need to execute the simple command:

            Code:
            $ /path/to/twoBitToFa /path/to/hg19.2bit /path/to/hg19.fa

            Good luck.

            • #7
              Thanks sdvie, but I had already gone through these steps. Unfortunately, my Linux is i686 and not x86_64.
              That's why it could not execute the binary, I suppose.
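A quick way to confirm this kind of mismatch is sketched below, assuming the binary was saved as `./twoBitToFa` (a hypothetical path; `TWOBIT_TOOL` and `system_arch` are illustrative names too). `uname -m` reports the machine architecture, and `file`, where installed, reports what the binary was built for. A 64-bit executable on an i686 system typically produces the "cannot execute binary file" message.

```shell
# Architecture of the running system, e.g. i686 (32-bit) or x86_64 (64-bit).
system_arch() {
    uname -m
}

echo "System architecture: $(system_arch)"

tool="${TWOBIT_TOOL:-./twoBitToFa}"   # hypothetical path to the download
if [ -f "$tool" ] && command -v file >/dev/null 2>&1; then
    # Describes the binary's target, e.g. whether it is a 32-bit or 64-bit ELF file.
    file "$tool"
fi
```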

              Now my only option is "cat chr*.fa", which did not work.
              I am still stuck trying to obtain the human reference genome hg19!

              • #8
                Just download it from here:
                http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz
                then
                tar -zxvf chromFa.tar.gz
                and
                cat chr*.fa > hg19.fa
                Done!
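A minimal sketch for sanity-checking the result before indexing, assuming the concatenated file is named `hg19.fa` as above. `count_fasta_records` is a hypothetical helper that counts header lines (those starting with `>`); the count should equal the number of sequences that were concatenated. Note that `chr*.fa` also matches any additional sequence files the archive may contain, not only chr1 to chr22, chrX and chrY.

```shell
# Count FASTA records by counting header lines (lines starting with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}

fasta="${FASTA:-hg19.fa}"   # assumed output name from the cat command
if [ -s "$fasta" ]; then
    echo "records: $(count_fasta_records "$fasta")"
    ls -l "$fasta"
else
    echo "$fasta is missing or empty"
fi
```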
                Last edited by raonyguimaraes; 08-19-2011, 01:58 AM.

                • #9
                  Thanks raonyguimaraes,
                  but I already tried that and it still gives the same problem.
                  These are the commands I ran:
                  ~/chromFa$ cat *.fa >hg19s.fa
                  ~/chromFa$ bwa index -a bwtsw hg19s.fa
                  And this was the output:
                  [bwa_index] fail to open file 'hg19s.fa'. Abort!
                  Aborted
                  This is not working right....

                  • #10
                    Found a similar question: http://seqanswers.com/forums/archive...hp/t-5236.html

                    I have no idea what's going on... Check the version of your bwa.

                    • #11
                      Improbable, but: is enough memory available?

                      • #12
                        Could be http://seqanswers.com/forums/archive...p/t-10766.html

                        Try a small file: index only chr1.fa.

                        • #13
                          I tried to reproduce the error, but it worked perfectly with newly downloaded chr*.fa files...

                          Is it possible that you have no read permissions on the file?
                          My bwa version is 0.5.9-r16

                          Besides that I can't think of anything else...
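The read-permission question above can be checked with a short sketch like this one. The file name `hg19.fa`, the `FASTA` variable, and the `is_readable_input` helper are assumptions for illustration.

```shell
# Succeeds only if the file exists, is non-empty, and is readable by the current user.
is_readable_input() {
    [ -s "$1" ] && [ -r "$1" ]
}

fasta="${FASTA:-hg19.fa}"   # hypothetical path to the concatenated FASTA
if is_readable_input "$fasta"; then
    echo "$fasta looks readable"
else
    echo "$fasta is missing, empty, or not readable"
    ls -l "$fasta" 2>/dev/null || true   # show owner and permission bits if it exists
fi
```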

                          • #14
                            I went through this earlier, but I am not getting a segmentation fault. I also have the latest version of bwa, 0.5.7.

                            I am clueless too.

                            • #15
                              @peter
                              I will try the specifications you mentioned, Peter. My bwa is 0.5.7 (r1310).

                              @raonyguimaraes
                              I have tried using the chr1.fa file and it works perfectly fine.
                              My hg19.fa file is 3 GB in size, so considering the bwa indexing limit of 4 GB, it should still work.

                              I have checked the permissions for the file; they are perfectly fine.
                              I believe the problem is either in the concatenation or in the size limit. I have used cat in both of the ways mentioned above and it is still not resolved.
                              I don't know what else the problem could be.

                              @sdvie - How can I check whether enough memory is available?
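One way to look at both memory and file size is sketched below, assuming a Linux system with the `free` tool and a FASTA named `hg19.fa` (`FASTA`, `file_size_bytes` and `exceeds_2gib` are hypothetical names). The size check is included because a 32-bit program built without large-file support cannot open files larger than 2 GiB; whether that applies to this bwa build is an assumption, not something confirmed in this thread.

```shell
# Size of a file in bytes (arithmetic expansion trims any padding from wc).
file_size_bytes() {
    echo $(( $(wc -c < "$1") ))
}

# True if the file is larger than 2 GiB - 1 bytes, the largest size a
# 32-bit program built without large-file support can handle.
exceeds_2gib() {
    [ "$(file_size_bytes "$1")" -gt 2147483647 ]
}

# Report total, used and free memory in megabytes where `free` is available.
if command -v free >/dev/null 2>&1; then
    free -m
fi

fasta="${FASTA:-hg19.fa}"   # hypothetical path to the concatenated FASTA
if [ -f "$fasta" ]; then
    if exceeds_2gib "$fasta"; then
        echo "$fasta is larger than 2 GiB"
    else
        echo "$fasta is within 2 GiB"
    fi
fi
```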
