  • ashkot
    Member
    • Nov 2011
    • 59

    Reference FASTA file for sequencing use

    Hi all,
    I downloaded FASTA files from NCBI and tried to use them in my sequencing pipeline. The issue is that the sequences in the FASTA file are named by their GenBank accessions, so the GenBank accessions also appear in the VCF file, which is undesirable.

    Is there a way to prepare a FASTA file for sequencing use so that chromosome numbers are used as the sequence names instead?

    Thanks in advance.
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    What "genome" is this?

    You can get pre-formatted sequence, annotation and index files for a number of common organisms at the iGenomes site: http://support.illumina.com/sequenci...e/igenome.html

    • ashkot
      Member
      • Nov 2011
      • 59

      #3
      That site only goes back as far as hg18 for human, but what if I want to use much older genome assemblies? My work is human genome analysis.

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        I would suggest getting older data from UCSC (http://hgdownload.soe.ucsc.edu/downloads.html#human). Look for the "chromFa.zip" files under the "Full Dataset" links.

        • ashkot
          Member
          • Nov 2011
          • 59

          #5
          We did take those files, and there is no issue with them. The problem is with some other old genomes that we also need.

          I am wondering if there is some code that can standardize sequence names across FASTA files?

          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            You could manually change the FASTA headers to suit your purposes and remake the indexes.
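If editing the headers by hand is impractical, the renaming could be scripted. Below is a minimal sketch, not a tested pipeline step: `rename_fasta_headers` and `NAME_MAP` are hypothetical names, and the accessions in the mapping are placeholders rather than real GenBank IDs. You would need to build the real accession-to-chromosome table yourself for the assembly in question. It assumes the accession is either the first word of the header or one of its pipe-separated fields, and it leaves any header it cannot match unchanged.

```python
def rename_fasta_headers(lines, name_map):
    """Yield FASTA lines, replacing header IDs found in name_map."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            # Older NCBI headers may be pipe-delimited (gi|...|gb|ACCESSION|),
            # so try the whole ID first and then each pipe-separated field.
            candidates = [seq_id] + seq_id.split("|")
            new_id = next((name_map[c] for c in candidates if c in name_map), None)
            if new_id is not None:
                # Keep only the new name; the description text is dropped.
                line = ">" + new_id + "\n"
        yield line


# Placeholder mapping (assumption): fill in the real accessions for your assembly.
NAME_MAP = {"ACC000001.1": "chr1", "ACC000002.1": "chr2"}

example = [
    ">ACC000001.1 Homo sapiens chromosome 1\n",
    "ACGT\n",
    ">gi|12345|gb|ACC000002.1| Homo sapiens chromosome 2\n",
    "GGCC\n",
    ">UNMAPPED.1 some other sequence\n",
    "TTAA\n",
]
renamed = list(rename_fasta_headers(example, NAME_MAP))
```

For a real file you would pass an open file handle as `lines` and write the yielded lines to a new FASTA. Since the sequence names change, any existing indexes no longer match and have to be rebuilt, for example with `samtools faidx` plus whatever index your aligner needs.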
