  • newbie desperate and confused

    Hi all,

    I am completely new to sequencing.
    I am a computer science student but I am working on a bioinformatics project on whole genome functional annotation.

    My data is in csfasta format.
    How do I change this to fasta format?
    I am also very confused: what is the difference between the F3.csfasta file and the F5.csfasta file?

    Additionally, I have been told that the data is in clc format. What does this mean?

    Does anyone know of any good tools to do whole genome functional annotations?

    I am extremely desperate and very very confused. Any information would be very much appreciated.

    Thank you.

  • #2
    Hope this helps


    • #3
      I'll hazard a guess that your csfasta files came from an ABI SOLiD sequencer, so you need to do some further analysis, e.g. using the tools in the ABI BioScope package to find the SNVs and indels in your data. The output will then be in GFF format. Google can find a PDF of the BioScope manual.
      Once you've got your list of SNVs and filtered out the low-quality reads, an easy way to annotate them is to feed the list into the online SeattleSeq Annotation tool.


      • #4
        Read up on genome annotation first.

        Then read about SOLiD sequencing.

        You have SOLiD sequenced reads. F3 and F5 are the two reads of each pair in a paired-end library (a mate-pair library would give F3 and R3 files instead).

        The clc files are for bioinformatics software from CLC bio (e.g. CLC Genomics Workbench). They are probably mapping .sam files that have been converted to clc format.
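
On the original csfasta-to-fasta question, here is a minimal sketch of what a naive conversion involves. It assumes the usual SOLiD layout: optional `#` comment lines, a `>` header per read, then a read line that starts with one primer base followed by color digits 0-3 (with `.` for a missing call). The function names `decode_colors` and `csfasta_to_fasta` are made up for illustration and do not come from BioScope or any other package.

```python
# Illustrative sketch only; helper names are hypothetical, not from any SOLiD tool.
BASES = "ACGT"
CODE = {base: index for index, base in enumerate(BASES)}  # A=0, C=1, G=2, T=3


def decode_colors(cs_read):
    """Translate one color-space read (e.g. 'T0123') into base space.

    Each color is the XOR of the 2-bit codes of two adjacent bases, so the
    next base is recovered as previous_base XOR color. The leading primer
    base only seeds the decoding and is not part of the output.
    """
    prev = CODE[cs_read[0].upper()]
    out = []
    for ch in cs_read[1:]:
        if prev is None or ch not in "0123":
            # A missing call ('.') makes every later base undeterminable.
            prev = None
            out.append("N")
            continue
        prev ^= int(ch)
        out.append(BASES[prev])
    return "".join(out)


def csfasta_to_fasta(lines):
    """Yield FASTA lines from an iterable of csfasta lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed '#' comment headers
        if line.startswith(">"):
            yield line
        else:
            yield decode_colors(line)


if __name__ == "__main__":
    example = ["# example comment", ">1_2_3_F3", "T0123"]
    print("\n".join(csfasta_to_fasta(example)))
```

Note that with this kind of direct translation, a single wrong color call changes every base after it, which is why SOLiD reads are normally aligned in color space by a color-aware mapper rather than converted first. Treat the sketch as a way to understand the format, not as a recommended pipeline step.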
