  • litali
    Member
    • Jul 2010
    • 78

    singletons+contigs for 454 data

    Hi,
    I would like to create a single file that contains both the singletons and the contigs. Is there a way to create such a file with the 454 software, or do I need a script that extracts the singleton names from the read status file, pulls the matching sequences from the FASTA file, and appends them to the contigs file?
    Thanks a lot!
  • Torst
    Senior Member
    • Apr 2008
    • 275

    #2
    I don't know of any script. I am replying to remind you that 454 sometimes uses only PART of a read (the "left" end) and then lists the "right" end in 454ReadStatus.txt under the name "XXXXXX_right" with status "Singleton", so you'll need to decide what you want to do with those, e.g.:

    % grep Singleton 454ReadStatus.txt
    GHFU8EI02CHPMJ_left Singleton
    GHFU8EI02B9FPY Singleton
    GHFU8EI02CJNN4_right Singleton
    GHFU8EI02CA0E9 Singleton
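One way to keep the whole-read singletons apart from the `_left`/`_right` pieces is sketched below. It is only an illustration: the function names are made up here, and it assumes the accession is the first whitespace-separated column and the status the second, as in the listing above.

```shell
# Print the accession of every line whose second column is "Singleton".
# Reads 454ReadStatus.txt-style lines from stdin.
singleton_names() {
    awk '$2 == "Singleton" { print $1 }'
}

# Whole reads only: drop names that end in _left or _right.
whole_singletons() {
    singleton_names | grep -v -E '_(left|right)$'
}

# Partial reads: keep only _left/_right names, strip the suffix,
# and report each underlying read once.
partial_singletons() {
    singleton_names | grep -E '_(left|right)$' | sed -E 's/_(left|right)$//' | sort -u
}
```

For example, `whole_singletons < 454ReadStatus.txt > singleton_names.txt` would give a name list with the split reads left out, and `partial_singletons` shows which reads were only partly used.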


    To get the .FASTA sequences from the .SFF file, you'll need to use "sffinfo":

    % sffinfo -seq file.sff > file.fasta
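With all reads in FASTA form, the approach from the original question (pick the singleton sequences by name and add them to the contigs) could look roughly like the sketch below. `fasta_subset` and the file names in the usage comments are hypothetical, and the sketch assumes the read name is the header text between `>` and the first space.

```shell
# Usage: fasta_subset names.txt reads.fasta
# Prints only the FASTA records whose ID appears in names.txt
# (one ID per line). Multi-line sequences are kept intact.
fasta_subset() {
    awk '
        FILENAME == ARGV[1] { wanted[$1] = 1; next }   # first file: load IDs
        /^>/ { id = substr($1, 2); keep = (id in wanted) }
        keep { print }
    ' "$1" "$2"
}

# Possible use (hypothetical file names):
#   fasta_subset singleton_names.txt file.fasta > singletons.fasta
#   cat contigs.fna singletons.fasta > contigs_plus_singletons.fasta
```

Names carrying a `_left` or `_right` suffix will not match any FASTA header this way, so those reads need separate handling.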

    Also, if you did paired end sequencing, the 454Scaffolds.fna file does NOT CONTAIN those contigs in 454Contigs.fna which failed to scaffold.


    • westerman
      Rick Westerman
      • Jun 2008
      • 1104

      #3
      I thought that _left and _right should arise from paired end reads and not from split reads.

      Basically what I do is to:

      1) Grab the reads of choice from the 454ReadStatus.txt file and, optionally, the 454TrimStatus file

      2) Use sfffile to create a temporary sff file with just those reads

      3) Use sffinfo to extract the sequences.

      The rough steps are:

      grep "$(printf '\t')Singleton" 454ReadStatus.txt > /tmp/Singleton.tmp

      sfffile -o /tmp/Singleton.sff /tmp/Singleton.tmp mysff.sff

      sffinfo -s /tmp/Singleton.sff > Singleton.tfa


      • flxlex
        Moderator
        • Nov 2008
        • 412

        #4
        Originally posted by Torst
        Also, if you did paired end sequencing, the 454Scaffolds.fna file does NOT CONTAIN those contigs in 454Contigs.fna which failed to scaffold.
        That is not entirely true: 454Scaffolds.txt contains the scaffolds (at least two contigs with gap(s)) AND all unscaffolded contigs of at least 2kb. IMO they shouldn't have done that, but rather outputted a separate unscaffolded-contig file...

        Originally posted by westerman
        sfffile -o /tmp/Singleton.sff /tmp/Singleton.tmp mysff.sff
        I guess you mean

        Code:
        sfffile -o /tmp/Singleton.sff -i /tmp/Singleton.tmp mysff.sff
        (note the '-i')


        • Torst
          Senior Member
          • Apr 2008
          • 275

          #5
          flxlex,

          Originally posted by flxlex
          That is not entirely true: 454Scaffolds.txt contains the scaffolds (at least two contigs with gap(s)) AND all unscaffolded contigs of at least 2kb. IMO they shouldn't have done that, but rather outputted a separate unscaffolded-contig file...
          Hmm, it appears you are correct. Thank you for replying! I had not noticed the "1 contig scaffolds" because, as you said, it is inconsistent and they get renamed to "scaffoldNNNNNN". But yes, when I examine 454Scaffolds.txt I can see many scaffolds which are made up of only 1 contig. I find it hard to accept that they would use different thresholds for "contigs becoming scaffolds" and "large contigs", and NOT output a separate unscaffolded contigs file too.

          Also, you suggest the cut-off is 2kbp, but in my example 10 of the 22 contigs are between 1356bp and 1870bp, which suggests maybe the cutoff is 1kbp?

          Either way - thank you muchly for catching my error!
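A quick way to check the cutoff empirically is to list the single-contig scaffolds with their lengths, shortest first. The sketch below assumes 454Scaffolds.txt is tab-separated in AGP layout (scaffold name in column 1, end coordinate in column 3, component type in column 5, with "W" marking a contig); the function name is made up for illustration.

```shell
# Usage: single_contig_scaffolds 454Scaffolds.txt
# Prints "scaffold<TAB>length" for scaffolds built from exactly one
# contig, sorted by increasing length.
single_contig_scaffolds() {
    awk -F '\t' '
        /^#/ { next }                                    # skip comment lines
        $5 == "W" { contigs[$1]++ }                      # count contig parts
        ($3 + 0) > (len[$1] + 0) { len[$1] = $3 + 0 }    # largest end coordinate
        END {
            for (s in contigs)
                if (contigs[s] == 1) print s "\t" len[s]
        }
    ' "$1" | sort -k2,2n
}
```

The first line of the output is the shortest unscaffolded contig that was promoted to a scaffold, which gives a lower bound on whatever threshold was applied.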


          • westerman
            Rick Westerman
            • Jun 2008
            • 1104

            #6
            Originally posted by flxlex

            I guess you mean

            Code:
            sfffile -o /tmp/Singleton.sff -i /tmp/Singleton.tmp mysff.sff
            (note the '-i')
            Yes, that is what I get for pulling the code out of a script that I use instead of typing it in directly. Thanks for the correction.
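Putting the corrected commands together, the whole procedure might be wrapped up as in the sketch below. The function name and its arguments are hypothetical. It makes two choices that go beyond the commands in this thread: only the accession column is written to the include list, and `_left`/`_right` entries are skipped because their names do not correspond to whole reads.

```shell
# Usage: extract_singletons 454ReadStatus.txt reads.sff singletons.fasta
# Runs in a subshell so its variables do not leak into the caller.
extract_singletons() (
    status_file=$1
    sff_in=$2
    fasta_out=$3
    workdir=$(mktemp -d) || exit 1

    # Accession is column 1, status is column 2; skip split reads.
    awk '$2 == "Singleton" && $1 !~ /_(left|right)$/ { print $1 }' \
        "$status_file" > "$workdir/names.txt"

    # Build a temporary SFF with just those reads, then dump sequences.
    sfffile -o "$workdir/singletons.sff" -i "$workdir/names.txt" "$sff_in" &&
        sffinfo -s "$workdir/singletons.sff" > "$fasta_out"
    rc=$?

    rm -rf "$workdir"
    exit "$rc"
)
```

The resulting FASTA file can then be concatenated with the contigs file using `cat` to get singletons and contigs in one place.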

