
  • MG-RAST - header format problem?

    Hi everyone,

    After successfully using MG-RAST with assembled data, I am trying to use it with raw Illumina FASTQ reads. I originally had two FASTQ files (forward and reverse paired-end reads); I merged them into a single file and uploaded it.

    I now receive an error message:
    Warning: The unique id count does not match the sequence count. You will not be able to use this file for submission.

    Basically, the unique ID count is half the number of sequences.
    My reads are ordered as forward and reverse, with headers in the following format:

    @HWUSI-EAS1700R:25:FC:6:1:12466:1106 1:Y:0:TTAGGC

    and

    @HWUSI-EAS1700R:25:FC:6:1:12466:1106 2:Y:0:TTAGGC

    My guess is that I may need to modify the header. Any suggestion?
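
    One way to test this guess is a minimal Python sketch like the one below. It assumes, without confirmation from the MG-RAST documentation, that the ID is taken as the header text before the first whitespace. The function name `count_ids` is hypothetical, and the sequence and quality lines in the sample are made up; only the two headers come from the real data.

```python
def count_ids(fastq_lines):
    """Return (record count, unique ID count) for 4-line FASTQ records."""
    ids = []
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # every 4th line, starting at 0, is a header
            # Assumption: the ID is the header text before the first whitespace,
            # with the leading "@" dropped.
            ids.append(line[1:].split()[0])
    return len(ids), len(set(ids))


# Two records with the headers from this post; sequence/quality are placeholders.
sample = [
    "@HWUSI-EAS1700R:25:FC:6:1:12466:1106 1:Y:0:TTAGGC", "ACGT", "+", "IIII",
    "@HWUSI-EAS1700R:25:FC:6:1:12466:1106 2:Y:0:TTAGGC", "ACGT", "+", "IIII",
]
print(count_ids(sample))
```

    If the unique ID count comes out at half the record count, as it does for this sample, the truncated header is a likely cause of the warning.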

    Thanks
    Max

    - Edit: I should be able to modify the header by myself (I know a little bit of Python), but I am not sure if that is the problem and what my header should be.
    Thanks again
    Max
    Last edited by mstagliamonte; 06-18-2013, 06:12 AM. Reason: Clarify my request

  • #2
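    Only the first part of the header (everything before the space) is being used to identify the read, so the forward and reverse reads of a pair end up with the same ID.
    Just replace the space with a "_" or another character.


    Instead of:
    @HWUSI-EAS1700R:25:FC:6:1:12466:1106 1:Y:0:TTAGGC

    use: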

    @HWUSI-EAS1700R:25:FC:6:1:12466:1106_1:Y:0:TTAGGC

    Something like

    Code:
    sed 'e/\ /\_/g' seqfile > seqfile_ed
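
    The header change described above can also be sketched in Python. This is only an illustration under two assumptions: the file uses strict 4-line FASTQ records, and replacing the first space in each header is enough to make the IDs unique. The function names `fix_headers` and `fix_file` are hypothetical.

```python
def fix_headers(lines):
    """Yield FASTQ lines with the first space in each header replaced by '_'."""
    for i, line in enumerate(lines):
        if i % 4 == 0:  # header line of a 4-line record
            yield line.replace(" ", "_", 1)
        else:  # sequence, '+' and quality lines pass through unchanged
            yield line


def fix_file(src, dst):
    """Stream src to dst, rewriting only the header lines."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(fix_headers(fin))


# Example call, using the file names from the sed command:
# fix_file("seqfile", "seqfile_ed")
```

    Touching only the header lines keeps the sequence and quality lines out of the substitution.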



    • #3
      Thanks,

      I'll try immediately and let you know.

      Regards,
      Max



      • #4
        Sorry, the command should have been:

        Code:
        sed 's/ /_/' seqfile > seqfile_ed



        • #5
          Hahaha,

          I noticed.

          I was not able to fix it, so I've just started running my Python script.

          Let's see how it goes.



          • #6
            Hi, Ciaran,

            many thanks for your advice, it worked.

            Have a nice day
            Max
