  • A program to extract the reads and modify the seq ID by adding weight

    Hi everyone,

    I am having trouble running a Perl script (found online, given below) that compares 2 files:

    1) a file with seq IDs and its weight
    2) a file with seq IDs and the sequences.

    I modified the original script a bit and tried it on my data, but it neither prints any output nor reports any errors. In addition, after comparing the files and extracting the matching reads, I want to append the weights from file 1 to the sequence IDs.

    Input files and the script are attached.

    expected output:-

    >comp10003_c0_seq1 len=166 path=[748:0-22 1004:23-46 2527:47-165]_weight=41
    AAGTAGCCTATGCGCTACAGTAAGAAAGACAGGTGAAAAAATGGAAGTAAAACAATTAGA
    TGACTACTTTGGATATACAGAAAAGGGCAGTTCCTTAGAGGGGGAATTACGAGCAGGACT
    AACGACATTCTTGACAATGGCGTACATTCTGTTTGTGAACCCAGAC


    Could anyone please help me out?

    Thank you in advance.

  • #2
    Your script is pulling in the sample_IDs with the '>' attached as well as the count. It then pulls in the sample_reads without the '>' attached. The program thus cannot match up sample_IDs with sample_reads. So there are two problems here -- (1) you are not saving the counts and (2) you cannot match up IDs.
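To make the mismatch concrete, here is a minimal sketch. The sample lines are made up for illustration (the attached files are not shown in the thread), and it assumes the ID file holds '>', the ID, whitespace, then the count:

```perl
use strict;
use warnings;

# Hypothetical sample data; the real attached files are not shown here.
my @id_lines   = ('>comp10003_c0_seq1 41');   # ID file: '>' + ID + count
my $fasta_head = '>comp10003_c0_seq1 len=166 path=[748:0-22 1004:23-46 2527:47-165]';

# What the original line does: the whole input line becomes the hash key.
my %ids;
$ids{$_} += 1 for @id_lines;

# ID as taken from the reads file, with the leading '>' stripped.
my ($read_id) = $fasta_head =~ /^>(\S+)/;

print "key stored : '$_'\n" for keys %ids;
print "key looked : '$read_id'\n";
print exists $ids{$read_id} ? "match\n" : "no match\n";
```

The stored key still carries the '>' and the count, so the lookup with the bare ID finds nothing, and the value stored is just 1 rather than the count.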

    The solution is to re-write the part where you have

    $ids{$_} += 1;

    Let us know if you want more of a hint than that.


    • #3
      Does that mean I have to create a hash of IDs?


      • #4
        Yes, create the hash of IDs. You need to do two things:

        1) Remove the '>'
        2) Split out the counts from the read name and save the counts as the values in your hash.


        • #5
          Could you please help me with how to carry out the steps you mentioned? I am not a very good programmer.


          • #6
            The best way to become a better programmer is to experiment with your programs. :-)

            That said, I would change the line:

            $ids{$_} += 1;

            To

            my ($id, $count) = $_ =~ /^>*(\S+)\s+(\d+)/;
            $ids{$id} = $count;

            Note: I did not test the above. Basically you are taking the input line and looking for:
            1) '>' (optional)
            2) Characters (the id)
            3) Whitespace
            4) Digits (the count)
            And then putting the id and count into your %ids hash
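As a rough sketch of how that could fit into the whole task, the snippet below builds the %ids hash with the regex above and then appends "_weight=N" to each matching header, as in the expected output from the first post. The helper names (parse_id_line, tag_header) and the in-memory sample data are made up for illustration; the real script would read the two attached files instead.

```perl
use strict;
use warnings;

# Parse one ID-file line into (id, count); empty list if the line does not match.
sub parse_id_line {
    my ($line) = @_;
    return $line =~ /^>*(\S+)\s+(\d+)/;
}

# Return the header with "_weight=N" appended, or nothing if the ID has no weight.
sub tag_header {
    my ($header, $weights) = @_;
    my ($id) = $header =~ /^>(\S+)/;
    return unless defined $id && exists $weights->{$id};
    return $header . '_weight=' . $weights->{$id};
}

# Hypothetical in-memory stand-ins for the two input files.
my @id_lines = ('>comp10003_c0_seq1 41', 'comp2_c0_seq1 7');
my @fasta = (
    '>comp10003_c0_seq1 len=166 path=[748:0-22 1004:23-46 2527:47-165]',
    'AAGTAGCCTATGCGCTACAGTAAGAAAGACAGG',
    '>comp999_c0_seq1 len=10 path=[1:0-9]',
    'ACGTACGTAC',
);

# Build the id => count hash.
my %ids;
for my $line (@id_lines) {
    my ($id, $count) = parse_id_line($line);
    $ids{$id} = $count if defined $id;
}

my $keep = 0;   # true while inside a record whose ID has a weight
for my $line (@fasta) {
    if ($line =~ /^>/) {
        my $tagged = tag_header($line, \%ids);
        $keep = defined $tagged;
        print "$tagged\n" if $keep;
    }
    elsif ($keep) {
        print "$line\n";
    }
}
```

With this sample data only the comp10003_c0_seq1 record is printed, with "_weight=41" at the end of its header line. Sequence lines are passed through unchanged, so multi-line sequences stay wrapped as they were.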
