  • perl script for fetching lines, if a pattern is given

    I am very new to this field. I have two files (a few lines of each are given below). The first is a tab-delimited text file containing contig numbers with their annotations on the same line. The second contains only IDs. I need to fetch the annotations from the first file for the IDs present in the second file. Could you please provide a Perl script?

    first file:-
    contig06610 Q9SQV1 GO:0000184 GO:0006139 GO:0006139 nucleobase, nucleoside, nucleotideandnucleicacidmetabolicprocess biological_process
    contig07217 B0S733 GO:0000184 GO:0006139 GO:0006139 nucleobase, nucleoside, nucleotideandnucleicacidmetabolicprocess biological_process
    contig08155 O13828 GO:0000184 GO:0006139 GO:0006139 nucleobase, nucleoside, nucleotideandnucleicacidmetabolicprocess biological_process
    contig10605 O01510 GO:0000184 GO:0006139 GO:0006139 nucleobase, nucleoside, nucleotideandnucleicacidmetabolicprocess biological_process
    contig12296 Q9FJR0 GO:0000184 GO:0006139 GO:0006139 nucleobase, nucleoside, nucleotideandnucleicacidmetabolicprocess biological_process


    second file:-
    contig07217
    contig07217
    contig10605
    contig12296

  • #2
    Forget perl, just use grep (grep -f file_with_ids file_to_search).
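
    A minimal sketch of how that could look on the data from the question. The file names first_file.txt and second_file.txt, the scratch directory, and the shortened three-column rows are assumptions for illustration; the -F and -w flags are additions that go beyond the plain grep -f suggested above.

    ```shell
    # Work in a scratch directory so nothing in the current directory is overwritten.
    workdir=$(mktemp -d)
    cd "$workdir" || exit 1

    # Abbreviated copies of the two files from the question (tab-delimited).
    printf 'contig06610\tQ9SQV1\tGO:0000184\n'  > first_file.txt
    printf 'contig07217\tB0S733\tGO:0000184\n' >> first_file.txt
    printf 'contig10605\tO01510\tGO:0000184\n' >> first_file.txt
    printf 'contig12296\tQ9FJR0\tGO:0000184\n' >> first_file.txt
    printf 'contig07217\ncontig07217\ncontig10605\ncontig12296\n' > second_file.txt

    # -f: read one pattern per line from second_file.txt
    # -F: treat each pattern as a fixed string, not a regular expression
    # -w: match whole words only, so an ID like "contig0721" would not match "contig07217"
    grep -F -w -f second_file.txt first_file.txt > matched.txt
    cat matched.txt
    ```

    Each matching line is printed once, in the order it appears in first_file.txt, even though contig07217 is listed twice in the ID file.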

    • #3
      Originally posted by dpryan
      Forget perl, just use grep (grep -f file_with_ids file_to_search).
      I didn't even realize we could do that with grep. Does it just take each line of the file as an additional pattern?

      Then, after grep, you'd have a list of matching lines on stdout, like

      Code:
      contig07217	B0S733	GO:0000184	GO:0006139	GO:0006139	nucleobase, nucleoside, nucleotideandnucleicacidmetabolicprocess
      From there, use awk to extract the particular columns you want. I presume columns 3, 4 and 5 for the Gene Ontology terms?
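
      A rough sketch of that awk step, assuming the tabs fall between the fields as in the code line above and that the contig name (column 1) should be kept next to the GO terms. The variable names line and go_columns are just for illustration.

      ```shell
      # One tab-delimited line, shaped like the grep output for an ID in the list.
      line=$(printf 'contig07217\tB0S733\tGO:0000184\tGO:0006139\tGO:0006139\tnucleobase, nucleoside, nucleotideandnucleicacidmetabolicprocess')

      # -F'\t' splits on tabs only, so the description field may contain spaces.
      # OFS='\t' keeps the output tab-delimited.
      go_columns=$(printf '%s\n' "$line" | awk -F'\t' -v OFS='\t' '{ print $1, $3, $4, $5 }')
      printf '%s\n' "$go_columns"
      ```

      In practice the grep output would be piped straight into the same awk command instead of going through a variable.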
      Last edited by ctseto; 11-14-2013, 12:11 PM.

      • #4
        while read -r id; do grep -w "$id" first_file.txt; done < second_file.txt
        Even join should work, provided both files are sorted by contig name.
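
        The join route could be sketched like this, assuming tab-delimited input and the same file names as above. The sorted intermediate files (ids.sorted, annotations.sorted), the scratch directory, and the shortened rows are made up for the example.

        ```shell
        # Work in a scratch directory so nothing in the current directory is overwritten.
        workdir=$(mktemp -d)
        cd "$workdir" || exit 1

        export LC_ALL=C          # make sort and join agree on byte-wise ordering
        tab=$(printf '\t')

        # Abbreviated copies of the two files from the question, deliberately unsorted.
        printf 'contig12296\tQ9FJR0\tGO:0000184\n'  > first_file.txt
        printf 'contig06610\tQ9SQV1\tGO:0000184\n' >> first_file.txt
        printf 'contig07217\tB0S733\tGO:0000184\n' >> first_file.txt
        printf 'contig10605\tO01510\tGO:0000184\n' >> first_file.txt
        printf 'contig07217\ncontig07217\ncontig10605\ncontig12296\n' > second_file.txt

        # join needs both inputs sorted on the join field; -u also drops the repeated ID.
        sort -u second_file.txt > ids.sorted
        sort -t "$tab" -k1,1 first_file.txt > annotations.sorted

        # Join on the first field of each file, keeping tabs as the separator.
        join -t "$tab" ids.sorted annotations.sorted > joined.txt
        cat joined.txt
        ```

        Unlike grep, join compares only the first column, so an ID that happens to appear in an annotation field cannot produce a false match.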

        • #5
          Thanks a lot, Ryan, ctseto and vivek.
