  • Extracting specific lines/rows with awk

    I need to extract the lines/rows that begin with 1 15015 15143 128 2 ... and so on. The file is huge, and the lines don't begin with 1 in all cases.

    Can anyone tell me how to extract them with awk or some other command?
    Thanks


    HWI-ST0764:99:C0BV6ACXX:4:1204:6906:61789 83 1 15143 0 = 14831 -406 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2107:4589:43565 83 1 15176 41 = 14914 -362 Library0
    ---------------------------------------------
    1 15015 15143 128 2 3.00995 >Deletion_xxx_00000000<

    HWI-ST0764:99:C0BV6ACXX:4:2301:8245:154739 83 1 16419 16 = 16094 -425 Library0
    HWI-ST0764:99:C0BV6ACXX:4:1108:15472:66215 99 1 16249 15 = 16540 391 Library0
    HWI-ST0764:99:C0BV6ACXX:4:1108:3056:153621 1123 1 16249 15 = 16558 391 Library0
    ---------------------------------------------
    1 16350 16419 69 3 15.3084 >Deletion_xxx_00000001<

    HWI-ST0764:99:C0BV6ACXX:4:2302:16721:121399 83 1 69682 12 = 69383 -399 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2204:19244:40995 83 1 69785 36 = 69523 -362 Library0
    HWI-ST0764:99:C0BV6ACXX:4:1101:7617:39485 83 1 69797 29 = 69536 -361 Library0
    ---------------------------------------------
    1 69637 69682 45 3 16.6685 >Deletion_xxx_00000002<

    HWI-ST0764:99:C0BV6ACXX:4:1204:14771:27644 99 1 367948 0 = 368253 369 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2103:1214:124930 83 1 368340 0 = 368023 -417 Library0
    ---------------------------------------------
    1 368124 368253 129 2 -0 >Deletion_xxx_00000003<

    HWI-ST0764:99:C0BV6ACXX:4:2306:18464:168765 83 1 802031 57 = 801722 -403 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2202:17216:184485 99 1 801733 37 = 802023 369 Library0
    HWI-ST0764:99:C0BV6ACXX:4:1102:11251:109896 83 1 802054 57 = 801737 -408 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2101:20094:111155 83 1 802052 57 = 801755 -397 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2304:14851:200592 83 1 802033 60 = 801771 -362 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2204:16036:62667 99 1 801789 37 = 802080 372 Library0
    HWI-ST0764:99:C0BV6ACXX:4:1307:14444:68097 99 1 801800 57 = 802117 403 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2206:4525:22206 83 1 802136 0 = 801822 -399 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2106:3993:7815 83 1 802110 57 = 801830 -379 Library0
    HWI-ST0764:99:C0BV6ACXX:4:1204:5671:93720 1107 1 802121 57 = 801830 -379 Library0
    HWI-ST0764:99:C0BV6ACXX:4:1205:12390:55233 99 1 801835 37 = 802137 386 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2103:16357:92085 83 1 802136 7 = 801839 -390 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2306:16690:178455 83 1 802141 7 = 801847 -374 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2304:8284:19653 99 1 801852 37 = 802140 381 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2304:18678:164778 99 1 801853 37 = 802136 376 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2106:6319:57635 83 1 802150 5 = 801859 -387 Library0
    HWI-ST0764:99:C0BV6ACXX:4:2304:5415:115448 99 1 801860 37 = 802143 382 Library0
    ---------------------------------------------
    1 801961 802023 62 17 9.95808 >Deletion_xxx_00000004<

  • #2
    Extracting specific lines/rows with awk

    How many different patterns/lines are you trying to extract?

    grep will also work:
    $ grep 'pattern' in.txt > out.txt

    Check
    $ grep --help

    for the different options you can use with grep.
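
As a rough sketch of the grep approach on data shaped like the sample in the first post: every row of interest there carries a >Deletion_ tag, so that tag could serve as the pattern regardless of what the row starts with. The file names sample_in.txt and sample_out.txt are placeholders, and the three sample lines are copied from the first post.

```shell
# Build a tiny sample in the same layout as the posted file.
# (sample_in.txt / sample_out.txt are placeholder names.)
cat > sample_in.txt <<'EOF'
HWI-ST0764:99:C0BV6ACXX:4:1204:6906:61789 83 1 15143 0 = 14831 -406 Library0
---------------------------------------------
1 15015 15143 128 2 3.00995 >Deletion_xxx_00000000<
EOF

# Keep only the lines that contain the ">Deletion_" tag,
# whatever their first field happens to be.
grep '>Deletion_' sample_in.txt > sample_out.txt
cat sample_out.txt
```

This assumes the tag never appears on the read lines; if it can, a stricter pattern would be needed.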


    • #3
      You mean you need to extract lines that begin with '1'? If so,

      grep '^1' file.txt > out.txt
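
One caveat: '^1' also matches rows that start with 10, 11 and so on, and it skips rows that start with anything else. Below is a hedged awk sketch of two alternatives, assuming the layout from the first post, where each wanted row sits directly under a line made only of dashes. The file names are placeholders, and the second summary row (starting with 2) is made up purely for illustration.

```shell
# Hypothetical sample: the first summary row is from the first post,
# the second one (starting with 2) is invented to show a row not starting with 1.
cat > sample2_in.txt <<'EOF'
HWI-ST0764:99:C0BV6ACXX:4:1204:6906:61789 83 1 15143 0 = 14831 -406 Library0
---------------------------------------------
1 15015 15143 128 2 3.00995 >Deletion_xxx_00000000<

HWI-ST0764:99:C0BV6ACXX:4:2301:8245:154739 83 1 16419 16 = 16094 -425 Library0
---------------------------------------------
2 16350 16419 69 3 15.3084 >Deletion_xxx_00000001<
EOF

# Option 1: print the row that directly follows a line made only of dashes,
# so the first field does not matter.
awk 'take { print; take = 0 } /^-+$/ { take = 1 }' sample2_in.txt > by_separator.txt

# Option 2: print only rows whose first field is exactly 1 (not 10, 11, ...).
awk '$1 == 1' sample2_in.txt > chr1_only.txt
```

Option 1 picks up both summary rows here, while option 2 keeps only the row whose first column is 1; which one fits depends on whether all summary rows are wanted or only those for one chromosome.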
      savetherhino.org
