  • Script Help

    Hello everyone,

    I've been racking my brain trying to create a script to make a certain file, but I have had no success in my attempts.

    I have a file like this:
    Code:
    ACGCCGGCCA
    GTGAATTGTA
    ATACGACTCA
    CTATAGGGCG
    AATTGGGCCC
    TCTAGATG
    All I want to do is to create a file like the following:
    Code:
    1	ACGCCGGCCA
    11	GTGAATTGTA
    21	ATACGACTCA
    31	CTATAGGGCG
    41	AATTGGGCCC
    51	TCTAGATGCA
    61	TGCTCGAGCG
    71	GCCGCCAGTG
    It should be something relatively easy that I'm just missing. I've been trying to do it with awk, but have had little luck.

    I believe it is basically just this code:
    Code:
    awk '{print NR?????? "\t" $0}'
    I'm just not sure what the ?????? should be. Any help with this would be great. Thanks in advance.

  • #2
    If Perl is OK, here's a simple Perl one-liner:

    Code:
    perl -e ' $pos=1; while(<>) { $line=$_; print "$pos\t$line"; $pos+=10; } ' inputfile.txt > outputfile.txt
    Replace inputfile.txt with the file containing your reads, and outputfile.txt with the desired output file name.

    I'm assuming you want the position of the base at the start of every line. You can replace the '10' in the code if you desire another increment. The delimiter here is a tab (\t).
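
If the lines are not all the same length, a fixed increment will drift. A minimal sketch of one alternative, assuming plain sequence lines with no header lines or trailing whitespace: advance the position by the actual length of each line. The function name `number_by_length` is made up for illustration and is not from this thread.

```shell
# Hypothetical helper (not from the thread): label each line with the 1-based
# position of its first base, advancing by the real length of each line
# instead of a fixed increment of 10.
number_by_length() {
    # pos starts out unset, which awk treats as 0 in arithmetic
    awk '{ printf "%d\t%s\n", pos + 1, $0; pos += length($0) }'
}

# Example with lines of 10, 8 and 4 bases
printf 'ACGCCGGCCA\nGTGAATTG\nTCTA\n' | number_by_length
```

For a real file it could be run as `number_by_length < inputfile.txt > outputfile.txt`.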


    • #3
      I was going to say something very similar to Kennels's answer. If you are going to be doing this kind of thing frequently, Perl is awesome!


      • #4
        Here is the awk:

        Code:
        awk -v N=1 '{print N "\t" $0} {N += 10}'
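
The original question was looking for a form built on NR. A sketch of what that might look like, assuming every line holds the same number of bases: the start position of line NR is (NR - 1) * width + 1. The function name `number_by_nr` and the `width` variable are illustrative choices, not something from this thread.

```shell
# Hypothetical helper (not from the thread): compute each line's start
# position directly from awk's record number NR.
# Optional first argument: bases per line (defaults to 10).
number_by_nr() {
    awk -v width="${1:-10}" '{ printf "%d\t%s\n", (NR - 1) * width + 1, $0 }'
}

# Example with three 10-base lines
printf 'ACGCCGGCCA\nGTGAATTGTA\nATACGACTCA\n' | number_by_nr
```

For a real file it could be run as `number_by_nr < inputfile.txt > outputfile.txt`, or as `number_by_nr 60 < inputfile.txt` for 60-base lines.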


        • #5
          Open the file in MS Excel and insert a column in front of the data.
          Put 1 in cell A1. In cell A2, put the formula =A1+10, then fill it down the column.
          Then save the file as a tab-delimited text file.
