  • shuffleSequences_fastq.pl

    I am using the Perl script shuffleSequences_fastq.pl to prepare an interleaved FASTQ file for velvet. The output starts correctly, with the first read from read1 followed by the first read from read2, but further on I get several consecutive reads from read2 followed by their mates from read1, and so on. Is this OK, or is there something wrong with the script below?
    a read1
    a read2
    b read1
    b read2
    c read2
    d read2
    c read1
    d read1 and so on

    shuffleSequences_fastq.pl


    #!/usr/bin/perl

    # Print usage and quit when called without arguments.
    if (!@ARGV) {
        print "Usage: $0 forward_reads.fa reverse_reads.fa outfile.fa\n";
        print "\tforward_reads.fa / reverse_reads.fa : paired reads to be merged\n";
        print "\toutfile.fa : outfile to be created\n";
        exit(0);
    }

    $filenameA = $ARGV[0];
    $filenameB = $ARGV[1];
    $filenameOut = $ARGV[2];

    die "Could not open $filenameA" unless (-e $filenameA);
    die "Could not open $filenameB" unless (-e $filenameB);

    open FILEA, "< $filenameA";
    open FILEB, "< $filenameB";

    open OUTFILE, "> $filenameOut";

    my ($lineA, $lineB);

    $lineA = <FILEA>;
    $lineB = <FILEB>;

    while(defined $lineA) {
        # Copy one record from file A: its header, then every line up to the next '>'.
        print OUTFILE $lineA;
        $lineA = <FILEA>;
        while (defined $lineA && $lineA !~ m/>/) {
            print OUTFILE $lineA;
            $lineA = <FILEA>;
        }

        # Copy the matching record from file B in the same way.
        print OUTFILE $lineB;
        $lineB = <FILEB>;
        while (defined $lineB && $lineB !~ m/>/) {
            print OUTFILE $lineB;
            $lineB = <FILEB>;
        }
    }

  • #2
    If you use a recent version of velvet, you don't need the shuffleSequences scripts; you can leave the reads as two separate files and pass the '-separate' flag when running velveth.

    I have used the shuffleSequences_fastq.pl script that comes with velvet in the past, and never had any problems.

    It is not OK if your files don't interleave properly; you will have problems with the assembly or alignment programs further downstream.
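
    One way to see whether a shuffled file is really interleaved is to compare the read names of each consecutive pair of records. The following is only a rough Perl sketch, not part of velvet. It assumes 4-line FASTQ records and mate names that differ only by a trailing /1 or /2 or by a comment after the first space. The helper names read_name and first_bad_pair are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reduce a FASTQ header line to a bare read name: drop the leading '@',
# any comment after whitespace, and a trailing /1 or /2 mate suffix.
sub read_name {
    my ($header) = @_;
    chomp $header;
    $header =~ s/^\@//;
    $header =~ s/\s.*$//;
    $header =~ s{/[12]$}{};
    return $header;
}

# Return the 1-based number of the first pair whose two records have
# different names, or 0 if every pair matches. Assumes 4-line records.
sub first_bad_pair {
    my ($fh) = @_;
    my $pair = 0;
    while (defined(my $h1 = <$fh>)) {
        next unless $h1 =~ /\S/;          # ignore blank lines between pairs
        $pair++;
        scalar <$fh> for 1 .. 3;          # skip sequence, '+', quality of mate 1
        my $h2 = <$fh>;
        return $pair unless defined $h2;  # file ends in the middle of a pair
        scalar <$fh> for 1 .. 3;          # skip the rest of mate 2
        return $pair if read_name($h1) ne read_name($h2);
    }
    return 0;
}

# Optional command-line use: perl check_pairs.pl interleaved.fastq
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $bad = first_bad_pair($fh);
    print $bad ? "First mismatched pair: $bad\n" : "All pairs match\n";
}
```

    The script name check_pairs.pl and the input file name are placeholders. A result of 0 only says the names line up, not that the reads themselves are valid.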


    • #3
      Are you trying to interleave fasta files or fastq files?

      If they are fastq files, your script won't work: it looks for fasta headers ('>'), whereas the header line of each fastq read begins with '@'.

      Change the two lines with the regular expressions.

      Change:
      Code:
      while (defined $lineA && $lineA !~ m/>/) {
      to
      Code:
      while (defined $lineA && $lineA !~ m/^@/) {
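
      To see why the pattern matters: in FASTQ both '>' and '@' are legal quality characters (Q29 and Q31 in Phred+33 encoding), so an unanchored m/>/ can stop in the middle of a record, which is how the mates get out of step. Anchoring with ^@ helps a lot, but a quality line may still begin with '@'. The small sketch below uses two made-up records and a hypothetical helper, false_header_hits, to count non-header lines that each pattern would mistake for a header. It assumes 4-line records.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count lines that match $regex even though they are not header lines.
# Assumes 4-line FASTQ, so headers are lines 1, 5, 9, ... of the input.
sub false_header_hits {
    my ($fh, $regex) = @_;
    my ($line_no, $hits) = (0, 0);
    while (my $line = <$fh>) {
        my $is_header = ($line_no % 4 == 0);
        $line_no++;
        $hits++ if !$is_header && $line =~ $regex;
    }
    return $hits;
}

# Two made-up records: one quality string contains '>', the other starts with '@'.
my $fastq = join '', map { "$_\n" }
    '@read1/1', 'ACGTACGT', '+', 'IIII>III',
    '@read2/1', 'TTGGCCAA', '+', '@IIIIIII';

for my $case (['>', qr/>/], ['^@', qr/^\@/]) {
    my ($label, $regex) = @$case;
    open my $fh, '<', \$fastq or die "Cannot open in-memory FASTQ: $!";
    printf "pattern %-3s matched %d non-header line(s)\n",
        $label, false_header_hits($fh, $regex);
}
```

      On real data the '>' pattern will usually misfire far more often, because it can match anywhere in a quality line, while '^@' only misfires when the first quality character happens to be '@'.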


      • #4
        Thank you very much. I should also change
        while (defined $lineB && $lineB !~ m/>/)
        to
        while (defined $lineB && $lineB !~ m/^@/)


        • #5
          Originally posted by mmmm
          Thank you very much. I should also change
          while (defined $lineB && $lineB !~ m/>/)
          to
          while (defined $lineB && $lineB !~ m/^@/)
          I wouldn't edit the script for fasta files; instead, just use the script designed for fastq. You can find it on the velvet GitHub site. It assumes 4-line fastq records. For a more flexible solution (multi-line fasta/q, compressed or uncompressed input) you can try Pairfq (see its documentation), but you may want to start with the script distributed with velvet.
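
          For illustration only, here is a minimal sketch of interleaving by record position instead of by pattern, assuming strictly 4-line FASTQ and two inputs with the same number of reads in the same order. It is not the script shipped with velvet, and next_record and interleave_fastq are names invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one 4-line FASTQ record and return it as a single string.
# Returns nothing (undef in scalar context) at end of file.
sub next_record {
    my ($fh) = @_;
    my @lines;
    for (1 .. 4) {
        my $line = <$fh>;
        last unless defined $line;
        chomp $line;
        push @lines, "$line\n";   # guarantees a newline after the last line
    }
    return unless @lines;
    die "Truncated FASTQ record\n" unless @lines == 4;
    die "Record does not start with '\@': $lines[0]" unless $lines[0] =~ /^\@/;
    return join '', @lines;
}

# Write records alternately from the two inputs; returns the number of pairs.
sub interleave_fastq {
    my ($fh_a, $fh_b, $out) = @_;
    my $pairs = 0;
    while (1) {
        my $rec_a = next_record($fh_a);
        my $rec_b = next_record($fh_b);
        last unless defined($rec_a) || defined($rec_b);
        die "Input files have different numbers of records\n"
            unless defined($rec_a) && defined($rec_b);
        print {$out} $rec_a, $rec_b;
        $pairs++;
    }
    return $pairs;
}

# Command-line use with three file names (placeholders):
#   perl interleave.pl reads_1.fastq reads_2.fastq out.fastq
if (@ARGV == 3) {
    open my $fh_a, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    open my $fh_b, '<', $ARGV[1] or die "Cannot open $ARGV[1]: $!";
    open my $out,  '>', $ARGV[2] or die "Cannot write $ARGV[2]: $!";
    my $pairs = interleave_fastq($fh_a, $fh_b, $out);
    close $out or die "Cannot close $ARGV[2]: $!";
    print "Wrote $pairs read pairs\n";
}
```

          Because records are counted in blocks of four lines, a '>' or '@' inside a quality string cannot shift the pairing. The trade-off is that multi-line FASTQ is rejected or misread, which is where a tool such as Pairfq is the better choice.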
          Last edited by SES; 04-03-2014, 01:53 AM.
