  • Problem working with Illumina paired-end sequence data

    I'm new to SEQanswers. I have Illumina paired-end sequence data. After separately removing low-quality sequences, duplicated sequences, and sequences containing human DNA from each file, the forward and reverse files no longer contain the same number of sequences. This prevents me from doing further analysis. Later on, I want to use seq2amos.pl to convert the paired-end data to an .afg file, and then use AMOScmp-shortReads to assemble the short reads.

    Does anybody know of software, or have a script, that could help me fix this?

    Any help is much appreciated. Thank you.

  • #2
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc



    • #3
      Here is the script that I have used successfully to remove unpaired reads from paired-end reads. Hope this helps.
      Attached Files



      • #4
        Dear upendra_35,

        Thanks for your help. I downloaded your script and changed .fq to .fa, because I had already used FastX to convert the fastq files to fasta files. I want the output to be .1.fasta and .2.fasta files. When I run "$ perl PE_match.pl --pe1 BVCN4.1.fa --pe2 BVCN4.2.fa", I get:
        "Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
        Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
        Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337."

        Could you help me figure it out? I have no experience with Perl.



        • #5
          I forgot to mention that this script is intended for Illumina versions < 1.8, and for fq files only. You are better off using your original fq files and trying again.



          • #6
            My original fq files are already paired.



            • #7
              OK, let me get this right. Your original fq files are paired; you then passed them separately through quality control and found that, after QC, the paired-end fq files have different numbers of reads. Right? What I meant before was to run the post-QC paired-end fq files through my script. You will then have paired-end fq files with the same number of reads, labelled _matched_s_1.fq and _matched_s_2.fq. If you want to keep the unpaired reads separately, let me know and I can give you another script. Hope this helps.
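
For readers without access to the attached script, the re-pairing step described above can be sketched in Python. This is not the attached PE_match.pl, only a minimal illustration under stated assumptions: each FASTQ record spans exactly four lines, mates share a read name that differs only by a trailing /1 or /2 (the older Illumina header style), and the second file fits in memory. The function names `read_fastq`, `pair_key` and `match_pairs` are made up for this example.

```python
def read_fastq(path):
    """Yield (header, sequence, plus, quality) from a four-line-per-record FASTQ file."""
    with open(path) as handle:
        while True:
            record = [handle.readline().rstrip("\n") for _ in range(4)]
            if not record[0]:  # end of file (a stray blank line also stops reading)
                break
            yield tuple(record)


def pair_key(header):
    """Drop the leading '@', any comment after whitespace, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def match_pairs(fq1, fq2, out1, out2):
    """Write only reads whose mate exists in both files; return the number of pairs kept."""
    # Assumption: the whole second file fits in memory, keyed by read name.
    mates = {pair_key(rec[0]): rec for rec in read_fastq(fq2)}
    kept = 0
    with open(out1, "w") as o1, open(out2, "w") as o2:
        for rec in read_fastq(fq1):
            mate = mates.get(pair_key(rec[0]))
            if mate is not None:
                o1.write("\n".join(rec) + "\n")
                o2.write("\n".join(mate) + "\n")
                kept += 1
    return kept
```

Both output files are written in the order of the first input, so mates end up on matching record positions. For very large files the in-memory dictionary may be too big; an on-disk index would be one alternative.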



              • #8
                Thanks for your reply. Let me make it clear. Starting from the fq files, I first removed low-quality sequences and wrote the output as fa files. Then I removed duplicated sequences and sequences containing human DNA from the fa files. In the end, I have two fa files with different numbers of sequences. I want to remove the unpaired sequences from both files and output two fa files with the same number of sequences.
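
The FASTA case described in this post could be handled along the following lines. This is a hedged sketch, not a tested pipeline tool: it assumes that mates share a read ID differing only by a trailing /1 or /2, that IDs are unique within each file, and that both files kept their reads in the same relative order during filtering. The names `read_fasta`, `read_id` and `sync_fasta` are hypothetical. Only the read IDs are held in memory, not the sequences.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs; sequences wrapped over several lines are joined."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line, []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_id(header):
    """Drop the leading '>', any description after whitespace, and a trailing /1 or /2."""
    parts = header[1:].split()
    name = parts[0] if parts else ""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def sync_fasta(fa1, fa2, out1, out2):
    """Keep only reads whose ID occurs in both files; return the number of shared IDs."""
    ids2 = {read_id(h) for h, _ in read_fasta(fa2)}
    shared = set()
    # First pass over file 1: write reads that have a mate and remember their IDs.
    with open(out1, "w") as o1:
        for header, seq in read_fasta(fa1):
            rid = read_id(header)
            if rid in ids2:
                shared.add(rid)
                o1.write(f"{header}\n{seq}\n")
    # Second pass over file 2: write the corresponding mates, in file 2's own order.
    with open(out2, "w") as o2:
        for header, seq in read_fasta(fa2):
            if read_id(header) in shared:
                o2.write(f"{header}\n{seq}\n")
    return len(shared)
```

With the file names from this thread, a call might look like `sync_fasta("BVCN4.1.fa", "BVCN4.2.fa", "BVCN4.1.fasta", "BVCN4.2.fasta")`. If the duplicate-removal step reordered the reads, the two outputs would contain the same IDs but not necessarily on matching positions, so sorting by ID or the dictionary approach would be needed instead.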
