  • Problem working with Illumina paired-end sequence data

    I'm new to SEQanswers. I have Illumina paired-end sequence data. After removing low-quality sequences, duplicated sequences, and sequences containing human DNA from the forward and reverse files separately, the two files no longer contain the same number of sequences. This blocks any further analysis. My plan is to use seq2amos.pl to convert the paired-end data to an .afg file and then assemble the short reads with AMOScmp-shortReads.

    Does anybody know of software, or have a script, that could help me sort this out?

    Any help is much appreciated. Thank you.
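
A quick way to confirm the mismatch described above is to compare the read IDs in the two files. The following is a minimal Python sketch, not a tool mentioned in this thread; it assumes FASTA input and that mates share a read name apart from a trailing /1 or /2. The function names `read_ids` and `pairing_report` are made up for illustration.

```python
def read_ids(handle):
    """Collect read IDs from a FASTA stream, ignoring a trailing /1 or /2."""
    ids = set()
    for line in handle:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        if not fields:
            continue  # header with no name; skip it
        name = fields[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        ids.add(name)
    return ids


def pairing_report(forward_handle, reverse_handle):
    """Count reads present in both files and reads present in only one."""
    forward = read_ids(forward_handle)
    reverse = read_ids(reverse_handle)
    return {
        "paired": len(forward & reverse),
        "forward_only": len(forward - reverse),
        "reverse_only": len(reverse - forward),
    }
```

Calling `pairing_report(open("forward.fa"), open("reverse.fa"))` (file names are placeholders) returns the three counts. Non-zero "only" counts mean some reads lost their mate during filtering.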

  • #2
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


    • #3
      Here is the script that I used successfully to remove unpaired reads from paired-end data. Hope this helps.
      Attached Files


      • #4
        Dear upendra_35,

        Thanks for your help. I downloaded your script and changed .fq to .fa, because I had already used FastX to convert the fastq files to fasta. I want the output to be .1.fasta and .2.fasta files. When I run "$ perl PE_match.pl --pe1 BVCN4.1.fa --pe2 BVCN4.2.fa", I get:
        "Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
        Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
        Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337."

        Could you help me figure it out? I have no experience with Perl.


        • #5
          Originally posted by yangfangisok
          Dear upendra_35,

          Thanks for your help. I downloaded your script and changed .fq to .fa, because I had already used FastX to convert the fastq files to fasta. I want the output to be .1.fasta and .2.fasta files. When I run "$ perl PE_match.pl --pe1 BVCN4.1.fa --pe2 BVCN4.2.fa", I get:
          "Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
          Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
          Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337."

          Could you help me figure it out? I have no experience with Perl.

          I forgot to mention that this script is intended to work only with fq files from Illumina versions earlier than 1.8. So you are better off running it again on your original fq files.
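
The version caveat matters because the read headers changed. Before 1.8, the mate number is a /1 or /2 suffix on the read name; from 1.8 on, it sits in a second, space-separated field such as 1:N:0:ATCACG. The Python sketch below shows one way to extract a common pairing key from either style. It is an illustration only, not what PE_match.pl does, and `mate_key` is a hypothetical helper name.

```python
def mate_key(header):
    """Return (read_name, mate) for a FASTQ/FASTA header line.

    mate is "1", "2", or None when the header carries no mate number.
    """
    text = header.strip()
    if text[:1] in ("@", ">"):
        text = text[1:]  # drop the FASTQ/FASTA record marker
    fields = text.split()
    name = fields[0] if fields else ""
    # Older style: the mate number is a suffix, e.g. "...1973#0/1".
    if name.endswith(("/1", "/2")):
        return name[:-2], name[-1]
    # 1.8+ style: the second field starts with the mate number, e.g. "2:Y:18:ATCACG".
    if len(fields) > 1 and fields[1][:2] in ("1:", "2:"):
        return name, fields[1][0]
    return name, None
```

Two mates of the same fragment should then give the same `read_name`, whichever header style the files use.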


          • #6
            My original fq files are already paired.


            • #7
              Originally posted by yangfangisok
              My original fq files are already paired.

              OK, let me get this right. Your original fq files are paired; you then passed those files separately through quality control and found that afterwards the paired-end fq files have different numbers of reads. Right? What I meant before was to run the paired-end fq files (after QC) through my script. You will end up with paired-end fq files that have the same number of reads, labelled _matched_s_1.fq and _matched_s_2.fq. If you want to keep the unpaired reads separately, let me know and I can give you another script. Hope this helps.
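
The matching step described above can be sketched in Python as follows. This is not the attached Perl script, whose contents are not shown in the thread. It assumes four-line FASTQ records, /1 and /2 suffixes on the read names, unique read names within each file, and that both files keep the reads in their original relative order. The names `read_key`, `fastq_records` and `repair_fastq` are hypothetical.

```python
def read_key(header):
    """Read name without the leading '@' and without a trailing /1 or /2."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def fastq_records(path):
    """Yield (key, record_text) for each four-line FASTQ record in path."""
    with open(path) as handle:
        while True:
            lines = [handle.readline() for _ in range(4)]
            if not lines[0].strip():
                break  # end of file (or a trailing blank line)
            if not lines[3]:
                raise ValueError(f"truncated FASTQ record in {path}")
            yield read_key(lines[0]), "".join(lines)


def repair_fastq(fwd_in, rev_in, fwd_out, rev_out, orphan_out=None):
    """Write reads whose mate survived QC; optionally save the rest."""
    # First pass: find the read names present in both files.
    shared = {key for key, _ in fastq_records(fwd_in)}
    shared &= {key for key, _ in fastq_records(rev_in)}
    orphans = open(orphan_out, "w") if orphan_out else None
    try:
        # Second pass: copy each record to the matched or orphan output.
        for src, dst in ((fwd_in, fwd_out), (rev_in, rev_out)):
            with open(dst, "w") as out:
                for key, record in fastq_records(src):
                    if key in shared:
                        out.write(record)
                    elif orphans:
                        orphans.write(record)
    finally:
        if orphans:
            orphans.close()
    return len(shared)
```

The function returns the number of matched pairs. Holding every read name in memory is a deliberate simplification; for very large files it may need a lot of RAM.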


              • #8
                Thanks for your reply. Let me make it clear. Starting from the fq files, I first remove low-quality sequences and write the output as fa files. Then I remove duplicated sequences and sequences containing human DNA from the fa files. I end up with two fa files that contain different numbers of sequences. I want to remove the unpaired sequences from both files and output two fa files with the same number of sequences.
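
For the FASTA case described above, a minimal Python sketch of the same idea follows. It assumes the fa files still carry the original read names with /1 and /2 suffixes, that names are unique within each file, and that the filtering steps did not reorder the reads. Sequences may span several lines. `fasta_records` and `repair_fasta` are hypothetical names, not part of FastX or AMOS.

```python
def fasta_records(path):
    """Yield (key, record_text) for each FASTA record, multi-line safe."""
    key, chunks = None, []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if key is not None:
                    yield key, "".join(chunks)
                fields = line[1:].split()
                name = fields[0] if fields else ""
                if name.endswith(("/1", "/2")):
                    name = name[:-2]  # mates then share the same key
                key, chunks = name, [line]
            elif key is not None:
                chunks.append(line)  # sequence line of the current record
    if key is not None:
        yield key, "".join(chunks)


def repair_fasta(fwd_in, rev_in, fwd_out, rev_out):
    """Keep only records whose mate is present in the other file."""
    shared = {key for key, _ in fasta_records(fwd_in)}
    shared &= {key for key, _ in fasta_records(rev_in)}
    for src, dst in ((fwd_in, fwd_out), (rev_in, rev_out)):
        with open(dst, "w") as out:
            for key, record in fasta_records(src):
                if key in shared:
                    out.write(record)
    return len(shared)
```

A call such as `repair_fasta("BVCN4.1.fa", "BVCN4.2.fa", "BVCN4.1.fasta", "BVCN4.2.fasta")` would write two files with equal read counts, provided the assumptions above hold for these files. The output names here are only an example.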
