SEQanswers

Old 10-19-2012, 01:15 PM   #1
yangfangisok
Junior Member
 
Location: Illinois

Join Date: Oct 2012
Posts: 8
Default Problem working with Illumina paired-end sequence data

I'm new to SEQanswers. I have Illumina paired-end sequence data. After separately removing low-quality sequences, duplicate sequences, and sequences containing human DNA from each file, the forward and reverse files contain different numbers of sequences. This mismatch prevents me from doing further analysis. My plan is to use seq2amos.pl to convert the paired-end data to an .afg file and then assemble the short reads with AMOScmp-shortReads.

Does anybody know of software, or have a script, that can fix this?

Any help is much appreciated. Thank you.
Old 10-20-2012, 09:29 AM   #2
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381
Default

http://seqanswers.com/forums/showthread.php?t=23881
Old 10-20-2012, 04:52 PM   #3
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default

Here is the script that I used successfully to remove unpaired reads from paired-end data. Hope this helps.
Attached Files
File Type: pl PE_match.pl (1.8 KB, 50 views)
Old 10-20-2012, 06:51 PM   #4
yangfangisok
Junior Member
 
Location: Illinois

Join Date: Oct 2012
Posts: 8
Default Problem working with Illumina paired-end sequence data

Dear upendra_35,

Thanks for your help. I downloaded your script and changed .fq to .fa, because I had already used FastX to convert the fastq files to fasta files. I want to output .1.fasta and .2.fasta files. When I run "$ perl PE_match.pl --pe1 BVCN4.1.fa --pe2 BVCN4.2.fa", I get:
"Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337."

Could you help me figure it out? I have no experience with Perl.
Old 10-20-2012, 10:31 PM   #5
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default

I forgot to mention that this script is intended to work only with fq files from Illumina versions earlier than 1.8. You are better off running it on your original fq files.
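
For context, here is a minimal sketch of why the Illumina version can matter when matching mates. It is not the code in PE_match.pl, and whether that script depends on header format is an assumption. What is known is that read headers from pipeline versions before 1.8 end in /1 or /2, while 1.8 and later put the mate number after a space. The function name pair_key is hypothetical.

```python
import re

# Trailing mate suffix used by pre-1.8 headers, e.g. "...#0/1" or "...#0/2".
_MATE_SUFFIX = re.compile(r"/[12]$")


def pair_key(header: str) -> str:
    """Return the part of a read header that both mates share."""
    # Drop the leading '@' (FASTQ) or '>' (FASTA) marker if present.
    name = header.strip().lstrip("@>")
    # 1.8+ headers carry the mate number after a space, so keep the first token only.
    parts = name.split()
    name = parts[0] if parts else ""
    # Pre-1.8 headers carry the mate number as a /1 or /2 suffix; remove it.
    return _MATE_SUFFIX.sub("", name)
```

With a key like this, two reads are mates when pair_key gives the same string for both headers, regardless of which header style the files use.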
Old 10-21-2012, 11:45 AM   #6
yangfangisok
Junior Member
 
Location: Illinois

Join Date: Oct 2012
Posts: 8
Default Problem working with Illumina paired-end sequence data

My original fq files are already paired.
Old 10-21-2012, 12:04 PM   #7
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default

OK, let me get this right. Your original fq files are paired, you passed each file separately through quality control, and after QC the paired-end fq files have different numbers of reads. Right? What I meant was to run my script on the paired-end fq files after QC. You will end up with paired-end fq files that have the same number of reads, labelled _matched_s_1.fq and _matched_s_2.fq. If you want to keep the unpaired reads separately, let me know and I can give you another script. Hope this helps.
Old 10-22-2012, 06:42 AM   #8
yangfangisok
Junior Member
 
Location: Illinois

Join Date: Oct 2012
Posts: 8
Default Problem working with Illumina paired-end sequence data

Thanks for your reply. Let me clarify. Starting from the fq files, I first remove low-quality sequences and write the output as fa files. Then I remove duplicate sequences and sequences containing human DNA from the fa files. This leaves me with two fa files containing different numbers of sequences. I want to remove the unpaired sequences from both files and output two fa files with the same number of sequences.
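
The task described above could be sketched roughly as follows in Python. This is only an illustration, not a tested pipeline tool. It assumes that the two mates share the first word of their header apart from a trailing /1 or /2, and that the filtering steps did not rename the reads. The function names read_fasta, mate_key and keep_pairs are hypothetical.

```python
def read_fasta(path):
    """Yield (header, sequence) tuples; a sequence may span several lines."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def mate_key(header):
    """Assumed convention: first word of the header minus a /1 or /2 suffix."""
    parts = header.split()
    token = parts[0] if parts else ""
    return token[:-2] if token.endswith(("/1", "/2")) else token


def keep_pairs(fa1, fa2, out1, out2):
    """Write only reads whose mate exists in the other file; return the pair count."""
    # First pass: collect the keys seen in each file and keep those found in both.
    keys1 = {mate_key(h) for h, _ in read_fasta(fa1)}
    shared = keys1 & {mate_key(h) for h, _ in read_fasta(fa2)}
    # Second pass: copy the matching records, preserving each file's own order.
    written = 0
    for src, dst in ((fa1, out1), (fa2, out2)):
        written = 0
        with open(dst, "w") as out:
            for header, seq in read_fasta(src):
                if mate_key(header) in shared:
                    out.write(f">{header}\n{seq}\n")
                    written += 1
    return written
```

For the files in this thread, the call would look like keep_pairs("BVCN4.1.fa", "BVCN4.2.fa", "BVCN4.paired.1.fasta", "BVCN4.paired.2.fasta"), where the output names are made up. The sketch holds all read names in memory, which may be heavy for files with many millions of reads, and the two outputs line up record by record only if the earlier filtering kept the reads in their original order.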
All times are GMT -8.