#1 |
Member
Location: UK Join Date: Sep 2012
Posts: 61
Hello all
I am using a program called MetaSim to generate reads from given genome sequences. My problem is that the reads I produced are paired-end reads in a single multi-FASTA file. The paired reads look like the following:
>r1.1 |SOURCES={GI=61252,fw,624-696}|ERRORS={67:A}|SOURCE_1="Human poliovirus 1 Mahoney" (8915ea1a18cb58f4a76d99a56ece2a9018e105bc)
ATTGGCCATCCGGTGAAAGTGAGACTCATTATCTATCTGTTTGCTGGATCCGCTCCATTGAGTGTGTATACT
>r1.2 |SOURCES={GI=61252,bw,789-861}|ERRORS={}|SOURCE_1="Human poliovirus 1 Mahoney" (8915ea1a18cb58f4a76d99a56ece2a9018e105bc)
GCGTTACTAGCTGAATCTCTATAATAATTAATGGTGGTGTAATTAATGGTAGAACCACCATACGCTCTATTT
>r2.1 |SOURCES={GI=61252,fw,6323-6395}|ERRORS={}|SOURCE_1="Human poliovirus 1 Mahoney" (8915ea1a18cb58f4a76d99a56ece2a9018e105bc)
CCACCAGTGCTGGCTACCCTTATGTAGCAATGGGAAAGAAGAAGAGAGACATCTTGAACAAACAAACCAGAG
>r2.2 |SOURCES={GI=61252,bw,6502-6574}|ERRORS={}|SOURCE_1="Human poliovirus 1 Mahoney" (8915ea1a18cb58f4a76d99a56ece2a9018e105bc)
AGCATATAGGTTCCCAAAAGCCATTCTCATTGCCACTGAGTCATTCAAACTAGAAGCTTCAATTAATCTGGA
I want the backward reads in one file and the forward reads in another. Does anyone know of a Perl script or any other tool to separate these reads into two different files? Then I will be able to align them using Bowtie or any other alignment tool.
Many thanks
#2 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
You could just use grep. Something like this should work:
Code:
grep -A 1 ".1 " reads.fa > reads_1.fa
grep -A 1 ".2 " reads.fa > reads_2.fa
#3 |
PhD Student
Location: Denmark Join Date: Jul 2012
Posts: 164
#4 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
#5 |
PhD Student
Location: Denmark Join Date: Jul 2012
Posts: 164
My bad! Did not notice the -A 1 at the beginning.
#6 |
Member
Location: UK Join Date: Sep 2012
Posts: 61
Hello
I still get both 1.1 and 1.2 in the output file. Could it have to do with the extension of my file, which is .fna? This is how I am doing it: in the terminal, I am in the directory containing my file, test.fna, and I run:
grep -A 1 ".1 " test.fna > reads_1.fna
grep -A 1 ".2 " test.fna > reads_2.fna
Many thanks for your help
#7 | |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
Code:
grep -A 1 "\.1 " test.fna > reads_1.fna
grep -A 1 "\.2 " test.fna > reads_2.fna
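A likely reason the unescaped patterns matched every header: in a regular expression "." matches any character, so ".1 " also matches the " 1 " in "Human poliovirus 1 Mahoney", which appears in all of the headers from post #1. The sketch below counts matching lines for both patterns. The sample file is a shortened, hypothetical stand-in for the MetaSim output (truncated sequences, trimmed header fields), not the poster's actual data.

```shell
#!/bin/sh
# Hypothetical sample modelled on the headers in post #1 (sequences truncated).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/test.fna" <<'EOF'
>r1.1 |SOURCES={GI=61252,fw,624-696}|SOURCE_1="Human poliovirus 1 Mahoney"
ATTGGCCATCCGGTGAAAGT
>r1.2 |SOURCES={GI=61252,bw,789-861}|SOURCE_1="Human poliovirus 1 Mahoney"
GCGTTACTAGCTGAATCTCT
>r2.1 |SOURCES={GI=61252,fw,6323-6395}|SOURCE_1="Human poliovirus 1 Mahoney"
CCACCAGTGCTGGCTACCCT
>r2.2 |SOURCES={GI=61252,bw,6502-6574}|SOURCE_1="Human poliovirus 1 Mahoney"
AGCATATAGGTTCCCAAAAG
EOF

# Unescaped dot: any character followed by "1 " counts, including " 1 ".
loose=$(grep -c ".1 " "$workdir/test.fna")

# Escaped dot: only a literal ".1 " counts, i.e. the forward read IDs.
strict=$(grep -c "\.1 " "$workdir/test.fna")

echo "unescaped pattern: $loose matching lines"
echo "escaped pattern:   $strict matching lines"
```

On this sample the unescaped pattern hits all four headers, while the escaped one hits only the two forward reads.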
#8 |
Member
Location: UK Join Date: Sep 2012
Posts: 61
Thanks very much, dpryan, it is working now. But yes, I do have the separator in the output, and when I try to use the option, the terminal says: unrecognized option `--no-group-separator'. I am not sure what the reason could be. My terminal is Version 2.3 (309).
Many thanks!
#9 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
You can add a second grep to remove the "--"
Code:
grep -v "\--" reads_1.fna > reads_1_final.fna
Code:
grep -A 1 "\.1 " test.fna | grep -v "\--" > reads_1.fna
Last edited by GenoMax; 09-28-2012 at 07:57 AM.
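If the installed grep does not support --no-group-separator, another way to sidestep the "--" lines is to split the file in a single awk pass, which never emits separators. This is only a sketch, assuming every read ID is the first whitespace-delimited field of the header and ends in ".1" or ".2" as in post #1. The sample file and the output names are hypothetical. Unlike grep -A 1, it also copies sequences that wrap over several lines.

```shell
#!/bin/sh
# Hypothetical sample; the first record is wrapped over two sequence lines.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/test.fna" <<'EOF'
>r1.1 |SOURCES={GI=61252,fw,624-696}|SOURCE_1="Human poliovirus 1 Mahoney"
ATTGGCCATC
CGGTGAAAGT
>r1.2 |SOURCES={GI=61252,bw,789-861}|SOURCE_1="Human poliovirus 1 Mahoney"
GCGTTACTAG
>r2.1 |SOURCES={GI=61252,fw,6323-6395}|SOURCE_1="Human poliovirus 1 Mahoney"
CCACCAGTGC
>r2.2 |SOURCES={GI=61252,bw,6502-6574}|SOURCE_1="Human poliovirus 1 Mahoney"
AGCATATAGG
EOF

awk -v fwd="$workdir/reads_1.fna" -v rev="$workdir/reads_2.fna" '
  /^>/ {
    # Pick the destination from the read ID, e.g. ">r1.1" or ">r1.2".
    if ($1 ~ /\.1$/)      out = fwd
    else if ($1 ~ /\.2$/) out = rev
    else                  out = ""   # unknown suffix: skip this record
  }
  # Header and sequence lines both go to the current destination.
  out != "" { print > out }
' "$workdir/test.fna"

echo "forward reads:";  cat "$workdir/reads_1.fna"
echo "backward reads:"; cat "$workdir/reads_2.fna"
```

Because the read order is preserved, mate pairs stay in the same order in both output files, which paired-end aligners generally expect.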
#10 |
Member
Location: UK Join Date: Sep 2012
Posts: 61
Thanks very much, it is working perfectly!
#11 |
Junior Member
Location: Chile Join Date: Mar 2013
Posts: 2
Hi all,
I have the same problem reported here, but if I do "grep -v "\--" > reads_1.fna" I also remove lines starting with "--"...
Thanks in advance
#12 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
#13 |
Junior Member
Location: Chile Join Date: Mar 2013
Posts: 2
Thanks for your quick response.
OK, let me explain better. I have something like this:
@IC5OSZZ01DTNH9/1
GTTGTCGTGGCTCATGTTCGAGTTATCCATTTGTGCGAATGCGCCTGCTGATACCATG
+IC5OSZZ01DTNH9/1
--GIIIIIIIIIIII444IIIIEEIEIII444IIIIIIIIIIEGGHIIIII@888EII
--
If I remove the "--" with "grep -v "\--" > reads_1.fna", it will remove the quality line too. I hope you can help me. Thanks!
#14 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
What OS are you using?
Did you try the "--no-group-separator" option mentioned by Devon in post #7 instead of the second grep?
#15 | |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
Code:
grep -v "\--"
Code:
grep -v "^--$"
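The difference between the two patterns can be checked on the record from post #13. In this sketch the trailing "--" line is assumed to be a grep context separator, and `-e '--'` stands in for the thread's "\--" so that the pattern is not parsed as an option. The file names are hypothetical.

```shell
#!/bin/sh
# The FASTQ record from post #13, followed by a "--" separator line.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/reads.fastq" <<'EOF'
@IC5OSZZ01DTNH9/1
GTTGTCGTGGCTCATGTTCGAGTTATCCATTTGTGCGAATGCGCCTGCTGATACCATG
+IC5OSZZ01DTNH9/1
--GIIIIIIIIIIII444IIIIEEIEIII444IIIIIIIIIIEGGHIIIII@888EII
--
EOF

# Unanchored: drops every line that contains "--" anywhere,
# so the quality line is lost together with the separator.
kept_loose=$(grep -c -v -e '--' "$workdir/reads.fastq")

# Anchored: drops only lines that consist of exactly "--".
kept_anchored=$(grep -c -v -e '^--$' "$workdir/reads.fastq")

grep -v -e '^--$' "$workdir/reads.fastq" > "$workdir/reads_clean.fastq"

echo "lines kept, unanchored: $kept_loose"
echo "lines kept, anchored:   $kept_anchored"
```

One caveat: the anchored pattern would still drop a quality line that is exactly "--", which could only happen for a two-base read.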