Hi, I am fairly new to sequence analysis and a complete newcomer here. Can anyone help me? I would really appreciate it!
All I have are three separate raw data files in FASTQ format (40 million reads in each file): the forward reads (data 1), the barcode reads (data 2), and the reverse reads (data 3). The reads in the three files correspond to each other and are in the same order (same ID) from beginning to end. I have used a barcode file (6 different barcodes) to split the barcode reads (data 2) into 6 separate files with the Galaxy Barcode Splitter. Next, I need to split the forward and reverse data into separate files using these 6 files.
1. How can I split the forward reads (data 1) and reverse reads (data 3) using my 6 separate barcode files? Is there software to do this? I do not have much bioinformatics background, so any suggestions are welcome.
2. How can I tell whether there are adapter sequences in the forward/reverse reads from data 1 and 3, and if so, where they are? Knowing this would help me trim the adapters from the original sequences.
The following is just one example read extracted from my original data. All I have are the following 3 data files plus a separate barcode file. I have no extra information, such as how the barcodes were designed or how the library was constructed.
Data 1 (forward read)
@IPAR1:2:1:4029:1196:1#0/1
ATTTTGCCACATACAAAAGAATCTACGTTCTTCTCAGCACCTCATGGAATCTTCTCTAAAATATATCATATAATAGGACACAAAAGAA
+
BHGHHHHHHHHGDDFHHHGGDGHFHFHHHHGD>GEEG>GFHHHHFHBBHFHHHHEHHHHHHBAFHHBBEHHHFEHGBECEHFHHFAHF
Data 2 (barcode read; TGACCTTG is the barcode tag, and I do not yet know what the ATCTCGT after the tag is)
@IPAR1:2:1:4029:1196:1#0/2
TGACCTTGATCTCGT
+
HIHIIGIIIH8CCDC
Data 3 (reverse read)
@IPAR1:2:1:4029:1196:1#0/3
GATATAATGGATGGGATTATTTCAATCTTTTATCTATTGAGGCTTCTTTTGTGTCCTATTATATGATATATTTTAGAGAAGATTCCAT
+
IIHIIIIHIIDEGGGEBG>GIIFIHHIHIIIIIFIDE4G@GG<GGEGBGG?AACCIIBIIBDIIIFDII>IIIIDIH@DFIGBI@IEE
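For illustration only, the splitting asked about in question 1 could be sketched in Python roughly as follows. This is not an established tool, just a minimal sketch under several assumptions: the three files are uncompressed FASTQ with four lines per record, the reads are in the same order in all three files, the barcode is the first 8 bases of the barcode read (as in the TGACCTTG example), and only exact barcode matches count. It reads the original barcode file (data 2) in step with data 1 and data 3 instead of using the 6 pre-split files. The function names, sample names, and output file names are all hypothetical.

```python
from contextlib import ExitStack


def read_fastq(handle):
    """Yield (header, sequence, quality) from a FASTQ handle with 4 lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the iteration
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def read_id(header):
    """Drop the trailing /1, /2 or /3 so IDs from the three files can be compared."""
    return header.rsplit("/", 1)[0]


def demultiplex(fwd_path, bc_path, rev_path, barcodes, out_prefix, bc_len=8):
    """Split forward/reverse reads into per-sample files using the barcode read.

    `barcodes` maps barcode sequence -> sample name (hypothetical names).
    Reads whose barcode is not in the mapping go to the "unmatched" files.
    Returns a dict of read-pair counts per sample.
    """
    counts = {}
    with ExitStack() as stack:
        fwd = stack.enter_context(open(fwd_path))
        bc = stack.enter_context(open(bc_path))
        rev = stack.enter_context(open(rev_path))
        # One (forward, reverse) pair of output files per sample.
        outs = {}
        for name in sorted(set(barcodes.values())) + ["unmatched"]:
            outs[name] = tuple(
                stack.enter_context(open(f"{out_prefix}{name}_{end}.fastq", "w"))
                for end in ("fwd", "rev")
            )
        # Walk the three files in lockstep, one record from each at a time.
        for f_rec, b_rec, r_rec in zip(read_fastq(fwd), read_fastq(bc), read_fastq(rev)):
            if not read_id(f_rec[0]) == read_id(b_rec[0]) == read_id(r_rec[0]):
                raise ValueError(f"read IDs out of sync at {f_rec[0]}")
            # Assumption: the barcode is the first bc_len bases, exact match only.
            name = barcodes.get(b_rec[1][:bc_len], "unmatched")
            for out, (header, seq, qual) in zip(outs[name], (f_rec, r_rec)):
                out.write(f"{header}\n{seq}\n+\n{qual}\n")
            counts[name] = counts.get(name, 0) + 1
    return counts
```

A call might look like `demultiplex("data1.fastq", "data2.fastq", "data3.fastq", {"TGACCTTG": "sample1"}, "split_")`, with the other five barcodes added to the dictionary; the file names here are placeholders. Because the files are streamed record by record, memory use should stay small even with 40 million reads per file.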