I've never processed RNA-seq reads before (I've only used bowtie for ChIP-Seq), so I wanted to clarify a few things. I found a dataset on GEO where the paired-end reads have been concatenated into a single 282-base read. What is the best way to process it?
Is the best approach to split each read in half into read1 and read2, and then trim adapters from both before mapping with TopHat? I don't know if that's correct, or whether it is better to map the reads unsplit.
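To make the splitting idea concrete, here is a minimal sketch of what I mean by "split in half". It assumes both mates have the same length (141 bases each for a 282-base read) and that read1 comes first in the concatenated read; I have not verified either for this dataset. The function names (`read_fastq`, `split_record`, `split_file`) and the `/1` and `/2` name suffixes are my own choices for illustration, not part of any existing tool.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    # Consume four lines at a time; an incomplete trailing record is dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def split_record(rec: Record) -> Tuple[Record, Record]:
    """Split one concatenated record into two equal-length mates.

    Assumption: read1 is the first half and read2 is the second half.
    """
    header, seq, qual = rec
    if len(seq) != len(qual) or len(seq) % 2 != 0:
        raise ValueError("cannot split record evenly: " + header)
    half = len(seq) // 2
    name = header.split()[0]  # drop any description after the read name
    mate1 = (name + "/1", seq[:half], qual[:half])
    mate2 = (name + "/2", seq[half:], qual[half:])
    return mate1, mate2


def format_fastq(rec: Record) -> str:
    """Render a record back into 4-line FASTQ text."""
    header, seq, qual = rec
    return "{}\n{}\n+\n{}\n".format(header, seq, qual)


def split_file(in_path: str, out1_path: str, out2_path: str) -> None:
    """Write the two halves of every read to separate FASTQ files."""
    with open(in_path) as src, open(out1_path, "w") as o1, open(out2_path, "w") as o2:
        for rec in read_fastq(src):
            mate1, mate2 = split_record(rec)
            o1.write(format_fastq(mate1))
            o2.write(format_fastq(mate2))
```

Calling `split_file("sample.fastq", "sample_1.fastq", "sample_2.fastq")` (hypothetical file names) would give two files with the records in the same order, which is what a paired-end mapper expects.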
Also, I don't know the barcode sequence for some of the datasets (FastQC didn't report the barcode for 5 of the 8 samples). Has anyone tried trimming adapters in FASTX-CLIPPER using TTTTTTNNNNNATTTTTTT, where the N positions stand for the barcode sequence?
Thanks.