For previous studies I contracted out both the Illumina sequencing and the RNA-seq analysis of my transcriptomes. An updated reference genome was added to NCBI this past year, and I want to realign the previously sequenced FASTQ files to the new reference in Galaxy using TopHat and Cufflinks. I contacted the sequencing facility to understand their protocol and determine whether adapter sequences had already been removed. They were unclear about whether trimming had been done, and they have been slow to answer my questions. They did, however, send me this email:
"We performed a polyA selection with a kit from Ambion: MicroPoly(A) Purist Kit; required 1000ng as starting material
We used a cDNA kit from NuGEN: Ovation RNA-Seq System V2; we added 1ng of mRNA into the reaction. There is a SPIA adaptor sequence you will most likely want to trim off the 5' ends.
We used an in house cocktail for the Illumina libraries (combination of kits from Lucigen and NEB); we initiated the Illumina libraries with 500ng of cDNA. We used the indexed Illumina adaptors:
Multiplexing Adapters
5' P-GATCGGAAGAGCACACGTCT
5' ACACTCTTTCCCTACACGACGCTCTTCCGATCT
Multiplexing PCR Primer 1.0
5' AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
Multiplexing PCR Primer 2.0
5' GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
PCR Primer, Index 1
5' CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTC
Let me know if you have more questions.
Thanks!
Catrina
Index Sequence - XXXXXX Sample Library
GCCGCG CPAM-RS15-RS15-V1 CPAM-RS15-RS15-V1-cDNA-1-lib1a
GCTCCA CPAM-RS14-RS14-CS1 CPAM-RS14-RS14-CS1-cDNA-1-lib1a
GGCACA CPAM-RS9-RS9-CS1 CPAM-RS9-RS9-CS1-cDNA-1-lib1a
GGCCTG CPAM-RS12-RS12-V1 CPAM-RS12-RS12-V1-cDNA-1-lib1a
GTCCGC CPAM-RS11-RS11-CS1 CPAM-RS11-RS11-CS1-cDNA-1-lib1a
TAATCG CPAM-RS11-RS11-V1 CPAM-RS11-RS11-V1-cDNA-1-lib1a
TACAGC CPAM-RS16-RS16-V1 CPAM-RS16-RS16-V1-cDNA-1-lib1a
TATAAT CPAM-RS13-RS13-CS1 CPAM-RS13-RS13-CS1-cDNA-1-lib1a
Index Sequence - XXXXXX Sample Library
CACCGG CPAM-RS12-RS12-CS1 CPAM-RS12-RS12-CS1-cDNA-2-lib2a
CCTTAG CPAM-RS9-RS9-V1 CPAM-RS9-RS9-V1-cDNA-2-lib2a
CTCAGA CPAM-RS8-RS8-CS1 CPAM-RS8-RS8-CS1-cDNA-2-lib2a
CTGCTG CPAM-RS8-RS8-V1 CPAM-RS8-RS8-V1-cDNA-2-lib2a
GACGGA CPAM-RS13-RS13-V1 CPAM-RS13-RS13-V1-cDNA-2-lib1a
GATATA CPAM-RS14-RS14-V1 CPAM-RS14-RS14-V1-cDNA-2-lib1a
GATGCT CPAM-RS15-RS15-CS1 CPAM-RS15-RS15-CS1-cDNA-2-lib1a
GTAGAG CPAM-RS16-RS16-CS1 CPAM-RS16-RS16-CS1-cDNA-2-lib1a "
I want to make sure there is no adapter contamination, even though the quality of my reads is high. I created a FASTA file from the Multiplexing Adapters sequences listed above and tried trimming with Scythe, but the output file is always empty. Am I using the correct sequences? Do I need to trim each FASTQ file individually using its own index? I am still new to this process, so any advice would be appreciated.
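To make the question concrete, here is a minimal Python sketch of the kind of sanity check that could be run before trimming: it writes the two Multiplexing Adapters from the email as FASTA and counts how many reads contain the start of either adapter (or its reverse complement). This is not what Scythe does internally. The names `ADAPTERS`, `to_fasta`, `count_adapter_hits`, and the `seed_len` value of 12 are hypothetical choices for illustration, the matching is exact (no mismatches allowed), and the parser assumes plain four-line FASTQ records. The demo reads are made up.

```python
import io
from typing import Dict, Iterable, Iterator, Tuple

# Adapter sequences copied from the facility's email (the 5' phosphate "P-" is dropped).
ADAPTERS: Dict[str, str] = {
    "multiplexing_adapter_1": "GATCGGAAGAGCACACGTCT",
    "multiplexing_adapter_2": "ACACTCTTTCCCTACACGACGCTCTTCCGATCT",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def to_fasta(adapters: Dict[str, str]) -> str:
    """Format the adapters as FASTA text, one record per adapter."""
    return "".join(f">{name}\n{seq}\n" for name, seq in adapters.items())


def read_fastq_sequences(handle: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line of each record, assuming four lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def count_adapter_hits(
    reads: Iterable[str], adapters: Dict[str, str], seed_len: int = 12
) -> Tuple[int, Dict[str, int]]:
    """Count reads containing the first seed_len bases of each adapter,
    in either orientation, using exact substring matching."""
    counts = {name: 0 for name in adapters}
    total = 0
    for read in reads:
        total += 1
        for name, seq in adapters.items():
            seed = seq[:seed_len]
            if seed in read or reverse_complement(seed) in read:
                counts[name] += 1
    return total, counts


# Made-up demo data: read1 ends in adapter read-through, read2 does not.
demo_reads = ["ACGTACGTACGT" + "GATCGGAAGAGCACACGTCT", "ACGTACGT" * 4]
DEMO_FASTQ = "".join(
    f"@read{i}\n{seq}\n+\n{'I' * len(seq)}\n" for i, seq in enumerate(demo_reads, 1)
)

total, counts = count_adapter_hits(
    read_fastq_sequences(io.StringIO(DEMO_FASTQ)), ADAPTERS
)
print(to_fasta(ADAPTERS), end="")
print(total, counts)
```

On real data, the `io.StringIO(DEMO_FASTQ)` handle would be replaced with an open FASTQ file. A count near zero for every adapter would suggest the reads were already trimmed, or that the inserts are long enough that read-through is rare.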