**RedLightPanic** · 03-05-2013, 05:35 AM

Sorry, the read structure was misaligned. Corrected below:

R1-5' =>     [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] xxxx 3'
R2-3' => xxx [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2]      5'

**kcchan** · 03-05-2013, 10:48 AM

Do you have any idea how the libraries were generated? This certainly does not look like a standard Illumina TruSeq library. It could be that something is going wrong during one of the adapter ligation steps and causing fragments to concatemerize. It's also strange that adapter 1 is being sequenced first in both reads.

**microgirl123** · 03-05-2013, 12:04 PM

This looks very strange to me. I don't think any of your reads should start with adapter sequence. They can end with adapter sequence if your insert is shorter than your read length. Is your "some sequence 2" a short sequence of indexing nucleotides?

I can only think that the actual Illumina adapters are outside of your strange adapter reads.

**RedLightPanic** · 03-06-2013, 03:52 AM

kcchan and microgirl123,

Thanks a lot for your posts.

A couple of clarifications:

> Do you have any idea how the libraries were generated?

Hope this helps:
- RNA fragmentation
- Double strand cDNA synthesis
- End repair
- Linkers/adapters are ligated at the 3' and 5' ends of the double-stranded RNA (I refer to those sequences as adapter 1 and 2)
- Denaturing
- PCR amplification (primers are complementary to the linkers)
- HTS

> It's also strange that adapter 1 is being sequenced first in both reads

The actual sequences are:

R1 => [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] xxxx
R2 => [Adapter 2] [some sequence 2] [Adapter 2] [some sequence 1] [Adapter 1] xxxx

But, since they are paired end reads, I was trying to show that both sequences were complementary:

R1-5' =>     [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] xxxx 3'
R2-3' => xxx [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2]      5'

Sorry about the confusion.
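
The relationship between the two layouts above can be checked with a small reverse-complement helper. This is only an illustrative sketch: the adapter and insert strings are made-up placeholders (the real adapter sequences are not given in this thread), and `reverse_complement` is a helper defined here, not part of any particular library.

```python
# Translation table for DNA complements (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

# Hypothetical placeholder sequences, chosen only for illustration.
adapter1 = "GATTACA"
insert1 = "CCCTTT"
adapter2 = "TGGAATTC"
insert2 = "AAAACG"

# The concatenated fragment as it appears in R1.
fragment = adapter1 + insert1 + adapter2 + insert2 + adapter2
r1 = fragment
# R2 is read from the opposite strand, so it is the reverse complement.
r2 = reverse_complement(fragment)

# R2 therefore starts with the reverse complement of Adapter 2 ...
print(r2.startswith(reverse_complement(adapter2)))
# ... and reverse-complementing R2 recovers the R1 layout.
print(reverse_complement(r2) == r1)
```

In other words, R2 reading "[Adapter 2] ... [Adapter 1]" is expected for a fully overlapping pair, as long as each block in R2 is the reverse complement of the corresponding block in R1.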

> Is your "some sequence 2" a short sequence of indexing nucleotides?

No, I've already checked for the indexing nucleotides. I've also taken a closer look at this "some sequence 2", and I think it contains Adapter 1 as well, with a few mismatches.

> I can only think that the actual Illumina adapters are outside of your strange adapter reads.

The adapter sequences were provided to me by the people who ran the experiment, and they are the Adapter 1 and Adapter 2 that I show.

**kcchan** · 03-06-2013, 09:44 AM

Ah, I see what's going on now. This is a directional RNA-Seq library. If I understand correctly, the 3' adapter sequence is showing up multiple times, which points to an incorrectly designed adapter. The oligo for the 3' adapter must carry a blocking modification at its 3' end (either a dideoxy nucleotide or an amino modification) to prevent it from ligating to other inserts. I'm guessing this was not done, and the shorter RNA fragments concatemerized during the 5' ligation reaction. The data is still good, but you'll have to do some work to separate the individual inserts in the reads.

**RedLightPanic** · 03-07-2013, 10:52 AM

Thanks kcchan!

Any suggestion on how to proceed?

Since the majority of paired reads (R1/R2) show a high degree of overlap, maybe I could build single reads from them, trim the adapters, and map the resulting short reads to the reference genome.
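
A minimal sketch of the "build single reads from overlapping pairs" idea, assuming exact (mismatch-free) overlaps and ignoring quality scores. The function names, the `min_overlap` default, and the example sequences are all hypothetical. Dedicated read-merging tools handle mismatches and base qualities, and the repeated adapter copies in these reads could produce spurious overlaps, so this is only meant to show the logic.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_pair(r1: str, r2: str, min_overlap: int = 10) -> Optional[str]:
    """Merge a read pair into one sequence using an exact overlap.

    Returns the merged fragment, or None if no overlap of at least
    `min_overlap` bases is found.
    """
    r2rc = reverse_complement(r2)
    longest = min(len(r1), len(r2rc))

    # Case A: fragment longer than the reads.
    # The end of R1 overlaps the start of reverse-complemented R2.
    for olen in range(longest, min_overlap - 1, -1):
        if r1[-olen:] == r2rc[:olen]:
            return r1 + r2rc[olen:]

    # Case B: fragment shorter than the reads (read-through at the 3' ends).
    # The shared region is the fragment itself; the overhangs are discarded.
    for olen in range(longest, min_overlap - 1, -1):
        if r1[:olen] == r2rc[-olen:]:
            return r1[:olen]

    return None

# Hypothetical 27-base fragment used for both examples.
fragment = "GATTACACCCTTTTGGAATTCAAAACG"

# Case A: two 20- and 17-base reads overlapping by 10 bases.
merged_long = merge_pair(fragment[:20], reverse_complement(fragment[10:]))

# Case B: both reads run past the fragment end into unrelated sequence.
merged_short = merge_pair(fragment + "TTT", reverse_complement(fragment) + "GGG")

print(merged_long == fragment, merged_short == fragment)
```

Adapter trimming and splitting would then be applied to the merged sequence rather than to each mate separately.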

**kcchan** · 03-07-2013, 12:21 PM

Yeah, I think the best you can do with those reads is to use them as single-end reads. You're going to have to do a bit of work to get the reads cleaned up, however. Not only will you need to trim off the adapters, but you'll also need to separate each sequence between the adapters and treat it as an individual read.

**HESmith** · 03-07-2013, 01:27 PM

You could split the reads on the index sequences (forward and reverse complement) using Perl. You would also need to assign independent read names and (for quality scores) keep track of the positional information, plus discard reads below a certain length. Unless these are supposed to be small RNAs or it's a SAGE type of experiment, the small insert sizes would cause me to question the sample quality.
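
The splitting step described above could look roughly like the following, written in Python rather than Perl. It is a sketch under several assumptions: adapters are matched exactly (the mismatched adapter copies mentioned earlier in the thread would need approximate matching), the adapter and read strings are made-up placeholders, and `split_read`, its naming scheme, and the `min_len` default are hypothetical choices.

```python
import re
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def split_read(name: str, seq: str, qual: str, adapters: List[str],
               min_len: int = 15) -> List[Tuple[str, str, str]]:
    """Split one read on adapter matches (forward and reverse complement).

    Returns (name, sequence, quality) tuples for every stretch between
    adapters that is at least `min_len` bases long. Each piece gets its
    own name, and the quality string is sliced at the same positions as
    the sequence so the two stay in register.
    """
    motifs = set()
    for adapter in adapters:
        motifs.add(adapter.upper())
        motifs.add(reverse_complement(adapter.upper()))
    # Try longer motifs first so a short motif cannot shadow a longer one.
    ordered = sorted(motifs, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(m) for m in ordered))

    # Adapter positions, plus a sentinel at the end of the read.
    boundaries = [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]
    boundaries.append((len(seq), len(seq)))

    pieces = []
    start = 0
    for adapter_start, adapter_end in boundaries:
        if adapter_start - start >= min_len:
            piece_name = f"{name}_part{len(pieces) + 1}_{start}-{adapter_start}"
            pieces.append((piece_name,
                           seq[start:adapter_start],
                           qual[start:adapter_start]))
        start = adapter_end
    return pieces

# Hypothetical placeholder adapters and inserts.
adapter1, adapter2 = "GATTACA", "TGGAATTC"
insert1, insert2 = "CCCTTTCCCTTT", "AAAACGAAAACG"
read_seq = adapter1 + insert1 + adapter2 + insert2 + adapter2 + "GG"
read_qual = "I" * len(read_seq)  # dummy quality string of matching length

pieces = split_read("read1", read_seq, read_qual, [adapter1, adapter2], min_len=5)
for piece_name, piece_seq, piece_qual in pieces:
    print(f"@{piece_name}\n{piece_seq}\n+\n{piece_qual}")
```

Encoding the original coordinates in each new read name keeps the positional information available downstream, and the `min_len` filter drops the short leftovers between or after adapters.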


Illumina paired-end reads. More than 2 adapter sequences.
