**rhinoceros** · 08-09-2013, 09:51 AM

General related question: What is the largest insert size that can be used with Illumina sequencing if you want the pairs to actually overlap? Is it something like 200bp?

**luc** · 08-09-2013, 10:16 AM

Hi,
which MP library protocol did you use? One using linkers or not? Did you trim the linkers?
How exactly did you carry out the alignments?

**luc** · 08-09-2013, 10:21 AM

MiSeq allows 2x250 bp PE sequencing;
HiSeq allows 2x150 bp PE sequencing.

How much overlap do you want?

There is one program which claims to merge reads even without actual overlap - given enough coverage:

"COPE: An accurate k-mer based pair-end reads connection tool to facilitate genome assembly"

Originally posted by rhinoceros:

General related question: What is the largest insert size that can be used with Illumina sequencing if you want the pairs to actually overlap? Is it something like 200bp?

**GenoMax** · 08-09-2013, 10:24 AM

Originally posted by rhinoceros:

General related question: What is the largest insert size that can be used with Illumina sequencing if you want the pairs to actually overlap? Is it something like 200bp?

Upwards of 400 bp. With MiSeq 2 x 250 bp runs you will get overlap in the middle. For HiSeq runs the insert would need to be shorter, since 2 x 150 bp is the maximum supported on the 2500 in rapid mode at the moment.
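The arithmetic behind these numbers can be sketched in shell. This assumes "insert size" means the full fragment length between the adapters and that the fragment is at least one read long; the function name `pair_overlap` and the 400 bp example fragment are illustrative only.

```shell
# Expected overlap (bp) between the two reads of a pair:
#   overlap = 2 * read_length - fragment_length
# A result of zero or less means the reads do not overlap.
pair_overlap() {
    read_len=$1
    frag_len=$2
    echo $(( 2 * read_len - frag_len ))
}

pair_overlap 250 400   # MiSeq 2 x 250 bp, 400 bp fragment: prints 100
pair_overlap 150 400   # HiSeq 2 x 150 bp, same fragment: prints -100
```

By the same arithmetic, any overlap at all requires a fragment shorter than twice the read length, so under 500 bp for 2 x 250 bp and under 300 bp for 2 x 150 bp.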

**agupta29** · 08-09-2013, 10:33 AM

The libraries were prepared with the Illumina Mate Pair Library Preparation Kit v2 and sequenced on a HiSeq 2000, generating 2 x 100 bp reads.

The alignments were done using bwa aln with default parameters. I did no pre-processing of data in this current analysis.

Code:

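# Align each mate separately (bwa aln), then pair the alignments (bwa sampe)
bwa aln -t 4 /BWAIndex/genome.fa L001_R1.fastq.gz > L001_1.sai
bwa aln -t 4 /BWAIndex/genome.fa L001_R2.fastq.gz > L001_2.sai
bwa sampe /BWAIndex/genome.fa L001_1.sai L001_2.sai L001_R1.fastq.gz L001_R2.fastq.gz > L001.sam
# Convert SAM to BAM, then coordinate-sort (legacy samtools 0.1.x sort syntax)
samtools view -bS L001.sam > L001.bam
samtools sort L001.bam L001_sorted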

The duplicates were marked and removed, as I previously mentioned, using the Picard MarkDuplicates tool.

Originally posted by luc:

Hi,
which MP library protocol did you use? One using linkers or not? Did you trim the linkers?
How exactly did you carry out the alignments?

**luc** · 08-09-2013, 11:12 AM

Hi,

I believe the "old" bwa has problems with mapping mate pairs:

Aligners for Illumina's mate-pairs - SEQanswers

http://seqanswers.com/forums/showthread.php?t=5085

To my knowledge, BWA-MEM, Bowtie, and Novoalign are better suited.

Similarly, the old Illumina MP kit produces many artifacts and PCR duplicates; the Nextera-based mate-pair kit is more efficient.

What was the fragment size of your library? People used to trim the MP reads to the first 38 bases each, in part to avoid mapping chimeric sequences.
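Trimming of this kind can be sketched with awk. This is a minimal example, assuming uncompressed FASTQ with exactly four lines per record (no wrapped sequences); the function name `trim_fastq` and the output file name in the usage comment are hypothetical.

```shell
# Keep only the first N bases (and quality values) of every read.
# $1 = number of bases to keep; FASTQ is read from stdin, written to stdout.
trim_fastq() {
    awk -v n="$1" '
    NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, n); next }  # sequence and quality lines
    { print }                                                    # header and "+" lines unchanged
    '
}

# Usage (hypothetical output name):
#   gzip -dc L001_R1.fastq.gz | trim_fastq 38 > L001_R1.trim38.fastq
```

Sequence and quality lines are cut to the same length, so each record stays valid FASTQ.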

**agupta29** · 08-09-2013, 11:24 AM

BWA and BWA-MEM:
I am using reverse-complemented reads. That converts the mate-pair reads from --rf orientation into the --fr orientation of paired-end reads, which bwa can then align. I am running bwa mem as I write this, so that I will have a direct comparison between bwa aln and bwa mem.
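The post does not say which tool did the reverse-complementing. As one possible approach, here is a sketch in awk, assuming four-line FASTQ records; the function name `revcomp_fastq` is hypothetical. The quality string is reversed without complementing so that it stays aligned with the bases.

```shell
# Reverse-complement every read in a 4-line-per-record FASTQ stream (stdin to stdout).
revcomp_fastq() {
    awk '
    BEGIN {
        from = "ACGTNacgtn"; to = "TGCANtgcan"
        for (i = 1; i <= length(from); i++) comp[substr(from, i, 1)] = substr(to, i, 1)
    }
    # Walk the sequence backwards, complementing each base (unknown symbols pass through).
    function revcomp(s,   i, c, out) {
        out = ""
        for (i = length(s); i >= 1; i--) {
            c = substr(s, i, 1)
            out = out ((c in comp) ? comp[c] : c)
        }
        return out
    }
    # Plain string reversal, used for the quality line.
    function reverse(s,   i, out) {
        out = ""
        for (i = length(s); i >= 1; i--) out = out substr(s, i, 1)
        return out
    }
    NR % 4 == 2 { print revcomp($0); next }   # sequence line
    NR % 4 == 0 { print reverse($0); next }   # quality line
    { print }                                 # header and "+" lines unchanged
    '
}

# Usage: gzip -dc L001_R1.fastq.gz | revcomp_fastq | gzip > L001_R1.rc.fastq.gz
```

Both read files need the same treatment for the pair to end up in --fr orientation.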

Library prep:
The samples were sent to a commercial vendor. So, I did nothing with the library or sequencing steps. We just got the data from the vendor. But, I will keep that in mind for the future.

**luc** · 08-09-2013, 03:40 PM

Hi agupta,

You could look at the distances between the PE alignments to see whether the vendor's sequencing libraries had short inserts and were therefore more likely to include chimeric reads.
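One way to look at those distances is to summarise the TLEN field (column 9) of the SAM records. The sketch below assumes both mates map to the same reference, so TLEN is set; the function name `tlen_summary` and the 500 bp cutoff for a "short" pair are illustrative choices, not values from this thread.

```shell
# Summarise the observed template lengths (SAM column 9, TLEN).
# $1 = cutoff in bp below which a pair counts as "short"; SAM text is read from stdin.
tlen_summary() {
    awk -v cutoff="$1" '
    /^@/ { next }                 # skip SAM header lines
    $9 > 0 {                      # positive TLEN only, so each pair is counted once
        n++; sum += $9
        if ($9 < cutoff) nshort++
    }
    END {
        if (n > 0) printf "pairs=%d short=%d mean=%.1f\n", n, nshort, sum / n
        else print "pairs=0"
    }'
}

# Usage: samtools view L001_sorted.bam | tlen_summary 500
```

A large share of short pairs would point to paired-end contamination in the mate-pair library rather than true long-range mates.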

**diptarka** · 08-09-2013, 10:08 PM

I have carried out a de novo assembly of an organism of interest using VelvetOptimiser, and I have the contig files. How can I now predict genes from the sequences? I have also compared the assembly with a reference using ABACAS and tried to view the result in the Artemis comparison viewer, but I am unable to understand the contig ordering. How does one do that? Secondly, what is the way to predict genes from the contigs?

**GenoMax** · 08-10-2013, 05:14 PM

Originally posted by diptarka:

I have carried out a de novo assembly of an organism of interest using VelvetOptimiser, and I have the contig files. How can I now predict genes from the sequences? I have also compared the assembly with a reference using ABACAS and tried to view the result in the Artemis comparison viewer, but I am unable to understand the contig ordering. How does one do that? Secondly, what is the way to predict genes from the contigs?

You should create a separate thread for this question, since:
a) your questions are not related to the current thread
b) people who could potentially answer your question will not notice it if it is embedded here.

A new thread can be started as follows:

Seqanswers.com --> Forums (navigation menu on the left) --> select an appropriate forum (e.g. Bioinformatics) --> "New Thread" button at the top left corner of the next page.

That said: What type of organism is this (prokaryote/eukaryote)?

Mate pair sequencing - quality, duplication, throughput
