SEQanswers

Old 02-23-2014, 06:13 PM   #1
Abujamel_t
Junior Member
 
Location: Canada

Join Date: Feb 2014
Posts: 6
Filtration of plasmid metagenome paired end reads

Hi all,

I am studying the plasmid metagenome of clinical samples. The plasmids were captured from metagenomic DNA by digesting linear DNA (leaving closed circular DNA intact), randomly inserting a transposon, and cloning into E. coli. I then used the purified plasmids from the E. coli clones to construct the sequencing library. My next step is to assemble the paired-end reads from the sequencing run. I expect the reads to contain a large amount of transposon and E. coli sequence introduced during plasmid isolation. My question is: what is the best way to filter out reads that belong to the transposon and E. coli, leaving only reads free of transposon and E. coli sequence for the assembly step? I think these sequences will strongly affect the assembly. I tried Bowtie 2.0, but it doesn't seem to do a good job, since most of the scaffolds I got after de novo assembly belong to the cloning strain.
The platform is Illumina HiSeq 2500 (reads are 2x150 bp).
The assembler is SOAPdenovo.

I hope someone can help me with this.

Cheers,
TJ
Old 02-24-2014, 12:22 AM   #2
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58

Have you checked what pattern a typical read has? For example, [META_GENOMIC_SEQ] [TRANSPOSON_ELEMENT] [E_COLI_SEQ] [ADAPTER_SEQ].

Can we divide the reads into three categories, namely (1) pure metagenomic DNA, (2) junction DNA, and (3) E. coli-only DNA?
If so, you can filter out the category 3 reads by aligning to the E. coli reference genome, and trim the boundary sequences from the junction DNA.
Old 02-24-2014, 06:05 AM   #3
Abujamel_t
Junior Member
 
Location: Canada

Join Date: Feb 2014
Posts: 6

Thanks relipmoc for your reply,

Forgive my ignorance, but I am not quite sure what you mean in the first part. It is a shotgun library, so I don't know whether the reads would have any pattern other than barcodes at the 5' ends.

Dividing the reads into three categories is exactly what I want to do. Do you know what software I should use for this? I assume I need to do it in two stages: first align against the E. coli genome and take the unaligned reads, then align those against the transposon sequence. The reads left unaligned after the second stage would then be used for my plasmid de novo assembly. Is this the right way?

Thanks,
TJ
Old 02-24-2014, 09:49 AM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

I made a program specifically for separating reads, since we work with a lot of metagenomic communities.

First, download BBMap

Then run this:

bbsplit.sh in=reads.fq ref=ecoli.fa,transposon.fa basename=out_%.fq outu=clean.fq int=t

This will produce 3 output files:
out_ecoli.fq (ecoli reads)
out_transposon.fq (transposon reads)
clean.fq (all other reads: 'outu' means unmapped output)

It's very fast. The command above is for paired reads that are interleaved (the int=t flag). If the paired reads are in 2 files, use 'in1=' and 'in2=' and leave off the 'int' flag. The output will be interleaved, but if you want it in twin files, you can say 'outu1=clean1.fq outu2=clean2.fq'
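For reads split across two files, the variant described above can be written out as follows. This is only a sketch: the helper name `bbsplit_two_file_cmd` is hypothetical, the reference and output file names are placeholders, and every flag is one already mentioned in this post. The function just prints the command so it can be checked before it is run.

```shell
# Hypothetical helper: print the two-file form of the bbsplit.sh command.
# $1 = forward reads, $2 = reverse reads (placeholder names, adjust to your data).
bbsplit_two_file_cmd() {
    r1=$1
    r2=$2
    # in1=/in2= replace in= and int=t; outu1=/outu2= give twin unmapped outputs.
    echo "bbsplit.sh in1=$r1 in2=$r2 ref=ecoli.fa,transposon.fa basename=out_%.fq outu1=clean1.fq outu2=clean2.fq"
}

# Print the command for a pair of read files.
bbsplit_two_file_cmd reads_R1.fq reads_R2.fq
```

Review the printed line, then paste it into the shell to run it. The clean pairs end up in clean1.fq and clean2.fq.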
Old 02-24-2014, 11:12 AM   #5
Abujamel_t
Junior Member
 
Location: Canada

Join Date: Feb 2014
Posts: 6

Hello Brian,

I downloaded the program, and when I ran the command I got the following error:

Exception in thread "main" java.lang.UnsupportedClassVersionError: align2/BBSplitter : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: align2.BBSplitter. Program will exit.

The command line I used was:
bbsplit.sh in1=plasmid_R1.fastq in2=plasmid_R2.fastq ref=Escherichia_coli.fasta,Transposon.fasta basename=out.fastq outu1=plasmid_R1_clean.fastq outu2=plasmid_R2_clean.fastq qin=33 -Xmx200g

I really appreciate your help.

Cheers,
TJ
Old 02-24-2014, 11:19 AM   #6
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

TJ,

You have Java 6 or earlier installed; BBMap is compiled for Java 7 (the class-file version 51.0 in the error message is Java 7's). You can download and install Java 7 (either the JRE or the JDK), wait for me to post a version compiled for Java 6 (I'll do that later today), or recompile it yourself if you have javac in your path (run compile.sh). I'll post here once I put up a Java 6 version, but I suggest you install Java 7 (or get a sysadmin to do it).
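The version mismatch can be checked from the shell. Below is a minimal sketch, assuming the usual class-file numbering (major version 51 is Java 7, 50 is Java 6). The function names `required_java` and `installed_java` are made up for illustration and are not part of BBMap.

```shell
# Map the class-file version from the error message to the Java release it needs.
# Valid for class-file major versions 49 and up (Java 5 onward).
required_java() {
    major=${1%%.*}          # "51.0" -> "51"
    echo $((major - 44))    # 51 -> 7
}

# Extract the release number from a `java -version` style version string.
# Old scheme: "1.6.0_45" -> 6; newer scheme: "11.0.2" -> 11.
installed_java() {
    v=$1
    case "$v" in
        1.*) v=${v#1.} ;;   # drop the leading "1." of the old scheme
    esac
    echo "${v%%[._]*}"      # keep everything before the first "." or "_"
}

required_java 51.0          # prints 7
installed_java 1.6.0_45     # prints 6
```

`java -version` prints the installed version string (to stderr). If `installed_java` reports a smaller number than `required_java`, the UnsupportedClassVersionError shown above is expected.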
Old 02-24-2014, 12:23 PM   #7
Abujamel_t
Junior Member
 
Location: Canada

Join Date: Feb 2014
Posts: 6

Hi Brian,

I will try to install Java 7 and re-run the command. I will let you know the result.

Thanks,
TJ
Old 02-26-2014, 11:46 AM   #8
Abujamel_t
Junior Member
 
Location: Canada

Join Date: Feb 2014
Posts: 6
Thumbs up

Hi Brian,

I have updated Java and the software worked perfectly! As you said, it is very quick compared to the other programs I tried.

In the de novo assembly, almost all of my scaffolds now belong to plasmids, with only a very small number of contigs belonging to the cloning strain.

My conclusion is that your software performed better than the 4 other alignment programs I used before. BBMap is excellent for filtering unwanted reads (in my case, reads belonging to the transposon and the cloning bacteria) from metagenomic data.

Thank you very much for your help.
TJ
Old 02-26-2014, 02:05 PM   #9
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

You're welcome; and thanks for the feedback!
Old 02-26-2014, 03:06 PM   #10
rnaeye
Member
 
Location: East Cost

Join Date: May 2011
Posts: 79

Hi Abujamel_t,
I was wondering why you used linear DNA digestion to remove non-plasmid DNA. Is there a particular reason you did not want to use a plasmid extraction kit? Thank you for your answer.
Old 02-27-2014, 06:04 PM   #11
Abujamel_t
Junior Member
 
Location: Canada

Join Date: Feb 2014
Posts: 6

Hello rnaeye,

The main reason is that I expect only a very limited amount of plasmid in the metagenomic DNA, and I wanted to remove the linear DNA to increase the chance of recovering the plasmids from the total DNA. I thought about plasmid purification kits (and did a couple of trials), but I think these kits are made mainly for purifying plasmids from E. coli and will not work well with other bacteria, such as Gram-positive bacteria (which are hard to lyse). Therefore, it is better to extract the metagenomic DNA with a very efficient method, such as combined chemical and mechanical lysis, and then purify the plasmids from there.

I hope I answered your question.

Cheers,
TJ
Old 02-27-2014, 06:47 PM   #12
rnaeye
Member
 
Location: East Cost

Join Date: May 2011
Posts: 79

Thank you, Abujamel_t, for the information. It's very helpful. Good luck with your research.

Tags
filtration, metagenome, paired end reads, plasmid, transposon
