SEQanswers

Old 05-04-2010, 04:39 AM   #1
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133
Pindel: improved version for indels and structural variants

Hi all,

I have just put an improved version of Pindel on my website, https://trac.nbic.nl/pindel/, together with a wiki, a mailing list and a user manual.

Instructions for running it starting from BWA mapping are provided. You can use it to detect indels and SVs at single-base resolution from SLX paired-end short reads.

Currently, deletions of 1 bp to 1 Mbp and insertions of 1 bp to (read length - 20) bp can be detected. You can also find non-template insertions within deletions.
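
[Editor's sketch] The size limits quoted above can be written down as a small helper. This only restates the arithmetic from the post; the function name and dictionary keys are made up for illustration and are not part of Pindel.

```python
def detectable_ranges(read_length: int) -> dict:
    """Return the indel size ranges quoted in the post for a given read length."""
    if read_length <= 20:
        # With (read length - 20) as the upper bound, shorter reads leave no room.
        raise ValueError("read length must exceed 20 bp for insertions to be detectable")
    return {
        "deletion_bp": (1, 1_000_000),          # 1 bp to 1 Mbp
        "insertion_bp": (1, read_length - 20),  # 1 bp to (read length - 20) bp
    }
```

For example, under these limits a 100 bp read would allow insertions of up to 80 bp.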

I am working on inversions and large insertions, as well as on using Pindel for RNA-Seq data.
Please comment on Pindel and suggest additional functions.

Kai
k.ye@lumc.nl

Last edited by KaiYe; 10-28-2011 at 07:26 AM.
Old 07-22-2010, 05:32 AM   #2
sdvie
Member
 
Location: Spain

Join Date: Jul 2010
Posts: 68

Hi Kai,

Just trying out pindel for the first time...one question on the bam2pindel step: in the user manual, the input for this script is described as "aln.NameSorted.MateFixed.bam". Does this mean I have to do something additional to the bam generated by samtools? If yes, what?

Thanks,
Sophia
Old 07-22-2010, 05:45 AM   #3
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Hi Sophia,

If you generate bam from sam directly after mapping with BWA, you don't have to do anything else.

Kai
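
[Editor's sketch] For readers unsure what "generate bam from sam directly after mapping with BWA" looks like in practice, the following builds the usual `bwa aln` / `bwa sampe` / `samtools view` command sequence. It is an illustration under assumptions, not the procedure from the Pindel manual: the reference is assumed to be indexed already with `bwa index`, the output file names are hypothetical (they follow the `aln_read1.sai` / `aln.sam` naming seen elsewhere in this thread), and the two helper functions are made up for this example.

```python
import subprocess


def bwa_to_bam_commands(ref, fq1, fq2, prefix="aln"):
    """Build the mapping and conversion commands as (argv, stdout_path) pairs."""
    sai1, sai2 = f"{prefix}_read1.sai", f"{prefix}_read2.sai"
    sam, bam = f"{prefix}.sam", f"{prefix}.bam"
    return [
        (["bwa", "aln", ref, fq1], sai1),
        (["bwa", "aln", ref, fq2], sai2),
        (["bwa", "sampe", ref, sai1, sai2, fq1, fq2], sam),
        # -S marks the input as SAM (required by older samtools releases),
        # -b requests BAM output.
        (["samtools", "view", "-b", "-S", sam], bam),
    ]


def run_all(commands):
    """Run each command in order, writing its stdout to the paired file."""
    for argv, stdout_path in commands:
        with open(stdout_path, "wb") as out:
            subprocess.run(argv, stdout=out, check=True)
```

Calling `run_all(bwa_to_bam_commands("ref.fa", "reads_1.fq", "reads_2.fq"))` would leave `aln.bam` in the working directory, with no extra sorting or mate-fixing step, which is the situation Kai describes.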
Old 07-22-2010, 05:47 AM   #4
sdvie
Member
 
Location: Spain

Join Date: Jul 2010
Posts: 68

Quote:
Originally Posted by KaiYe View Post
Hi Sophia,

If you generate bam from sam directly after mapping with BWA, you don't have to do anything else.

Kai
cool, thanks.
Old 07-26-2010, 09:45 AM   #5
raela
Member
 
Location: Ithaca, NY

Join Date: Apr 2010
Posts: 39

I must be missing something.. it isn't producing any output for me, but it also isn't giving an error. I'm trying to convert my BAM file to the pindel format:

[heather@frankie (Mon Jul 26 13:33:04)]% bam2pindel.pl -i aln.sort.pindel.bam -o out.pindel -s retina -om -pi 150
[heather@frankie (Mon Jul 26 13:41:02)]% ls
. aln.sort.fix.bam.bai horse_genome_v2_all.fa.ann horse_genome_v2_all.fa.rsa
.. aln.sort.pindel.bam horse_genome_v2_all.fa.bwt horse_genome_v2_all.fa.sa
align.sort.bam aln_read1.sai horse_genome_v2_all.fa.fai tag_trim_6_1.fq
aln.bam aln_read2.sai horse_genome_v2_all.fa.pac tag_trim_6_2.fq
aln.sam horse_genome_v2_all.fa horse_genome_v2_all.fa.rbwt
aln.sort.fix.bam horse_genome_v2_all.fa.amb horse_genome_v2_all.fa.rpac
Old 08-27-2010, 02:47 PM   #6
wuhoucdc
Member
 
Location: Nashville

Join Date: Oct 2009
Posts: 14

Hi Kai,

Can Pindel call large structural variants (>1M) now?

Thanks.

Wu

Last edited by wuhoucdc; 08-27-2010 at 02:55 PM.
Old 09-03-2010, 05:44 PM   #7
tinacai
Member
 
Location: china

Join Date: Apr 2010
Posts: 18

Dear Kai Ye,
I've been using the Pindel software recently. I have heard that you have released a new version of the software; could you please send me the link?

Best,
Cong chen
Wenzhou Medical College
Old 09-21-2010, 04:08 AM   #8
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Quote:
Originally Posted by tinacai View Post
Dear Kai Ye,
I've been using the Pindel software recently. I have heard that you have released a new version of the software; could you please send me the link?

Best,
Cong chen
Wenzhou Medical College
Hi Cong Chen,

I believe I have already sent you my latest Pindel for testing. Have you had any problems using it?

Kai
Old 09-21-2010, 04:12 AM   #9
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Quote:
Originally Posted by wuhoucdc View Post
Hi Kai,

Can Pindel call large structural variants (>1M) now?

Thanks.

Wu
Pindel can detect variants of any size as long as they are not inter-chromosomal events. The only thing I worry about is speed: the runtime is linear in the maximum SV size.
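
[Editor's sketch] Taking the "runtime is linear in the maximum SV size" statement at face value, a run at one size limit can be extrapolated to another. This is only a back-of-the-envelope estimate; the function name and arguments are hypothetical, and real runtimes will also depend on read count and hardware.

```python
def estimated_runtime(measured_seconds, measured_max_sv, target_max_sv):
    """Scale a measured runtime linearly with the maximum SV size searched."""
    if measured_max_sv <= 0 or target_max_sv <= 0:
        raise ValueError("SV size limits must be positive")
    return measured_seconds * (target_max_sv / measured_max_sv)
```

So a run that took 10 minutes with a 1 Mbp limit would be expected to take roughly 50 minutes with a 5 Mbp limit, if the linear relationship holds.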

I am currently testing a new version with the following additional functions:
1. Allow sequence errors/SNPs in the same reads containing INDELs/SVs
2. non-template sequence in deletions
3. inversions
4. tandem duplications
5. breakpoints of large insertions

Please send me an email to ask for it if you want to test it.

Cheers,

Kai
Old 09-21-2010, 04:13 AM   #10
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Quote:
Originally Posted by raela View Post
I must be missing something.. it isn't producing any output for me, but it also isn't giving an error. I'm trying to convert my BAM file to the pindel format:

[heather@frankie (Mon Jul 26 13:33:04)]% bam2pindel.pl -i aln.sort.pindel.bam -o out.pindel -s retina -om -pi 150
[heather@frankie (Mon Jul 26 13:41:02)]% ls
. aln.sort.fix.bam.bai horse_genome_v2_all.fa.ann horse_genome_v2_all.fa.rsa
.. aln.sort.pindel.bam horse_genome_v2_all.fa.bwt horse_genome_v2_all.fa.sa
align.sort.bam aln_read1.sai horse_genome_v2_all.fa.fai tag_trim_6_1.fq
aln.bam aln_read2.sai horse_genome_v2_all.fa.pac tag_trim_6_2.fq
aln.sam horse_genome_v2_all.fa horse_genome_v2_all.fa.rbwt
aln.sort.fix.bam horse_genome_v2_all.fa.amb horse_genome_v2_all.fa.rpac
Could you please send me your email address? I have C++ code to extract reads from SAM files for Pindel.

Thanks.
Old 12-06-2010, 07:52 PM   #11
jtjli
Member
 
Location: australia

Join Date: Nov 2008
Posts: 21
help

Hi KaiYe

I'm having a problem running Pindel. Here's what I've done:
1) Download all files from http://www.ebi.ac.uk/~kye/pindel/v_0.2.0/
2) ran bam2pindel.pl on one paired-end samples (aligned using BWA). My bam file is sorted but it does not have the header expected by your program, so i used the -om to force the script to run.
a number of files is generated: e.g. myprefix.1.txt (chr1)
3) then I tried running pindel_x86_64, but i then got this error message: ./pindel_x86_64: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by ./pindel_x86_64)
4) i tried upgrading some packages in my redhat linux, but still the same.
5) i then downloaded your source code from sourceforge (with svn) and compiled your pindel from scratch. It seems to work.
6) I find the "-i" parameter confusing as it says "-i, --config-file: the bam file later to be a config file;" in the script but "Input: the unmapped reads in a modified fastq format" in your powerpoint manual.
7) I assumed -i refers to the files generated by bam2pindel.pl, so i tested the command on some chromosomes. E.g.
pindel_64 -f hg19.fasta -i myprefix.1.txt -o otherprefix -c 1 -b empty.txt
8) but whichever chromosome i try, i always get "There are no reads for this chromosome":

Processing chromosome 1
Processing chromosome 2
Skip the rest of chromosomes.
1 249250621 269250621
26926 10000
BreakDancer events: 0
There are no reads for this chromosome.


What have i done wrong?

my email is jason.li @ petermac.org

Thanks
Jason
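
[Editor's sketch] The `GLIBCXX_3.4.9' not found` message in step 3 means the system libstdc++ does not export that symbol version, which is why compiling from source (step 5) got around it. One way to see which versions a given libstdc++ provides is to scan the library for its version tags, much like `strings libstdc++.so.6 | grep GLIBCXX`. The helper names below are made up for this example.

```python
import re


def glibcxx_versions(library_bytes: bytes) -> list:
    """List the GLIBCXX_x.y[.z] version tags embedded in a libstdc++ binary."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", library_bytes))
    # Sort numerically so that 3.4.10 comes after 3.4.9.
    return sorted(
        (tag.decode() for tag in tags),
        key=lambda version: tuple(int(part) for part in version.split(".")),
    )


def provides(library_path: str, required: str) -> bool:
    """Check whether the library at library_path exports the required version."""
    with open(library_path, "rb") as handle:
        return required in glibcxx_versions(handle.read())
```

On the machine from the post, `provides("/usr/lib64/libstdc++.so.6", "3.4.9")` would presumably return False, matching the loader error.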
Old 12-06-2010, 11:04 PM   #12
rwenang
Member
 
Location: Singapore

Join Date: Jan 2009
Posts: 31
Default

Hi Kaiye,

Interesting tool you've got there. Have you published the method? I am curious about one thing: say you have one read; you grow the pattern until you can no longer get a match, then you look for the rest of the read within the next 1 bp to 1 Mbp. What if there are several matches in that region? Which one do you use, and what criteria do you use to choose it?
Old 12-07-2010, 08:16 AM   #13
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Quote:
Originally Posted by rwenang View Post
Hi Kaiye,

interesting tool you got there. anyway, have you publish the method? I am curious about one thing, say you have 1 read, you will grow the pattern until you cannot get a match, then you find the rest of the read within the next 1-1M bps. What if there are several matches in the 1-1M bps region, which one do you use and what kind of consideration do you use to choose it?
Yes, Pindel has been published (http://www.ncbi.nlm.nih.gov/pubmed/19561018) and it was awarded best paper at ISMB 2009 Special Interest Group on Short Read Sequencing.


Only unique hits are considered here.
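
[Editor's sketch] To illustrate the "unique hits only" rule, the function below searches a window of the reference for the remaining part of a read and reports a position only when there is exactly one exact match. This is a plain string search written for illustration; it is not Pindel's pattern-growth implementation, and the function name and arguments are hypothetical.

```python
def unique_hit(reference: str, pattern: str, start: int, end: int):
    """Return the single position of pattern in reference[start:end], else None."""
    if not pattern:
        return None
    window = reference[start:end]
    first = window.find(pattern)
    if first == -1:
        return None  # no match in the window
    # Search again from the next position so overlapping matches are counted.
    if window.find(pattern, first + 1) != -1:
        return None  # ambiguous: more than one match, so the read is not used
    return start + first
```

With the reference "ACGTACGTTT", the fragment "GTT" maps uniquely to position 6, whereas "ACG" occurs twice and is rejected.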
Old 12-07-2010, 08:18 AM   #14
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Quote:
Originally Posted by jtjli View Post
Hi KaiYe

I'm having problem running Pindel. Here's what I've done:
1) Download all files from http://www.ebi.ac.uk/~kye/pindel/v_0.2.0/
2) ran bam2pindel.pl on one paired-end samples (aligned using BWA). My bam file is sorted but it does not have the header expected by your program, so i used the -om to force the script to run.
a number of files is generated: e.g. myprefix.1.txt (chr1)
3) then I tried running pindel_x86_64, but i then got this error message: ./pindel_x86_64: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by ./pindel_x86_64)
4) i tried upgrading some packages in my redhat linux, but still the same.
5) i then downloaded your source code from sourceforge (with svn) and compiled your pindel from scratch. It seems to work.
6) I find the "-i" parameter confusing as it says "-i, --config-file: the bam file later to be a config file;" in the script but "Input: the unmapped reads in a modified fastq format" in your powerpoint manual.
7) I assumed -i refers to the files generated by bam2pindel.pl, so i tested the command on some chromosomes. E.g.
pindel_64 -f hg19.fasta -i myprefix.1.txt -o otherprefix -c 1 -b empty.txt
8) but whichever chromosome i try, i always get "There are no reads for this chromosome":

Processing chromosome 1
Processing chromosome 2
Skip the rest of chromosomes.
1 249250621 269250621
26926 10000
BreakDancer events: 0
There are no reads for this chromosome.


What have i done wrong?

my email is jason.li @ petermac.org

Thanks
Jason
I will send you my source code via email.
Old 02-27-2011, 03:58 PM   #15
Fabrice ODEFREY
Member
 
Location: Melbourne

Join Date: May 2010
Posts: 21

Hi KaiYe,

I'm working with SOLiD data...and would like to use Pindel but couldn't find anything about it. is Pindel only for Illumina data?
thanks in advance for your reply.
Fabrice
Old 02-28-2011, 12:45 AM   #16
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Quote:
Originally Posted by Fabrice ODEFREY View Post
Hi KaiYe,

I'm working with SOLiD data...and would like to use Pindel but couldn't find anything about it. is Pindel only for Illumina data?
thanks in advance for your reply.
Fabrice
Hi Fabrice,

I don't have a procedure for SOLiD data, but I would be happy to explore this together with you.

First, you need to convert the data from color space to sequence space.

Second, convert the sequences to the correct strand. Pindel assumes the data are paired-end as in Illumina, so that the reads face each other rather than lying on the same strand.

You may then try my sam2pindel.cpp to extract reads and run Pindel.
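
[Editor's sketch] The first step, color space to sequence space, can be illustrated with the standard SOLiD two-base encoding, in which each color is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3). The function name is made up for this example. Note that this naive decoding propagates a single color error to every downstream base, so it is a teaching sketch rather than a recommended production step.

```python
_BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_CODE_TO_BASE = "ACGT"


def decode_colorspace(read: str) -> str:
    """Decode a color-space read such as 'T0123' (primer base + colors) to bases."""
    if not read or read[0] not in _BASE_TO_CODE:
        raise ValueError("read must start with a primer base (A, C, G or T)")
    current = _BASE_TO_CODE[read[0]]  # the primer base itself is not emitted
    bases = []
    for index, color in enumerate(read[1:]):
        if color not in ("0", "1", "2", "3"):
            # A missing color call ('.') makes the rest of the read undecodable.
            bases.append("N" * (len(read) - 1 - index))
            break
        current ^= int(color)  # next base code = previous base code XOR color
        bases.append(_CODE_TO_BASE[current])
    return "".join(bases)
```

For instance, "T0123" decodes to "TGAT": T stays T (color 0), T to G (color 1), G to A (color 2), A to T (color 3).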

Please visit https://trac.nbic.nl/pindel and register as a Pindel user.

Kai
Old 02-28-2011, 12:56 AM   #17
Fabrice ODEFREY
Member
 
Location: Melbourne

Join Date: May 2010
Posts: 21

thanks a lot Kai for your quick reply.
for the second step is there a tool to do that?
thanks again!
Fabrice
Old 02-28-2011, 01:05 AM   #18
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

Quote:
Originally Posted by Fabrice ODEFREY View Post
thanks a lot Kai for your quick reply.
for the second step is there a tool to do that?
thanks again!
Fabrice
The second step is rather straightforward, but it requires knowing the strand orientation of your SOLiD data. I know that some SOLiD data satisfy the second requirement without any modification, but others need an additional conversion step.

You may need to write a script to do that.
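
[Editor's sketch] Such a script mostly comes down to reverse-complementing one mate of each pair so that the two reads face each other. Which mate needs flipping depends on the library, as noted above, so that decision is left to the caller. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_record(name: str, seq: str, qual: str):
    """Flip one FASTQ record; qualities are reversed to stay aligned with bases."""
    return name, reverse_complement(seq), qual[::-1]
```

Applied to a record with sequence "AACG" and qualities "IIH#", this yields "CGTT" with qualities "#HII".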

Kai
Old 02-28-2011, 01:07 AM   #19
Fabrice ODEFREY
Member
 
Location: Melbourne

Join Date: May 2010
Posts: 21

alright, that's what I thought, thanks!
Fabrice
Old 05-09-2011, 11:31 PM   #20
chariko
Member
 
Location: Spain

Join Date: Jun 2010
Posts: 56

Quote:
Originally Posted by KaiYe View Post
I will send you my source code via email.
Hi KaiYe,

I am having the same problem as jtjli (http://seqanswers.com/forums/showthr...0820#post30820). I did as follows:

1) Download all files from http://www.ebi.ac.uk/~kye/pindel/v_0.2.0/. I aligned with BWA, processed with samtools and filtered by MAPQ quality (<30).
2) ran bam2pindel.pl on one paired-end samples (aligned using BWA). My bam file is sorted and duplicates are removed but it does not have the header expected by your program, so i used the -om to force the script to run. A file for each chromosome was generated: e.g. myprefix.1.txt (chr1)
3) I downloaded your source code from sourceforge (with svn) and compiled your pindel from scratch. It seems to work.
4) I ran the following command:
/home/Pindel_source_v0.2.2/pindel -f /home/hg19.fa -i /s_4_QC_sort_pind.bam_chr1.txt -o ./s4 -c chr1 empty

but whichever chromosome i try, i always get "There are no reads for this chromosome":

BreakDancer events: 0
Processing chromosome: chr10
Skipping chromosome: chr10
...

Processing chromosome: chr1
Chromosome Size: 249250621
26926 10000
Looking at chromosome chr1 bases 0 to 10000000.
BinBorder 0 10000000
There are no reads for this bin.
Looking at chromosome chr1 bases 10000000 to 20000000.
BinBorder 10000000 20000000
There are no reads for this bin.
....
Loading genome sequences and reads: 0 seconds.
Mining, Sorting and output results: 0 seconds.

What am I doing wrong? How did you solve jtjli's problem?
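
[Editor's sketch] The thread never states what caused the "no reads" message, so the following is only one thing worth ruling out: a mismatch between the chromosome name passed with `-c` and the names in the reference FASTA headers (for example "1" versus "chr1"). The helpers below list the FASTA sequence names and suggest the spelling used in the reference; both function names are made up for this example.

```python
def fasta_sequence_names(lines):
    """Yield the sequence name (first word after '>') from FASTA lines."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def suggest_chromosome_name(requested, names):
    """Return the name as spelled in the reference, or None if nothing matches."""
    names = set(names)
    if requested in names:
        return requested
    # Try the other common convention: with or without the 'chr' prefix.
    alternative = requested[3:] if requested.startswith("chr") else "chr" + requested
    return alternative if alternative in names else None
```

A typical use would be `with open("hg19.fa") as handle: names = list(fasta_sequence_names(handle))`, followed by `suggest_chromosome_name("1", names)`. The same comparison would need to be made against the chromosome names written into the per-chromosome read files.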

Last edited by chariko; 05-09-2011 at 11:36 PM. Reason: Incomplete