SEQanswers

Old 01-28-2013, 12:07 AM   #1
Lien
Member
 
Location: Leuven

Join Date: Dec 2009
Posts: 36
Invalid format for mpileup use with VarScan

Dear all,

I'm trying to call variants on RNA-seq data.
I aligned the paired-end reads with Bowtie/TopHat and then generated an mpileup file with SAMtools version 0.1.16 using the following command:
samtools mpileup -q 1 -Q 13 -f human_g1k_v37.fasta test.sorted.bam > test.sorted.mpileup

This mpileup file looks OK. Then I try to run VarScan version 2.3.2 with the following command:
java -jar VarScan.v2.3.2.jar mpileup2snp /test.sorted.mpileup --min-var-freq 0.08 --p-value 0.01 > /test.sorted.mpileup.varscan

Only SNPs will be reported
Min coverage: 8
Min reads2: 2
Min var freq: 0.08
Min avg qual: 15
P-value thresh: 0.01
Reading input from /test.sorted.mpileup

Initially, everything seems fine, but then VarScan throws me this error:

Error: Invalid format for pileup at line 419813325
22 31372123 A 1018 ...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,......,.....,,,,,,,,,,,,,,............,,,,,,,,,,,,,,.....................,,,,,,,,,,,,,,,,,,,,,,,,,,,,,.......................,,,,,,,,,,.........t,,..........................................,,,,,,,,,,,,,,,,,,,,,,,,,,,..........,,,,,,,,,,,,,,,,,,,,,,..,,,,,,,,,.,,,,.................,,...........................,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,..,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,....,,,,,,,,,,,...,,..,......................................................,,,........,,,,.....................................,,,,,,,............,,................................................,.........,.....,..............,.,,,,,........,.,,,,,,,..,,,,,,.,,,,,,,,.,,..,,,,,,,,,,,,,,,,,,,,,,,,,,,,.......,,,,,,,,,,,,,,,,...........,,,,.,,,,,,,,,,,,,,,,,,,,,,,,,......,,,,,,,..,,,,..,,,,,,,,,,,,,,,,,.............,,,,,.,,,,,,,,.....,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,.GHJHHIJH@JIEECJI39<IG:9JI<DCDCD?JJJJIFJFGJ1JJIIHC)3DAHCC9G)?JJEEJHJBJ@:JIIHIJJGFAIIJIH@IDI@H?IIIJJIGEJJGJH1IHGJEJC?@EIIIJHJ@JIJJJFEJJHIIJFJEGJJFGHJGJHJIIIJHCGJJ?GJHJ?F<ACJIGEEFJBD9DACCC?FJHJJCIJEJJIGJJJIJJG@EJAJCJHEFJID?DCDC?#CDCCDC?DGJCDGIEFEHFBJH?GJIJJGGIC?<JD?IG@JAIFIIDIJHCD<CAIIGJJJJDJ;JEJIIIJGIEJEDHJ;GJJJIJGJIGG>@IJJIJCJJJGGJJC?HAHGFHADGCGHH?3<?DDH<<DGCFFDFDF?FDF#C???D@D#FFDDCCD?(3CDC<CCCCCC<D@+#DBDC#DDDA?C#9C?D@=CD@=9

The VarScan output generated up to that line seems fine. However, I don't really know how to work around this corrupt line. Would it be easiest to just remove that line? If so, how can I do this on the mpileup file? Or are there other options?

Many thanks,
Lien
Old 01-30-2013, 09:03 AM   #2
dkoboldt
Member
 
Location: St. Louis

Join Date: Mar 2009
Posts: 62

It's strange to suddenly encounter an invalid-format line in SAMtools mpileup output. If I take it *exactly* as you pasted it, there's no delimiter (tab) between the last base call in column 5 and the first base quality value ("G"). However, if I put a tab between those, VarScan read the line just fine.
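A missing tab like the one described above can be spotted by counting columns. The sketch below assumes a single-sample pileup with six tab-separated columns (chromosome, position, reference base, depth, base calls, base qualities); the function name `find_malformed_pileup_lines` is made up for illustration.

```shell
# Report pileup lines that do not have exactly 6 tab-separated columns.
# Assumption: single-sample pileup (chrom, pos, ref, depth, bases, quals).
# Prints the line number and the column count that was found.
find_malformed_pileup_lines() {
    awk -F'\t' 'NF != 6 { print NR ": " NF " columns" }' "$1"
}

# Example call (file name taken from the first post):
# find_malformed_pileup_lines test.sorted.mpileup
```

A line whose base calls and qualities are fused into one field, as in the pasted example, would be reported with 5 columns.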

In terms of immediate action, you could simply remove the line (99.9% of reads show no variant anyway) with grep or vi or another command-line tool. You might also want to send me lines 419813320-419813330 of your pileup file and I'll take a look.
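A minimal sketch of both steps using sed: printing a range of lines to inspect or send, and writing a copy of the file without one line. The function names and the output file name in the comments are hypothetical.

```shell
# Print lines START..END of a file, then stop reading
# (useful for looking at the neighbourhood of a bad line).
show_lines() {
    sed -n "${2},${3}p;${3}q" "$1"
}

# Write the file to standard output with line N removed.
drop_line() {
    sed "${2}d" "$1"
}

# Example calls (line numbers from the error message above):
# show_lines test.sorted.mpileup 419813320 419813330
# drop_line  test.sorted.mpileup 419813325 > test.sorted.fixed.mpileup
```

Writing to a new file rather than editing in place keeps the original pileup available in case the line needs to be examined later.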
Old 01-30-2013, 10:35 PM   #3
Lien
Member
 
Location: Leuven

Join Date: Dec 2009
Posts: 36

Hi Dan,

I also don't know how this strange line is formed. I performed the exact same commands on similar files, and they seem to work fine.
I just removed this line, so hopefully everything will work out now. I just wasn't sure whether I could delete this line without consequence.

Thanks for your help,
Lien
Old 03-14-2013, 12:18 PM   #4
vyellapa
Member
 
Location: phoenix

Join Date: Oct 2011
Posts: 58

I have a similar error and am curious whether the reason for it has been found. I'm running it in a pipeline, and it would be easier if I could apply a remedial step without having to check for the error messages.

Thank you,
Teja

Code:
Error: Invalid format for pileup at line 71
1       10277   C       0
Old 03-19-2013, 08:19 AM   #5
dkoboldt
Member
 
Location: St. Louis

Join Date: Mar 2009
Posts: 62

Vyellapa, can you send me the first 75 lines of your pileup file? Send it to dkoboldt (at) genome [dot] wustl [dot] edu
Old 03-20-2013, 02:47 PM   #6
dkoboldt
Member
 
Location: St. Louis

Join Date: Mar 2009
Posts: 62

Hello all,

We have just released VarScan v2.3.5 which should correct the invalid mpileup warning:

https://sourceforge.net/projects/varscan/files/
Old 04-29-2013, 07:57 AM   #7
wdemos
Member
 
Location: Wisconsin

Join Date: Jun 2012
Posts: 26
v2.3.5 similar format error

I am also trying to call SNPs and indels using VarScan. I have the latest release (v2.3.5) installed so I can output to VCF format. I am using the following command and getting the invalid format error:

-bash-3.2$ java -jar VarScan.v2.3.5.jar mpileup2cns /research/sample.mpileup -min-coverage 8 --min-reads2 2 --min-var-freq 0.01 --min-avg-qual 15 --p-value 0.01 --strand-filter 0 --output-vcf 1 --variants 0 > /research/sample.vcf
Only variants will be reported
Min coverage: 8
Min reads2: 2
Min var freq: 0.01
Min avg qual: 15
P-value thresh: 0.01
Reading input from /research/sample.mpileup
Error: Invalid format for pileup at line 1
�BC��<�BCFYc1c2c3c4c5c6c7c8c9c10c11c12c13c14c15c16c17c18c19c20c21c22cXcYcMt5/research/Wdemos_work/sample_reordered_sorted.bam%##samtoolsVersion=0.1.17 (r973:277)

Can anyone lend me a hand to figure out why it isn't working? Also, I do not understand why the sorted.bam file in another directory is being referred to in the error message.

This is how I generated my pileup file:
samtools mpileup -q 1 -C50 -DSuf /ref/human_v37.fa /research/sample_reordered_sorted.bam > /research/sample.pileup

thanks

Last edited by wdemos; 03-18-2014 at 05:57 AM.
Old 04-29-2013, 11:59 AM   #8
wdemos
Member
 
Location: Wisconsin

Join Date: Jun 2012
Posts: 26

The -u option was causing the issue: it makes samtools mpileup write BCF output instead of text pileup. I was running it in a wrapper and not technically piping it into VarScan.
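For a wrapper or pipeline, one way to catch this before VarScan runs is to check whether the file starts with the gzip/BGZF magic bytes. This is only a sketch: it assumes the binary output is BGZF-wrapped (which the `BC` ... `BCF` bytes in the error message above suggest), and `looks_like_bgzf` is a hypothetical helper name.

```shell
# Succeed (exit status 0) if the file begins with the gzip/BGZF magic
# bytes 1f 8b, i.e. it is compressed/binary output rather than text pileup.
looks_like_bgzf() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \t\n')
    [ "$magic" = "1f8b" ]
}

# Example use in a wrapper (path taken from the post above):
# if looks_like_bgzf /research/sample.mpileup; then
#     echo "Binary (BCF?) input: rerun samtools mpileup without -u" >&2
# fi
```

A text pileup starts with a chromosome name, so its first two bytes will not match and the check fails, letting the pipeline continue.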
Old 09-03-2013, 07:20 AM   #9
dkoboldt
Member
 
Location: St. Louis

Join Date: Mar 2009
Posts: 62

Thanks for letting me know!
Old 12-03-2013, 08:56 AM   #10
hugorody
Junior Member
 
Location: Vancouver, BC

Join Date: May 2013
Posts: 9
Short solution

Try running this command before you call variants. It drops every line that contains a tab-delimited 0 field, i.e. positions with zero coverage:

sed -n '/\t0\t/!p' file.mpileup > file.mpileup2

Then use file.mpileup2 to call variants.
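An alternative sketch with awk tests the depth column directly instead of matching a pattern anywhere in the line. It assumes a single-sample pileup with depth in column 4, and the function name `drop_zero_depth` is hypothetical. It also drops a zero-depth line that has no trailing tab, which the pattern `\t0\t` would not match, and it does not rely on sed reading `\t` as a tab, which some non-GNU sed implementations may not do.

```shell
# Keep only pileup lines whose depth (column 4) is a positive number.
# Assumption: single-sample pileup; lines with a missing or zero depth
# are dropped.
drop_zero_depth() {
    awk -F'\t' '($4 + 0) > 0' "$1"
}

# Example call (file names from the post above):
# drop_zero_depth file.mpileup > file.mpileup2
```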
Old 03-17-2014, 12:51 PM   #11
bioliyezhang
Member
 
Location: Boston, MA

Join Date: Mar 2011
Posts: 19

Quote:
Originally Posted by vyellapa
I have a similar error and am curious whether the reason for it has been found. I'm running it in a pipeline, and it would be easier if I could apply a remedial step without having to check for the error messages.

Thank you,
Teja

Code:
Error: Invalid format for pileup at line 71
1       10277   C       0
Hi, Teja:

I wonder whether you solved the problem with mpileup. If so, would you mind giving me some suggestions? Thanks.

Best,
Liye
Old 03-17-2014, 01:43 PM   #12
vyellapa
Member
 
Location: phoenix

Join Date: Oct 2011
Posts: 58

Looks like Dan Koboldt fixed it in the next release of VarScan. If you're still getting the error with the new release, I'm not sure what else could help.