SEQanswers

Old 06-03-2009, 12:53 AM   #61
jkbonfield
Senior Member
 
Location: Cambridge, UK

Join Date: Jul 2008
Posts: 146
Default

I'm not sure if it's the case here, but I've noticed the CIGAR string has major issues if you attempt to include gaps in the clipped sequence.

Or rather CIGAR works fine I assume, but samtools does not. (It's not really a big issue as the only time I've seen this happen is someone manually trimming an alignment back.)
Old 06-03-2009, 08:00 AM   #62
mikyatope
Junior Member
 
Location: Barcelona

Join Date: Jun 2009
Posts: 3
Default

Quote:
Originally Posted by zee View Post
Is there a way to convert a SAM consensus output (using -c option for pileup) to the old maq-style .cns consensus?

I have some maq-based pipelines I would like to use on my BWA results.
Maybe it's related:

Is it possible to get the consensus sequence in a simple FASTA format with SAMtools?
Old 06-03-2009, 08:09 AM   #63
jess
Junior Member
 
Location: USA

Join Date: May 2009
Posts: 7
Default

I tried using the -c option, but the pileup output is the same even without this option! I gave a command something like this:


samtools pileup -f ref.fasta aln_sorted.bam -s -c -v >test.pileup

Let me know where I am going wrong!
Old 06-03-2009, 11:23 AM   #64
jess
Junior Member
 
Location: USA

Join Date: May 2009
Posts: 7
Default

OK! So I know where I was going wrong...
The .aln file should be put last, after all the options.
Old 06-04-2009, 04:09 AM   #65
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

samtools.pl now updated at SVN:

http://samtools.svn.sourceforge.net/...pl?view=markup

pileup2fq is implemented, similar to maq's cns2fq. Please note that samtools.pl filters based on the RMS mapping quality (-Q) while maq's cns2fq filters on the maximum mapping quality. Also, pileup2fq masks a small region around a potential indel, but maq's cns2fq does not. The overall accuracy looks similar to maq, though.
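
To make the difference between the two filters concrete, here is a small Python sketch. It is not taken from samtools.pl or maq; the mapping qualities and the cutoff are made-up values chosen only to show that an RMS filter and a maximum filter can disagree at the same position.

```python
import math

def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical mapping qualities of the reads covering one position:
# three ambiguous reads (mapQ 0) and one confident read (mapQ 60).
mapqs = [0, 0, 0, 60]
threshold = 40  # hypothetical cutoff, not a samtools or maq default

print("RMS mapQ:", rms(mapqs))
print("max mapQ:", max(mapqs))
print("passes RMS filter:", rms(mapqs) >= threshold)
print("passes max filter:", max(mapqs) >= threshold)
```

With these numbers the position passes a filter on the maximum mapping quality but fails one on the RMS, because a single confident read cannot outweigh several ambiguous ones in the RMS.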
Old 06-04-2009, 05:54 AM   #66
jess
Junior Member
 
Location: USA

Join Date: May 2009
Posts: 7
Default

Thanks Heng. I will try it and let you know if I get stuck on something.
Old 06-04-2009, 09:36 PM   #67
corthay
Member
 
Location: japan

Join Date: Oct 2008
Posts: 25
Default

Thank you for your speedy response.

I have one more question. I got the following results using bwa (0.4.9), my favorite.
seq-name#0 69 * 0 0 * * 0 0 (sequence) (quality)
seq-name#0 133 * 0 0 * * 0 0 (sequence) (quality)

Neither read is mapped, but the flag bit for "the mate is unmapped" is 0 in both.
How should I interpret it?
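
For readers decoding these values by hand: the FLAG column is a bit field, so 69 and 133 can be split into their individual bits. A minimal Python sketch, using the bit values from the SAM specification (the names `FLAG_BITS` and `decode_flag` are made up for this illustration):

```python
# Bit values as defined in the SAM specification.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
]

def decode_flag(flag):
    """Return the names of the bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]

for flag in (69, 133):
    print(flag, decode_flag(flag))
```

Both records set 0x4 (this read is unmapped) but not 0x8 (the mate is unmapped), which is the inconsistency being asked about.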
Old 06-05-2009, 01:17 AM   #68
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

This is a flaw in bwa when generating SAM. I will fix it.

It is not so easy to generate absolutely correct SAM due to the dependency between fields and between mates. We tried to minimize the dependency in design, but reducing dependency causes inconvenience in other cases. There is always a balance.
Old 06-05-2009, 01:31 AM   #69
corthay
Member
 
Location: japan

Join Date: Oct 2008
Posts: 25
Default

I appreciate that you immediately replied to my question.
I would like to handle the sam format files.
Old 06-09-2009, 07:20 AM   #70
krawitzp
Junior Member
 
Location: berlin

Join Date: Jun 2009
Posts: 1
Default genome likelihood format

Hi,
where can I find further documentation on the genome likelihood format 3.0 ?
thanks,
peter
Old 06-25-2009, 07:36 AM   #71
ElMichael
Member
 
Location: UK

Join Date: Jun 2009
Posts: 31
Default

Hi,
Could anybody please explain the output format of the wgsim_eval.pl script?
I used this script to evaluate an aln.sam file after aligning with BWA.
Quote:
06x 1654169 / 3308330 3308330 5.000e-01
05x 31765 / 63530 3371860 5.000e-01
04x 4938 / 9872 3381732 5.000e-01
03x 163891 / 327252 3708984 5.001e-01
02x 65120 / 129918 3838902 5.001e-01
01x 2669 / 5090 3843992 5.001e-01
00x 113748 / 141416 3985408 5.109e-01
BTW, the BWA manual says: "These reads are mapped with bowtie, bwa, maq and soap... The resultant alignments were then evaluated with wgsim_eval.pl script."
How could I use this script for alignments from other programs such as bowtie or soap?
thanks,
Mike.
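
The numbers in the quoted output suggest one possible reading, offered here as an inference from the arithmetic and not from the script's documentation. The last-but-one column matches the running total of the number after the slash, and the final column matches the running total before the slash divided by the running total after it. A common guess is that the first column is a mapping-quality bin and the two counts are wrong / total alignments in that bin, but that labeling is an assumption. The Python sketch below only checks the arithmetic; `ROWS` and `check` are names made up for this example.

```python
# Rows copied verbatim from the quoted wgsim_eval.pl output.
ROWS = """\
06x 1654169 / 3308330 3308330 5.000e-01
05x 31765 / 63530 3371860 5.000e-01
04x 4938 / 9872 3381732 5.000e-01
03x 163891 / 327252 3708984 5.001e-01
02x 65120 / 129918 3838902 5.001e-01
01x 2669 / 5090 3843992 5.001e-01
00x 113748 / 141416 3985408 5.109e-01
"""

def check(text):
    """For each row, test whether the last two columns are running values."""
    cum_first = cum_second = 0
    results = []
    for line in text.strip().splitlines():
        label, first, _slash, second, cum, rate = line.split()
        cum_first += int(first)
        cum_second += int(second)
        cum_ok = cum_second == int(cum)
        rate_ok = "%.3e" % (cum_first / cum_second) == rate
        results.append((label, cum_ok, rate_ok))
    return results

for label, cum_ok, rate_ok in check(ROWS):
    print(label, "running total matches:", cum_ok, "running ratio matches:", rate_ok)
```

If the wrong / total reading is right, a ratio of about 0.5 in every bin would mean roughly half the alignments are being scored as wrong, which is worth checking against how the reads were named and simulated before drawing conclusions about the aligner.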
Old 06-30-2009, 10:35 AM   #72
gcrdb
Junior Member
 
Location: usa

Join Date: Jan 2009
Posts: 9
Default

Hi, I have trouble converting SAM to BAM. I tried both:

samtools import ref .fai in.sam out.bam
got error:
[sam_header_read2] 22 sequences loaded.
[sam_read1] reference '-143963499' is recognized as '*'.
Parse error at line 1: invalid CIGAR operation
Aborted

samtools view -bt ref .fai -o in.sam out.bam
and got similar error:
[sam_header_read2] 22 sequences loaded.
[sam_read1] reference '' is recognized as '*'.
[main_samview] truncated file.

thanks,
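
One generic way to narrow down a parse error like this is to inspect the first alignment line of the SAM file by hand. The Python sketch below is only an illustration: `check_sam_line` is a made-up helper, its checks are deliberately loose, and it does not replicate the samtools parser. It tests that a line has the 11 mandatory tab-separated fields and that FLAG, POS, MAPQ and CIGAR look plausible. Header lines starting with `@` should be skipped before calling it.

```python
import re

# '*' or one or more <length><operation> pairs. M, I, D, N, S, H and P are
# the classic operations; '=' and 'X' are accepted here for newer files.
CIGAR_RE = re.compile(r"^(\*|(\d+[MIDNSHP=X])+)$")

def check_sam_line(line):
    """Return a list of problems found in one SAM alignment line."""
    problems = []
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        problems.append("expected at least 11 tab-separated fields, got %d"
                        % len(fields))
        return problems
    flag, pos, mapq, cigar = fields[1], fields[3], fields[4], fields[5]
    for name, value in (("FLAG", flag), ("POS", pos), ("MAPQ", mapq)):
        if not value.isdigit():
            problems.append("%s is not a non-negative integer: %r" % (name, value))
    if not CIGAR_RE.match(cigar):
        problems.append("CIGAR looks invalid: %r" % cigar)
    return problems

# Made-up records for demonstration only.
good = "read1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII"
bad = good.replace("\t", " ")  # spaces instead of tabs shift every field
print(check_sam_line(good))
print(check_sam_line(bad))
```

A reference name such as '-143963499' in the error message hints that the fields may not be where the parser expects them, which is the kind of problem a check like this can surface.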
Old 07-07-2009, 07:32 AM   #73
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Lincoln has released SAM/BAM perl APIs a few days (weeks?) ago. It is here:

http://search.cpan.org/~lds/Bio-SamT.../Bio/DB/Sam.pm

Compiling this module requires the samtools C source code. Bio::DB::Sam is known to work with samtools-0.1.4 and 0.1.5 (released today).

BTW, the latest samtools supports opening BAM files over FTP. For example:

samtools tview ftp://ftp.ncbi.nih.gov/1000genomes/f...32.2009_06.bam
Old 07-13-2009, 11:04 AM   #74
gcrdb
Junior Member
 
Location: usa

Join Date: Jan 2009
Posts: 9
Default

The Bio::DB::Sam Perl APIs need to start from BAM files (-bam), not SAM files (there is no "-sam" option at all). I only have SAM files, which come from bwa; all I need is to convert SAM to BAM.
I am stuck with SAM files.....
samtools import ref .fai in.sam out.bam
got error:
[sam_header_read2] 22 sequences loaded.
[sam_read1] reference '-143963499' is recognized as '*'.
Parse error at line 1: invalid CIGAR operation
Aborted

thanks,
Old 07-16-2009, 01:03 PM   #75
ohofmann
Member
 
Location: Melbourne, Australia

Join Date: Jan 2009
Posts: 37
Default

Bit of a newbie question. I've been trying to use the pileup analysis on a BWA dataset. Is there any way to switch off the read bases, read quality and alignment quality information in the output file and get a summarized format instead?

I'm looking at a small number of sequences that have a coverage of 50.000X upwards, and as a result the pileup output becomes almost unmanageable.

Thanks!
Old 07-17-2009, 12:12 AM   #76
zee
NGS specialist
 
Location: Malaysia

Join Date: Apr 2008
Posts: 249
Default

I was looking for the same thing; however, the -s option also gave me the individual bases. I settled on a perl one-liner:

Code:
samtools pileup -cf genome.fasta file.bam | perl -lane 'print join "\t", @F[0..7]'
For more complicated things I am using the Bio::DB::Sam pileup methods.
Hope this helps.
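
For anyone who prefers Python to perl, the same column trimming can be sketched as below. This is an illustrative equivalent, not part of samtools: `first_columns` is a made-up helper that, like `perl -a`, splits on whitespace, keeps the first eight fields and drops whatever follows. The sample line is invented for the demonstration.

```python
def first_columns(line, n=8):
    """Keep the first n whitespace-separated fields, joined by tabs."""
    return "\t".join(line.split()[:n])

# Made-up ten-column line standing in for one row of pileup -c output.
sample = "chr1\t100\tA\tA\t30\t0\t60\t3\t...\tIII"
print(first_columns(sample))
```

To use it as a filter, loop over `sys.stdin` and print `first_columns(line)` for each line, then pipe the `samtools pileup` output into the script.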

Quote:
Originally Posted by ohofmann View Post
Bit of a newbie question. I've been trying to use the pileup analysis on a BWA dataset. Is there any way to switch off the read bases, read quality and alignment quality information in the output file and get a summarized format instead?

I'm looking at a small number of sequences that have a coverage of 50.000X upwards, and as a result the pileup output becomes almost unmanageable.

Thanks!
Old 07-21-2009, 02:22 PM   #77
pparg
Member
 
Location: NY

Join Date: Aug 2008
Posts: 19
Default

I tried samtools tview to browse the alignment results in bam format. However, several keys do not work, including:
Arrows Small scroll movement
J, K Large scroll movement
backspace Scroll back one screen
..
I haven’t checked other keys, but I would like to report what I saw. BTW, are there any other ways besides tview to view alignment results in SAM/BAM format? Thanks a lot!
Old 07-22-2009, 06:03 AM   #78
lparsons
Member
 
Location: NJ

Join Date: Nov 2008
Posts: 28
Default

I have noticed the same behavior with samtools tview. The biggest issue I have is with very deep alignments since I can find no way to scroll up or down in large chunks.

I also have a question about how people generally use tview. I have an alignment from a virus that is very deep (2000x) since it is such a small genome. I suspect this may be why there are MANY indels reported by BWA. However, most of those indels do not pass varFilter. Is there a way to regenerate the SAM/BAM file with the "false" variations removed? In other words, to view the SAM/BAM as filtered by samtools varFilter? Thanks.
Old 07-22-2009, 11:31 AM   #79
jess
Junior Member
 
Location: USA

Join Date: May 2009
Posts: 7
Default

Does anyone know of a way to generate a .snp file from SAM? It is needed as input for the SNPfilter option in samtools.pl, so I was wondering whether there is an option I am not aware of that generates a .snp file. Please let me know ASAP.
Old 07-27-2009, 09:12 PM   #80
iloveneworleans
Member
 
Location: new orleans

Join Date: Jun 2009
Posts: 12
Default

I have a problem converting a SAM file to a BAM file with samtools.
I downloaded the latest version, v0.1.5c,
and my FASTA file is about 3 GB. I got a "Bus error" when I ran the command below:
samtools view -bt h_sapiens.fa -o output.bam h_sapiens.sam

The error message is:
[sam_header_read2] 23631565 sequence loaded.
Bus error

Has anyone met this kind of problem before?
Could you tell me how to solve it?
Thanks a lot.
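
A possible explanation, offered as a guess from the log and not as a confirmed diagnosis: the `-t` option of `samtools view` expects a tab-delimited list of reference names and lengths (for example the .fai file written by `samtools faidx`), not the FASTA file itself, and "23631565 sequence loaded" is far more entries than a human reference contains. The Python sketch below shows what such a name / length list looks like. `reference_lengths` is a made-up helper and the tiny FASTA is invented; in practice `samtools faidx` produces the index directly.

```python
def reference_lengths(fasta_lines):
    """Yield (name, length) for each sequence in an iterable of FASTA lines."""
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Use the first word after '>' as the sequence name.
            words = line[1:].split()
            name, length = (words[0] if words else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length

# Made-up FASTA content for demonstration only.
fasta = [">chr1 example", "ACGT", "AC", ">chr2", "GG"]
for name, length in reference_lengths(fasta):
    print("%s\t%d" % (name, length))
```

If this guess is right, running `samtools faidx h_sapiens.fa` and passing the resulting .fai file to `-t` would be the thing to try.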
All times are GMT -8.