#1 |
Member
Location: Germany Join Date: Jul 2010
Posts: 11
Hi, all
Every now and then, when I try to convert a .sam file into a .bam file by calling
Code:
samtools view -bT hg.fa -o xxx.bam xxx.sam
I get this error:
Code:
[main_samview] fail to open file for reading.
The beginning of the .sam file looks like this:
Code:
@HD VN:1.0 SO:sorted
@PG ID:TopHat VN:1.0.13 CL:/scratch/ngsvin/ruping/CancerGenomics/tophat-1.0.13/bin/tophat -o /scratch/ngsvin/RNA-seq/MPI-NF/mimik_pairend/ --solexa1.3-quals -p 5 -r 46 --mate-std-dev 14 --segment-length 20 -G /scratch/ngsvin/RNA-seq/MPI-NF/Hs.genes.gff /scratch/ngsvin/ruping/CancerGenomics/bowtie-0.12.5/indexes/hg18 s_4_1fq.chopped s_4_2fq.chopped
Run0009Lane4Tile57x3887y5410Multi0 65 chr1 461 255 36M = 154912309 154911848 CTAACCCTGGCGGTACCCTCAGCCGGCCCGCCCGCC GGAEGGGGGFGGFGDGGGGG?FFFFGFGGGFGGGFG NM:i:1
Run0009Lane4Tile28x19254y9909Multi0 73 chr1 537 0 36M * 0 0 ACCACCGAAATCTGTGCAGAGGAGAACGCAGCTCCG CGGDGGGFGGFGGGGGFGGGGGGFGGGGEGGGGGGG NM:i:1
Run0009Lane4Tile119x16602y20937Multi0 161 chr1 2792 255 36M = 3160 403 CTACAAGCAGCAAACAGTCTGCATGGGTCATCCCCT FEFFFFEFFFFFFFFCFDFFEFAFFFFEFFEDFFED NM:i:0
Run0009Lane4Tile48x11762y17580Multi0 147 chr1 3112 255 36M = 3130 -17 TGCCAGCATAGTGCTCCTGGACCAGCGATACGCCCG EGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG NM:i:2
Run0009Lane4Tile24x15875y8494Multi0 83 chr1 3113 255 36M = 3120 -28 GCCAGCATAGTGCTCCTGGACCAGCGATACGCCCGG 3>:.@+,31@56/?50;>CBB0)6@766-67/6@77 NM:i:2
In contrast, I did successfully convert some other .sam files into .bam files, and their headers look exactly the same as the one above. The only difference may be the file size. The .sam file above is very big (10 GB), but I have sufficient memory to load it (>250 GB). So it is quite confusing to me that I keep getting an error like this. I tried to understand the C code in sam.c, but I couldn't figure out what the problem is. Can anyone help me? Thanks a lot!
#2 |
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
Have you tried taking just the start of this big SAM file (i.e. the header and, say, the first 20 reads)? This should tell you whether the problem is the header rather than the file size.
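A minimal sketch of that test, assuming the SAM file is plain text with all header lines (those starting with `@`) at the top. The function name `sam_head` is made up for illustration and is not part of samtools. It keeps the whole header, however long it is, plus the first N reads:

```shell
# sam_head: print the full header plus the first N alignment records.
# Usage: sam_head <in.sam> [n_reads]   (n_reads defaults to 20)
# Hypothetical helper, not a samtools command.
sam_head() {
    awk -v n="${2:-20}" '
        /^@/ { print; next }     # header lines always pass through
        ++seen > n { exit }      # stop once n reads have been printed
        { print }                # print the current read
    ' "$1"
}
```

For example, `sam_head xxx.sam 20 > test.sam` writes a small file that can then be passed to the same `samtools view` command as the full file.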
#3 |
Senior Member
Location: Rockville, MD Join Date: Jan 2009
Posts: 126
Try -bT <in.bam> -o <out.sam>
#4 | |
Member
Location: Germany Join Date: Jul 2010
Posts: 11
That's a good point. I tried it, and it works for the small chopped file:
Code:
head -100 xxx.sam >test.sam
samtools view -bT hg.fa test.sam >test.bam
[sam_header_read2] 25 sequences loaded.
So does that mean I cannot convert large SAM files into BAM?
#5 |
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
So at least you know the header is OK. It could be that there is a corrupt or otherwise problematic read later in the SAM file. Can you break the SAM file into chunks to explore this possibility?
I'd also suggest adding some debug statements to samtools, recompiling, and re-testing.
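One way to do the chunking, sketched under the assumption that every chunk needs its own copy of the header so it can be converted independently. The function name `sam_chunks` and the output naming scheme are invented for this example:

```shell
# sam_chunks: split <in.sam> into files of at most <n> reads each,
# copying the header into every chunk.
# Usage: sam_chunks <in.sam> <n> <prefix>   (n must be >= 1)
# Output files: <prefix>0001.sam, <prefix>0002.sam, ...
# Hypothetical helper, not a samtools command.
sam_chunks() {
    awk -v n="$2" -v prefix="$3" '
        /^@/ { header = header $0 "\n"; next }   # collect header lines
        (reads++ % n) == 0 {                      # every n reads, start a new chunk
            if (out != "") close(out)
            out = sprintf("%s%04d.sam", prefix, ++chunk)
            printf "%s", header > out
        }
        { print > out }                           # append the read to the current chunk
    ' "$1"
}
```

Running `sam_chunks xxx.sam 1000000 chunk_` and then converting each `chunk_*.sam` separately would show whether one particular chunk fails (pointing to a bad read) or whether every chunk converts cleanly (pointing back to the size of the whole file).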
#6 | |
Member
Location: Germany Join Date: Jul 2010
Posts: 11
#7 |
Member
Location: Iowa City, IA Join Date: Jul 2010
Posts: 95
Code:
samtools import hg.fa xxx.sam xxx.bam
#8 |
Member
Location: Germany Join Date: Jul 2010
Posts: 11
#9 |
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
"samtools view -S" reads in a SAM file, "samtools view" (without the "-S") does not.
#10 | |
Member
Location: Germany Join Date: Jul 2010
Posts: 11
I "headed" different numbers of lines into a new file and tested whether the conversion works. I found:
Code:
head -13394305 xxx.sam >head.sam
samtools view -bST hg18.fa head.sam -o head.bam
[sam_header_read2] 25 sequences loaded.
head -13394306 xxx.sam >head.sam
samtools view -bST hg18.fa head.sam -o head.bam
[main_samview] fail to open file for reading.
Interestingly, if I look at the difference in file size:
Code:
-rw------- 1 ruping xxx 2.0G Aug 4 17:42 head.sam (for 13394305 lines)
-rw------- 1 ruping xxx 2.1G Aug 4 17:43 head.sam (for 13394306 lines)
So, what do you think?
Last edited by ruping; 08-04-2010 at 09:08 AM.
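The two sizes appear to straddle 2 GiB (2^31 = 2,147,483,648 bytes). One possible reading, not confirmed anywhere in this thread, is that the failing binary cannot open files larger than a signed 32-bit file offset can address. A small sketch to check which side of that limit a file falls on; the function name `exceeds_limit` is made up for this example:

```shell
# exceeds_limit: succeed (exit status 0) if <file> is larger than <limit> bytes.
# Usage: exceeds_limit <file> [limit]
# The default limit is 2^31 - 1, the largest value a signed 32-bit offset can hold.
# Hypothetical helper, not a samtools command.
exceeds_limit() {
    size=$(wc -c < "$1" | tr -d ' ')      # file size in bytes
    [ "$size" -gt "${2:-2147483647}" ]
}
```

For example, `exceeds_limit head.sam && echo "over 2 GiB"` could be run on both versions of `head.sam` to see whether the failure lines up exactly with that boundary.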
#11 |
Member
Location: Ann Arbor, MI Join Date: Oct 2008
Posts: 57
I had a similar issue with tview, where it couldn't find the .bai index file. Running samtools index [whatever] fixed the issue.
#12 |
Member
Location: Germany Join Date: Jul 2010
Posts: 11
I should mention that the version of samtools I'm using is 0.1.8.
Something interesting happened: I tried another version of samtools (0.1.7-6 (r530)), and now it works! But this doesn't give me a scientific explanation...
Code:
/home/somebody/samtools/samtools view -bST hg18.fa head.sam -o head.bam
[sam_header_read2] 25 sequences loaded.
#13 |
Member
Location: Nashville Join Date: Oct 2009
Posts: 14
Hi ruping,
Quote:
So that means I can not convert large sam files into bam?
I think you can convert SAM files of any size to BAM. I have converted a SAM file larger than 100 GB.
Wu