-
Progress. I'm now repeating myself but:
What does "samtools idxstats" tell you? This should summarise how many reads were mapped - perhaps very few mapped.
Also the first few lines in SAM format would be interesting for diagnosis, try "samtools view example.bam | head -n 50" for the first 50 lines. If you post them here, wrap it with [ code ] and [ /code ] tags to get it to display nicely on the forum.
Comment
-
Originally posted by litali
It says: "loading SAM/BAM index files is not supported ....sorted.bam.bai..Load the SAM/ BAM files directly"
When I click "ok" it says "zoom in to see the alignments" but even if I zoom in, it is empty.
It does sound like there is a problem with the BAM file from Newbler - which is why I was suggesting looking at the file with samtools idxstats etc.
Comment
-
samtools, igv and more
When I try samtools idxstats, I receive "command not found". Indeed, the help menu of the samtools version I installed lists only:
Program: samtools (Tools for alignments in the SAM format)
Version: 0.1.6 (r453)
Usage:   samtools <command> [options]
Command: view        SAM<->BAM conversion
         sort        sort alignment file
         pileup      generate pileup output
         faidx       index/extract FASTA
         tview       text alignment viewer
         index       index alignment
         fixmate     fix mate information
         glfview     print GLFv3 file
         flagstat    simple stats
         calmd       recalculate MD/NM tags and '=' bases
         merge       merge sorted alignments (Picard recommended)
         rmdup       remove PCR duplicates (Picard recommended)
Maybe it is because this is an old version?
However, when I look at the first lines of the file as you suggested, I receive:
[ code ]
F01BJ5E01C2F11 16 chrX 138807104 100 16M1I6M1D3M1I54M1I1D104ATCTATTATGTATCAAATAAAAATAAATAAAATATTGTTCACTATTTTTTCTAGTGATAACTCAAACTAGAATCCAAAACTGCTAACAATATACTAAAGGTCAACTTCAGTGAAAAGTGAATTGGGCCGAGCGCAGAGGCTCACACCTGTAATCCCAGCACTTTGGGAAGCCGAGGCGGGCGGACA ***622....26644...(4......47======>??>6222==9:::;::1111;====5566:===8;111167=:777;;;==;77788=@?????????AABAA????AA??????A@????????><9979<<997779979==<46:1106886:<562222285888::8:9822222;
F01BJ5E01DMYWU 0 chrX 138807511 100 12M1I2M1I159M1D50M A AAAAAGAAAAAAGTAGAATTGTTTACTTTCTGTGCACCTTTTCTACTGGGACAGTCTTTCTGAGAAGTGCTTTAAGTGCATGTTAGAAGCCAGGGACATTTAGCCAGGCGTGGTGGCACGTGACTGTAGTCCCAGCTACTCAGGAGGCTGAGGTGGGAGGATCTCTTGAACGGGAAGTCAAGGCTGCAGTGAGCTATGATCATGCCACTGCACTCCAGCCTGGG C@@<<<@@CCCC@@@A@>>>@A43//////6*222--AA-@@@==<000<>---4/64<<<>>////3-<<<<<<<<<<AAA<3304;????:9:ABB444156@CCCCAAACAAAAAAAAAAA@@?AAACCCCCCCCCCCCCCCCCCAAACCA?@@CCCCCCCCCCCCCCCCCCCCCCCC?:77CCB>9999>>92222259AAAAAACCCCCCCCCCCCCCC@
[ /code]
etc...
In IGV, if I load only the BAM file, it says "zoom in to see the alignments" but it is empty...
Comment
-
Originally posted by litali
When I try samtools idxstats, I receive "command not found". Indeed, the help menu of the samtools version I installed lists only:
Program: samtools (Tools for alignments in the SAM format)
Version: 0.1.6 (r453)
...
P.S. Once you have updated samtools, rebuild the index. The newer versions include more information in the index which is used by samtools idxstats and some viewers too.
P.P.S. You need to leave the spaces out of the [ code ] and the closing tag [ /code ] - I'm using them here otherwise you wouldn't see them, just their formatting effect.
UPDATE - see end
However, there does seem to be a problem with this 1st read: columns 3 to 5 say it maps to chrX at position 138807104 with quality 100, yet the FLAG in column 2 is 16 (0x10 in hex), which I read as reverse complemented but not mapped.
Code:F01BJ5E01C2F11 16 chrX 138807104 100 16M1I6M1D3M1I54M1I1D104 ATCTATTATGTATCAAATAAAAATAAATAAAATATTGTTCACTATTTTTTCTAGTGATAACTCAAACTAGAATCCAAAACTGCTAACAATATACTAAGGTCAACTTCAGTGAAAAGTGAATTGGGCCGAGCGCAGAGGCTCACACCTGTAATCCCAGCACTTTGGGAAGCCGAGGCGGGCGGACA ***622....26644...(4......47======>??>6222==9:::;::1111;====5566:===8;111167=:777;;;==;77788=@?????????AABAA????AA??????A@????????><9979<<997779979==<46:1106886:<562222285888::8:9822222;
Code:F01BJ5E01DMYWU 0 chrX 138807511 100 12M1I2M1I159M1D50M AAAAAAGAAAAAAGTAGAATTGTTTACTTTCTGTGCACCTTTTCTACTGGGACAGTCTTTCTGAGAAGTGCTTTAAGTGCATGTTAGAAGCCAGGGACATTTAGCCAGGCGTGGTGGCACGTGACTGTAGTCCCAGCTACTCAGGAGGCTGAGGTGGGAGGATCTCTTGAACGGGAAGTCAAGGCTGCAGTGAGCTATGATCATGCCACTGCACTCCAGCCTGGG C@@<<<@@CCCC@@@A@>>>@A43//////6*222--AA-@@@==<000<>---4/64<<<>>////3-<<<<<<<<<<AAA<3304;????:9:ABB444156@CCCCAAACAAAAAAAAAAA@@?AAACCCCCCCCCCCCCCCCCCAAACCA?@@CCCCCCCCCCCCCCCCCCCCCCCC?:77CCB>9999>>92222259AAAAAACCCCCCCCCCCCCCC@
UPDATE - Apologies, I misread the FLAG. Whether a read is mapped is recorded in FLAG bit 0x4 (set means unmapped), but I had the meaning backwards. FLAGs of 0 and 16 are both mapped (forward and reverse strand respectively). So you should see something on chrX if you zoom in.
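For anyone who wants to double check a FLAG value by hand, here is a minimal Python sketch of the bit test described above. The bit values are those of the SAM specification; the function names (`describe_flag`, `is_mapped`) are just illustrative and not part of samtools.

```python
# Bit values of the SAM FLAG field (column 2), per the SAM specification.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def describe_flag(flag):
    """Return the names of all bits that are set in a SAM FLAG."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def is_mapped(flag):
    """A read is mapped when the 0x4 (unmapped) bit is NOT set."""
    return not (flag & 0x4)


# The FLAG values seen in this thread (0 and 16), plus 4 for contrast.
for flag in (0, 16, 4):
    print(flag, "mapped" if is_mapped(flag) else "unmapped", describe_flag(flag))
```

With this, 0 and 16 both come out as mapped (16 additionally sets the reverse strand bit), while 4 is the unmapped case.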
Comment
-
Bam
OK, I installed the newer version of samtools, and idxstats now gives the following:
chrX 154913754 47196 0
* 0 0 0
I also attach a few more reads from the first lines of the sam format:
Code:F01BJ5E01DBVMG 0 chrX 138807718 100 1M1D34M1I2M1I10M1I6M1I1M1I1M1D1M1I81M1I10M1I1M1I26M1I17M * 0 0 CGCACTCCAGCCTGGGCAACAGAGCAAAACCCTGTAGTACAAAAAAAAAGAAAAAAAGACGGCAGCAGCCAGGGATATGAATTAGGAGTGGGGTGGGTAGAGAGTGAGTGGGGCTGCTGGAGACAATGTTCCCATGGCACTGAACCCTGGTTAAACCAGTCTTTGAGCAAGTACTATCACTTGTCTGTAATTCCTTCTTCC =922229;=9::<<<====;:8000308686633223000388658003..55:::82020602.........2'''''''0'866600009966000069996600000034.....---3.,.3...783...334...444473......97.......474.......44.......+444....434433----38 F01BJ5E01C26ZG 0 chrX 138807757 100 3M10D96M1D1I118M1I2M AAAAAGCAGCAGCAGCCAGGGATATGAATTAGGAGTGGGGTGGGTAGAGAGTGAGTGGGGCTGCTGGAGACAATGTTCCCATGGCACTGACCCTGGTTAGCAGTCTTTGAGCAAGTACTATCACTTGCTGTAATTCCTTCTTCCTCATCCTTTGCTCCTTTTGAATATGATGATTTCTAGGAATGAACCTTCTTTATGACACATGCTGTATATTATTTTGG D=======ABDDAAA@ACCCCCCC??FA?CCCDEEEDDCCCCCCCCCCCBAAAACCCCCCCCCCC555@CCCCCCCCCCBCCCCAAABCCCCCCCCCCCCCCCCAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC@@@??CCCCCBBBBCB;:444?BBBB:::B;;;;CCB;;;@> F01BJ5E01B5WV8 0 chrX 138807757 100 3M10D3M1I77M1D15M1D1I118M1I2M * 0 0 AAAAAGACAGCAGCAGCCAGGGATATGAATTAGGAGTGGGGTGGGTAGAGAGTGAGTGGGGCTGCTGGAGACAATGTTCCCATGCACTGACCCTGGTTAGCAGTCTTTGAGCAAGTACTATCACTTGCTGTAATTCCTTCTTCCTCATCCTTTGCTCCTTTTGAATATGATGATTTCTAGGAATGAACCTTCTTTATGACACATGCTGTATATTATTTTGG =922229;<9::::85888888651111.3)86<<=====<99::???<<<>>??=>>==<<<<<<+++<><??????@????<888<>><<<=:;::;=7722277777====<<<??????<<<><888:;;====66566:666:::==>??>>6677====<<<<9;:<<633468.......444234363-00::49;;;;;;A>>>>>?>>>;: F01BJ5E01CD9VD 16 chrX 138807757 100 5M8D52M1I49M1I88M1I2M1D26M * 0 0 AAAAAAAGCAGCAGCAGCCAGGGATATGAATTAGGAGTGGGGTGGGTAGAGAGTGAGGTGGGGCTGCTGGAGACAATGTTCCCATGGCACTGACCCTGGTTAACAGTTCTTTGAGCAAGTACTATCACTTGCTGTAATTCCTTCTTCCTCATCCTTTGCTCCTTTTGAATATGATGATTTCTAGGAATGAACCTTCCTTATGACACATGCTGTATATTATTTGGT 
888<>ACCCCAA/95==59999AA>>99<9//--7////<<<770009A>>>>>AA>>>999999AAAA9::::AAAAAB@90033@@?@@@?ACCCCCA;;;;@AAA?@9;;99;;999>>??A;;043>>AA?1;;;;<<@?BBBCCCA@@<<<@A?@@@@@@@A??99<<<<AAACCCCCCCACCCC?@77=<CCCCCCCCAAAACCC@@<<;AAC<<<<<C F01BJ5E01DMLRK 0 chrX 138807792 100 3M1I3M1I119M1I2M1D2M1I1D19M1D1M1I6M1I25M * 0 0 GAACTTACGGAGTGGGGTGGGTAGAGAGTGAGTGGGGCTGCTGGAGACAATGTTCCCATGGCACTGACCCTGGTTAACAGTCTTTGAGCAAGTACTATCACTTGCTGTAATTCCTTCTTCCTCATCCGTTGCCCCTTTTGAATATGATGATTCCTAGGAAATGAACCTTCTTTATGACACATGCTG ;2222299<777:<:=<:::82003220220022658556.0***080000620202....222822225:<00280000299;======???<<<<====;86000006974...83433:666664444799.............79............2......34444............. F01BJ5E01DZ0XI 0 chrX 138807831 100 114M1I40M * TGGAGACAATGTTCCCATGGCACTGACCCTGGTTAACAGTCTTTGAGCAAGTACTATCACTTGCTGTAATTCCTTCTTCCTCATCCTTTGCTCCTTTTGAATATGATGATTTCTTAGGAATGAACCTTCTTTATGACACATGCTGTATATTATTT <8222288:8::8997<<9999?=??????A?<==<?@?:::::::<??=>><>>>22229589802299===>>>=?????@?>>>>=;76666990000000====;;98899;.........9799;;99;;11..06777779944.....
So, do you think it is a problem with the BAM file?
Thank you a lot!!!
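As a reading aid for the idxstats output above: each line holds the reference name, its length, the number of mapped reads, and the number of unmapped reads, with the final "*" line covering reads that have no reference assigned. A small Python sketch (the helper name `parse_idxstats` is made up for illustration) that totals the counts:

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` output into (name, length, mapped, unmapped) tuples."""
    rows = []
    for line in text.strip().splitlines():
        name, length, mapped, unmapped = line.split()
        rows.append((name, int(length), int(mapped), int(unmapped)))
    return rows


# The two lines reported in this thread (idxstats output is tab separated).
example = "chrX\t154913754\t47196\t0\n*\t0\t0\t0\n"

rows = parse_idxstats(example)
total_mapped = sum(mapped for _, _, mapped, _ in rows)
total_unmapped = sum(unmapped for _, _, _, unmapped in rows)
print("mapped:", total_mapped, "unmapped:", total_unmapped)
```

Read this way, the numbers in the post say 47196 reads are mapped to chrX and none are unmapped, so the BAM file does contain alignments.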
Comment
-
Apologies, I misread the FLAG. Whether a read is mapped is recorded in FLAG bit 0x4 (set means unmapped), but I had the meaning backwards. FLAGs of 0 and 16 are both mapped (forward and reverse strand respectively).
So you should see something on chrX if you zoom in - try the area where those reads are mapped, coordinate 138807718.
Comment
-
Bam
Thank you a lot!
I can now see the alignments in IGV. However, I think I only see the coverage. How can I see the SNPs or the reads themselves? I now see where the reads are mapped, but it is just grey lines with black dots. If I zoom in more I only see the bases in the reference but not in the reads themselves.
Comment
-
Originally posted by litali
Thank you a lot!
I can now see the alignments in IGV. However, I think I only see the coverage. How can I see the SNPs or the reads themselves? I now see where the reads are mapped, but it is just grey lines with black dots. If I zoom in more I only see the bases in the reference but not in the reads themselves.
By default IGV shows the reads in grey where they match the reference. I think you can right click to change the colour scheme.
Alternatively, you might find another viewer more intuitive - I like Tablet http://bioinf.hutton.ac.uk/tablet/ although currently IGV handles inserts in SAM/BAM better.
Originally posted by litali
I found how to see the bases, thanks!! Another question: what does the "cigar" line with a number following it mean?
Also, is it possible to now upload this sorted BAM file to UCSC and see it in the viewer there?
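On the CIGAR question: in SAM format the CIGAR string (column 6) is a run of length plus operation pairs describing how the read lines up against the reference, for example M for aligned bases, I for an insertion in the read and D for a deletion from the reference. A rough Python sketch, using one of the CIGAR strings posted earlier in this thread (the function names are illustrative only):

```python
import re

# One CIGAR element is a length followed by a single operation letter.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_ELEMENT.findall(cigar)]


def reference_span(cigar):
    """Bases covered on the reference: M, D, N, = and X consume the reference."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


def query_length(cigar):
    """Bases expected in the read: M, I, S, = and X consume the read."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")


cigar = "12M1I2M1I159M1D50M"  # read F01BJ5E01DMYWU from earlier in the thread
print(parse_cigar(cigar))
print("reference span:", reference_span(cigar))
print("read length:", query_length(cigar))
```

So this example reads as 12 aligned bases, a 1 base insertion, 2 aligned, another 1 base insertion, 159 aligned, a 1 base deletion, then 50 aligned.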
Comment