<?xml version="1.0" encoding="ISO-8859-1"?>

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
	<channel>
		<title>SEQanswers - Bioinformatics</title>
		<link>http://seqanswers.com/forums/</link>
		<description>Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc</description>
		<language>en</language>
		<lastBuildDate>Wed, 08 Sep 2010 16:47:31 GMT</lastBuildDate>
		<generator>vBulletin</generator>
		<ttl>10</ttl>
		<image>
			<url>http://seqanswers.com/forums/images/misc/rss.jpg</url>
			<title>SEQanswers - Bioinformatics</title>
			<link>http://seqanswers.com/forums/</link>
		</image>
		<item>
			<title>samtools pileup output for SNP detection</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6757&amp;goto=newpost</link>
			<pubDate>Wed, 08 Sep 2010 15:56:15 GMT</pubDate>
			<description>Hi All, 
I aligned 50 bp solexa reads onto a reference genome using bowtie and now I am using samtools pileup -vcf to call SNPs. This is the output...</description>
			<content:encoded><![CDATA[<div>Hi All,<br />
I aligned 50 bp solexa reads onto a reference genome using bowtie and now I am using samtools pileup -vcf to call SNPs. This is the output it generates:<br />
scaffold179	29989	T	A	30	30	60	1	A	B<br />
scaffold179	29991	A	G	4	4	60	1	G	#<br />
scaffold179	29992	A	T	4	4	60	1	T	#<br />
scaffold179	29993	T	A	4	4	60	1	N	#<br />
scaffold179	29994	T	A	4	4	60	1	N	#<br />
scaffold179	29995	A	C	4	4	60	1	C	#<br />
scaffold179	29997	T	A	4	4	60	1	N	#<br />
scaffold179	29998	G	C	4	4	60	1	C	#<br />
scaffold179	29999	T	C	4	4	60	1	C	#<br />
scaffold179	30000	T	A	4	4	60	1	N	#<br />
scaffold179	30001	T	A	4	4	60	1	A	#<br />
scaffold179	30002	G	A	4	4	60	1	A	#<br />
scaffold179	30003	A	G	4	4	60	1	G	#<br />
scaffold179	30004	T	C	4	4	60	1	C	#<br />
scaffold179	30005	T	A	4	4	60	1	A	#<br />
scaffold179	30006	T	A	4	4	60	1	N	#<br />
scaffold179	30007	G	A	4	4	60	1	N	#<br />
scaffold179	30008	T	A	4	4	60	1	N	#<br />
scaffold179	30009	G	A	4	4	60	1	N	#<br />
scaffold179	30010	T	A	4	4	60	1	N	#<br />
scaffold179	30011	T	A	4	4	60	1	N	#<br />
scaffold179	30012	T	A	4	4	60	1	N	#<br />
scaffold179	30013	T	A	4	4	60	1	N	#<br />
scaffold179	30014	G	A	4	4	60	1	N	# <br />
<br />
I understand that the first four columns, from left to right, are:<br />
scaffold name<br />
coordinate<br />
reference base<br />
polymorphic base<br />
<br />
I am confused about the numbers after the 4th column. Could someone explain what they correspond to?<br />
Also, when I try to look at the alignment in tview, it shows only Ns after the first 80 bp. My reference genome does contain some short stretches of Ns but is not all Ns, so why does samtools tview show only Ns?<br />
Any suggestions would be highly appreciated.<br />
Thanks for all the help.</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>Mansequencer</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6757</guid>
		</item>
		<item>
			<title>Lower case characters in FASTA reference sequence</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6756&amp;goto=newpost</link>
			<pubDate>Wed, 08 Sep 2010 15:52:20 GMT</pubDate>
			<description>Hi, I have noticed that in HG19 there are sometimes segments of sequence in lower case. Does this have a special meaning? 
 
Thanks</description>
			<content:encoded><![CDATA[<div>Hi, I have noticed that in HG19 there are sometimes segments of sequence in lower case. Does this have a special meaning?<br />
<br />
Thanks</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>foxyg</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6756</guid>
		</item>
		<item>
			<title>How to create a SAM/BAM file from scratch</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6751&amp;goto=newpost</link>
			<pubDate>Wed, 08 Sep 2010 12:28:15 GMT</pubDate>
			<description>Hi 
Has anybody written a script or programme that creates a SAM or BAM file based on the following alignment information: 
 
1) 454-reads 
2)...</description>
			<content:encoded><![CDATA[<div>Hi<br />
Has anybody written a script or programme that creates a SAM or BAM file based on the following alignment information:<br />
<br />
1) 454-reads<br />
2) read qualities<br />
3) reference sequence(s)<br />
4) read alignment information (e.g. read X aligns from (read-)position X_i to position X_j with reference Y, position Y_i to Y_j)<br />
<br />
Many thanks for suggestions!</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>DNAjunk</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6751</guid>
		</item>
		<item>
			<title>How to manage overlapping paired-end reads?</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6745&amp;goto=newpost</link>
			<pubDate>Wed, 08 Sep 2010 08:38:45 GMT</pubDate>
			<description>Hello, 
This is my first time analysing RNA-Seq data; so I would appreciate your help with a couple of issues. 
My data consist of 8 samples...</description>
			<content:encoded><![CDATA[<div>Hello,<br />
This is my first time analysing RNA-Seq data; so I would appreciate your help with a couple of issues.<br />
My data consist of 8 samples sequenced using Illumina’s standard paired-end RNA-seq protocol (which unfortunately at the time was NOT carried out in a strand-specific manner). The fragment size was 220bp which left around 100 bp after subtraction of adapters (2X58-60bp). Each pair (60 bp) in the dataset therefore has an approximately 20bp overlap.<br />
I would like to use the bowtie/tophat/cufflinks pipeline and have a couple of questions regarding the analysis:<br />
1. Although this post was very informative, I cannot decide which of these is a better strategy for analysing reads in my case:<br />
a. Assembling the paired-end reads into 100bp single reads before…<br />
b. Directly using the paired-end reads in tophat [This generates a negative (-20) inner distance between pairs, but version 1.0.13 onwards seems to be able to handle this scenario (via the -r option, am I right?)].<br />
2. Since the Illumina protocol was not strand-specific, is it a good idea to convert (correct word?) all the resulting mapping data or the sequence reads (after an initial round of mapping) so that it matches a single strand of the genome? I wonder if this strategy will help cufflinks better assemble and quantify the transcripts…<br />
Thank you very much in advance for your help/suggestions/feedback…<br />
Fred</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>FredOnSeq</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6745</guid>
		</item>
		<item>
			<title>variant annotation</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6744&amp;goto=newpost</link>
			<pubDate>Wed, 08 Sep 2010 04:12:12 GMT</pubDate>
			<description>Hi, 
Most of the tools related to variant annotations seem to be related to humans. I work on a plant species where we  only have the draft genome...</description>
			<content:encoded><![CDATA[<div>Hi,<br />
Most of the tools for variant annotation seem to be aimed at human data. I work on a plant species for which we only have a draft genome, which is not yet annotated. I have done an RNA-seq experiment and have found several SNPs using samtools. I got a GTF file by aligning reads against the genome sequence with bowtie and tophat. The GTF file only has transcript information and no ORF or CDS information. Does anyone have a script which takes the positions of the SNPs, annotations from a GTF file and the genome sequence in FASTA format and predicts whether the SNPs are synonymous or non-synonymous?</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>Balat</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6744</guid>
		</item>
		<item>
			<title>SpliceMap 3.3.3.x released: Tool for spliced read alignment</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6742&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 22:34:52 GMT</pubDate>
			<description>Hi Everyone, 
 
SpliceMap 3.3.3.x has been released with many user-requested features and stability/reliability enhancements.  
 
*Website:...</description>
			<content:encoded><![CDATA[<div>Hi Everyone,<br />
<br />
SpliceMap 3.3.3.x has been released with many user-requested features and stability/reliability enhancements. <br />
<br />
<b>Website: </b><a href="http://www.stanford.edu/group/wonglab/SpliceMap/" target="_blank">http://www.stanford.edu/group/wonglab/SpliceMap/</a><br />
<br />
The following is a summary of the major changes.<br />
<ul><li> High specificity in junction identification</li>
<li> Alignments of reads with variable length in both pairs (e.g. after trimming)</li>
<li> Support for concatenated genome files</li>
<li> Bowtie index automatically built if not found</li>
<li> Read quality and names copied to SAM file</li>
<li> Option to run multiple chromosomes at same time</li>
<li> Option to change location of temp and output directories</li>
<li> and more... (see website)</li>
</ul><br />
A useful feature of SpliceMap I would just like to point out is that each junction is tagged with the number of non-redundant reads so that the reliability of each junction can be judged. <br />
<br />
The following is some example output of junctions (not full alignment) from cisGenome browser.<br />
<br />
<img src="http://www.stanford.edu/group/wonglab/SpliceMap/images/cisgenome.png" border="0" alt="" /><br />
<br />
Novel junctions are colored and possibly unreliable ones are colored more lightly. <br />
<br />
Enjoy!</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>john_mu</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6742</guid>
		</item>
		<item>
			<title>rna2map to blastparsed</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6741&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 22:12:19 GMT</pubDate>
			<description><![CDATA[Hello all. I've used rna2map to do an alignment, and now I want to feed the output to various tools that expect BLAST formatted input. How can I do...]]></description>
			<content:encoded><![CDATA[<div>Hello all. I've used rna2map to do an alignment, and now I want to feed the output to various tools that expect BLAST formatted input. How can I do that conversion? Thanks in advance.</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>ilyagoldin</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6741</guid>
		</item>
		<item>
			<title>Scripture score function</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6740&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 19:45:10 GMT</pubDate>
			<description>Hi, Everyone, 
 
I am trying to use Scripture to do transcript abundance estimation, like Cufflinks does. I found there is a score function in the...</description>
			<content:encoded><![CDATA[<div>Hi, Everyone,<br />
<br />
I am trying to use Scripture to do transcript abundance estimation, like Cufflinks does. I found there is a score function in the guide, so I simply used it with a transcription annotation database I have. I then used the RPKM column to calculate the expression of that transcript. But after I compared the results with the output of Cufflinks, they are not correlated. Did I use the wrong column? Does anyone have any suggestions? Thanks a lot!<br />
<br />
<a href="http://www.broadinstitute.org/software/scripture/Score%20task" target="_blank">http://www.broadinstitute.org/softwa...e/Score%20task</a><br />
<br />
Above is the link to the guide I mentioned.</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>wuj</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6740</guid>
		</item>
		<item>
			<title>UCSC for Arabidopsis</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6738&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 19:33:45 GMT</pubDate>
			<description>Hello All, 
 
I would like to load the genome of Arabidopsis in UCSC browser using custom tracks. I am using the gff file available on the internet...</description>
			<content:encoded><![CDATA[<div>Hello All,<br />
<br />
I would like to load the genome of Arabidopsis in UCSC browser using custom tracks. I am using the gff file available on the internet from <a href="ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR9_genome_release/TAIR9_gff3/TAIR9_GFF3_genes_transposons.gff" target="_blank">ftp://ftp.arabidopsis.org/home/tair/...ransposons.gff</a>  <br />
<br />
Does anyone know how to load the Arabidopsis genome in the UCSC browser?<br />
<br />
Thanks,</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>BioTalk</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6738</guid>
		</item>
		<item>
			<title>Bismark Bisulfite Aligner - Now supporting CpG, CHG and CHH context</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6731&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 15:33:03 GMT</pubDate>
			<description>I would like to announce that we have just released a new version of our Bisulfite-seq alignment and methylation calling tool Bismark. All associated...</description>
			<content:encoded><![CDATA[<div>I would like to announce that we have just released a new version of our Bisulfite-seq alignment and methylation calling tool Bismark. All associated files are available for free from <a href="http://www.bioinformatics.bbsrc.ac.uk/projects/" target="_blank">http://www.bioinformatics.bbsrc.ac.uk/projects/</a>.<br />
<br />
As the most noticeable difference, Bismark now further subdivides non-CpG context into CHG and CHH context, which will be especially interesting for researchers working on plant systems. The former characters 'C/c' in the methylation call have been replaced by:<br />
<br />
CHG-context: X / x (methylated / unmethylated)<br />
CHH-context: H / h (methylated / unmethylated)<br />
 <br />
 <br />
In addition, I noticed that due to recent changes in the Bowtie source code, Bismark was producing lots of warnings ('best-first memory chunk exhaustion...'), which was also mentioned here on SEQanswers. As suggested by Ben Langmead, the best way to counteract this problem is to increase the memory size for each bowtie thread, or to mute bowtie. Thus, Bismark now understands the additional option '--chunkmbs &lt;int&gt;' to change the memory from 64 (default) to any integer (I found that 256 or even 512 got rid of nearly all warnings). These warnings were especially frequent in --best mode or for paired-end alignments. Bismark now also understands the '--quiet' option to suppress memory chunk exhaustion (and other) warnings.<br />
<br />
<br />
Some other minor fixes include:<br />
<br />
- FastA files no longer require the file ending &quot;.fa&quot;.<br />
<br />
- Fixed an issue so that Bismark will no longer tolerate chromosomes with the same name when reading the genome into memory.<br />
<br />
- Fixed an issue with paired-end alignment reports.<br />
<br />
- The methylation extractor will by default distinguish between cytosines in the three contexts CpG, CHG or CHH. If this is not needed, CHG and CHH context can be merged into 'non-CpG' context by specifying '--merge_non_CpG'.<br />
<br />
- Due to the fundamental changes in v0.2.0 (CHG and CHH context methylation calls) the methylation extractor will now require that the Bismark mapping result file was generated with the same version of Bismark.<br />
<br />
<br />
If you have any suggestions or comments, I would like to hear from you!<br />
<br />
Best wishes,<br />
<br />
Felix</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>fkrueger</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6731</guid>
		</item>
		<item>
			<title>Compile multi-individual SNP tables from bowtie/samtools mappings</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6730&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 14:07:38 GMT</pubDate>
			<description>I am sure someone must have solved this before, and I wonder if there is not an obviously easy way to do this: 
 
I have GAIIx-RAD data of...</description>
			<content:encoded><![CDATA[<div>I am sure someone must have solved this before, and I wonder if there is not an obviously easy way to do this:<br />
<br />
I have GAIIx-RAD data of bowtie-reference-mapped individuals. The resulting individual SNP data (produced with samtools in the form of pileups) is easily merged into a single table, but one major problem remains:<br />
<br />
The samtools pileup only reports SNPs which differ from the reference sequence, but not those SNPs which happen to be identical to the reference (i.e. those which show differences in other individuals).<br />
<br />
Is there any easy way of discriminating the absence of data from DNA-sequence identity of any given SNP with the reference sequence? <br />
<br />
Any pointers are most welcome! :)</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>KNS</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6730</guid>
		</item>
		<item>
			<title>sequence capture visualization</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6728&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 12:14:59 GMT</pubDate>
			<description>Hello all 
Does anybody know if there is a graphical solution to see the results of a sequence capture experiment. For example to see the targets and...</description>
			<content:encoded><![CDATA[<div>Hello all<br />
Does anybody know if there is a graphical solution for viewing the results of a sequence capture experiment? For example, to see the targets and which regions were covered, and to see the variations compared to a reference file, etc.<br />
Thank you!!!</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>litali</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6728</guid>
		</item>
		<item>
			<title>RNA expression with 454 data</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6727&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 12:12:17 GMT</pubDate>
			<description>Is there any software to analyze 454 cDNA data to detect expression differences between samples?</description>
			<content:encoded><![CDATA[<div>Is there any software to analyze 454 cDNA data to detect expression differences between samples?</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>litali</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6727</guid>
		</item>
		<item>
			<title>hardware requirements to run blastX</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6726&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 12:10:45 GMT</pubDate>
			<description>Hi all, 
Can anybody give me an estimation what are the hardware requirements to have BlastX locally installed run in a reasonable pace and work ok...</description>
			<content:encoded><![CDATA[<div>Hi all,<br />
Can anybody give me an estimate of the hardware requirements for a locally installed BlastX to run at a reasonable pace and work well with 454 data?<br />
Many thanks!!!</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>litali</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6726</guid>
		</item>
		<item>
			<title>GO annotation</title>
			<link>http://seqanswers.com/forums/showthread.php?t=6723&amp;goto=newpost</link>
			<pubDate>Tue, 07 Sep 2010 10:14:29 GMT</pubDate>
			<description>Hi all! 
 
I have a problem with GO annotation of my BLAST results. So far I used Blast2GO, but for several reasons I am a bit unsatisfied with the...</description>
			<content:encoded><![CDATA[<div>Hi all!<br />
<br />
I have a problem with GO annotation of my BLAST results. So far I used Blast2GO, but for several reasons I am a bit unsatisfied with the programme.<br />
In addition I need a good export function because I want to further analyse the data in MS Excel.<br />
<br />
Can anyone recommend a suitable programme which offers:<br />
- annotation of GO terms for .xml files or Swiss-Prot accession IDs (BLAST need not be run there!)<br />
- the possibility to export only level 2 GO terms<br />
- the possibility to export GO terms in tab-delimited form or equivalent for further analysis in Excel (information about category MF, BP, CC should be maintained!)<br />
<br />
Thanks to everyone in advance!</div>

]]></content:encoded>
			<category domain="http://seqanswers.com/forums/forumdisplay.php?f=18">Bioinformatics</category>
			<dc:creator>Ramet</dc:creator>
			<guid isPermaLink="true">http://seqanswers.com/forums/showthread.php?t=6723</guid>
		</item>
	</channel>
</rss>
