SEQanswers

Showing results 1 to 25 of 26
Search: Posts Made By: brofallon
Forum: Bioinformatics 04-26-2013, 08:31 AM
Replies: 4
Views: 1,657
Posted By brofallon
Huh... that -u option on samtools view does seem...

Huh... that -u option on samtools view does seem to speed things up a bit. Stream sorter is still a bit faster for two reasons. First, it doesn't depend on piping anything through samtools view to...
Forum: Bioinformatics 04-25-2013, 01:46 PM
Replies: 4
Views: 1,657
Posted By brofallon
A faster read sorter

Hello,
I recently developed a fast stream-based .SAM sorter, streamsorter (https://github.com/brendanofallon/StreamSorter). It reads .SAM output from a stream (for instance, produced by bwa or...
Forum: Bioinformatics 12-31-2012, 01:12 PM
Replies: 15
Views: 7,100
Posted By brofallon
I believe the two most commonly used tools are...

I believe the two most commonly used tools are ANNOVAR and SnpEff. ANNOVAR handles many types of annotations and is built for filtering. SnpEff produces some nice HTML files for your web-viewing...
Forum: Bioinformatics 12-21-2012, 08:46 AM
Replies: 8
Views: 5,346
Posted By brofallon
Thanks again, ekg. The funkiness in the samtools...

Thanks again, ekg. The funkiness in the samtools ROC curve was due to indels that I thought I had removed, but actually hadn't. It's corrected now and overall the curve seems very similar to...
Forum: Bioinformatics 12-20-2012, 09:50 AM
Replies: 10
Views: 1,813
Posted By brofallon
For our part, if we find a questionable variant...

For our part, if we find a questionable variant in NGS data, and then Sanger sequence the area and find that the variant is not there, we typically assume the variant is a false positive and go no...
Forum: Bioinformatics 12-19-2012, 03:53 PM
Replies: 10
Views: 4,197
Posted By brofallon
Keep in mind that it's unlikely that there is a...

Keep in mind that it's unlikely that there is a phylogenetic tree that underlies the data. Recombinations are likely to make the trees differ from SNP to SNP, so taking a bunch of SNPs and forcing...
Forum: Bioinformatics 12-19-2012, 01:29 PM
Replies: 8
Views: 5,346
Posted By brofallon
Hi there, I haven't tried to do anything with...

Hi there,
I haven't tried to do anything with RNA-seq yet, so I'm not sure how well it would work. I think you're right that training on a RNA-seq specific dataset would probably be the best idea....
Forum: Bioinformatics 12-19-2012, 07:12 AM
Replies: 2
Views: 982
Posted By brofallon
Not sure what you mean by "rate of SNPs" - number...

Not sure what you mean by "rate of SNPs" - number of SNPs in a region of interest? Rate at which they accumulate over time?
dbSNP is probably not your friend in either case (lots of false...
Forum: Bioinformatics 12-19-2012, 06:56 AM
Replies: 8
Views: 5,346
Posted By brofallon
Thanks - indeed the labels on that figure should...

Thanks - indeed the labels on that figure should be switched (I always reverse 'em...), and I'll be sure to cite FreeBayes.
I should have mentioned there's a default 'model' file on the github...
Forum: Bioinformatics 12-19-2012, 06:12 AM
Replies: 8
Views: 5,346
Posted By brofallon
SNPSVM has fewer false positives per true...

SNPSVM has fewer false positives per true positive call than other tools - about 50% fewer for the data sets I've been examining (mostly NA12878). For pretty much any level of sensitivity it has...
Forum: Bioinformatics 12-18-2012, 10:36 AM
Replies: 10
Views: 1,813
Posted By brofallon
Sanger

We Sanger suspected variants pretty frequently here and we don't get too many false negatives. Ion Torrent is another next-gen method and is both expensive and probably not much more reliable than...
Forum: Bioinformatics 12-18-2012, 10:25 AM
Replies: 8
Views: 5,346
Posted By brofallon
SNPSVM: An accurate, machine-learning based variant caller

Howdy all,
I've been working on a support-vector machine based variant caller, and it's finally at the point where others may find it useful. It calls variants from .BAM or .SAM files in a manner...
Forum: Bioinformatics 11-26-2012, 05:21 PM
Replies: 3
Views: 2,114
Posted By brofallon
Take a look at the last column. The bit before...

Take a look at the last column. The bit before the first colon reads 0/1 or 1/1. 0/1 implies that heterozygous is the most likely genotype at the position, 1/1 means that homozygous non-reference is...
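For anyone who wants to pull that field out programmatically, here is a minimal Python sketch. The function name `genotype_call` and the sample strings in the usage loop are made up for illustration, and it assumes the sample column begins with the GT subfield, which is the usual VCF convention.

```python
def genotype_call(sample_field):
    """Classify the GT subfield (the bit before the first colon)."""
    gt = sample_field.split(":", 1)[0]
    # Alleles are separated by "/" (unphased) or "|" (phased).
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "missing"
    if all(allele == "0" for allele in alleles):
        return "homozygous reference"
    if len(set(alleles)) == 1:
        return "homozygous non-reference"
    return "heterozygous"


# Hypothetical sample columns, not taken from any real VCF.
for field in ("0/1:12,10:22", "1/1:0,30:30", "./."):
    print(field, "->", genotype_call(field))
```

This only looks at the called genotype; the remaining colon-separated subfields (depths, likelihoods and so on) depend on the FORMAT column and are ignored here.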
Forum: Bioinformatics 11-09-2012, 07:50 AM
Replies: 1
Views: 2,254
Posted By brofallon
NA12878 truth sets?

Howdy all, I'm searching for high-quality sets of variant calls for NA12878 to be used as a gold standard. Anyone have ideas of where to look for a super high quality NA12878 variant call set? Anyone...
Forum: Bioinformatics 08-06-2012, 10:21 AM
Replies: 4
Views: 1,868
Posted By brofallon
Oops - indeed that's 62 alternate and 6...

Oops - indeed that's 62 alternate and 6 reference, not the other way, my mistake.
Nonetheless, FS is still very high - that's a phred-scaled p-value so it certainly looks significant. But FS...
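To make the "phred-scaled p-value" point concrete, here is a small Python sketch of the standard phred conversion, p = 10^(-Q/10). The function name `phred_to_probability` is an assumption for this example, and the FS values in the loop are illustrative rather than taken from the thread.

```python
def phred_to_probability(phred_score):
    """Convert a phred-scaled value back to a probability: p = 10^(-Q/10)."""
    return 10 ** (-phred_score / 10)


# Illustrative values only: a larger FS corresponds to a smaller p-value.
for fs in (10, 30, 60):
    print(f"FS = {fs}: p = {phred_to_probability(fs):.1e}")
```

Under this conversion an FS above 30 corresponds to a p-value below 0.001.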
Forum: Bioinformatics 08-06-2012, 05:30 AM
Replies: 4
Views: 1,868
Posted By brofallon
Low quality?

Well, this doesn't really answer your question, but there's a decent chance that variant isn't real. It's very strand-biased (FS is over 30), and the variant allele depth is very low (only 6 of 68...
Forum: Bioinformatics 07-03-2012, 01:50 PM
Replies: 7
Views: 4,919
Posted By brofallon
I don't really think it's possible to convert a...

I don't really think it's possible to convert a fasta file directly into BAM. BAM files are meant to store many short reads with associated quality scores, but fasta is just a listing of a single...
Forum: Bioinformatics 03-22-2012, 02:52 PM
Replies: 5
Views: 3,741
Posted By brofallon
I believe that, by default, reads with zero mapping...

I believe that, by default, reads with zero mapping quality are ignored by the UnifiedGenotyper. If you're curious, you could filter out such reads using the PrintReads tool and the MappingQualityZero...
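As a rough stand-in for that kind of filtering outside GATK, the Python sketch below drops MAPQ-0 records from plain-text SAM lines. It is not the GATK filter itself; the function name `drop_mapq_zero` and the example records are hypothetical, and it relies only on the SAM convention that header lines start with "@" and that MAPQ is the fifth tab-separated column.

```python
def drop_mapq_zero(sam_lines):
    """Yield header lines and alignment records whose MAPQ is above zero."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines pass through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 5 (index 4) of a SAM record holds the mapping quality.
        if int(fields[4]) > 0:
            yield line


# Hypothetical records for illustration.
example = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchr1\t100\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t37\t4M\t*\t0\t0\tACGT\tIIII\n",
]
for kept in drop_mapq_zero(example):
    print(kept, end="")
```

Comparing calls made with and without these reads is one way to see whether they matter for a given data set.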
Forum: Bioinformatics 03-21-2012, 06:21 AM
Replies: 10
Views: 4,197
Posted By brofallon
If you concat the SNPs, and therefore ignore all...

If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under...
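A toy calculation shows why the branch lengths inflate. The Python sketch below uses the raw proportion of differing sites (p-distance) as a crude proxy for branch length; the sequences and the helper names `p_distance` and `variable_sites_only` are invented for this example, and real tree-building programs use substitution models rather than raw proportions.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs / len(seq_a)


def variable_sites_only(seqs):
    """Keep only the alignment columns that vary (a SNP-only alignment)."""
    columns = [col for col in zip(*seqs) if len(set(col)) > 1]
    if not columns:
        return ["" for _ in seqs]
    return ["".join(chars) for chars in zip(*columns)]


# Made-up alignment: 96 invariant sites followed by 4 variable ones.
full = ["A" * 96 + "CCCC", "A" * 96 + "CCGG", "A" * 96 + "GGGG"]
snps = variable_sites_only(full)

print("full alignment:", p_distance(full[0], full[1]))
print("SNPs only:     ", p_distance(snps[0], snps[1]))
```

Dropping the 96 invariant columns shrinks the denominator from 100 to 4, so the same two differences look like a far larger per-site distance.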
Forum: Bioinformatics 03-20-2012, 06:41 AM
Replies: 0
Views: 1,519
Posted By brofallon
BAM file cleaning

Howdy,
I've recently noticed that I seem to be getting a fair number of false positive SNP calls due to reads with unmapped mates that appear to be duplicates (the reads all start and end at the...
Forum: Bioinformatics 03-20-2012, 06:34 AM
Replies: 10
Views: 4,197
Posted By brofallon
Most phylogeny estimation tools (phylip, phyml,...

Most phylogeny estimation tools (phylip, phyml, paup*, MrBayes, *BEAST etc) require their input to be in fasta or phylip format. SNPs alone are tricky for those tools since there's a lot of ignored...
Forum: Bioinformatics 02-24-2012, 11:05 AM
Replies: 5
Views: 2,287
Posted By brofallon
As I mentioned in the OP, it looks to be an...

As I mentioned in the OP, it looks to be an off-by-9997 issue, but only for the first 5.64M bases. After that, a long string of N's appears in the reference fasta, and after that shifting by 9997...
Forum: Bioinformatics 02-24-2012, 08:04 AM
Replies: 5
Views: 2,287
Posted By brofallon
Position 4 was an arbitrary example... Using the...

Position 4 was an arbitrary example... Using the data you listed above, the vcf indicates that the reference should be "T" at position 60523, but this doesn't seem to be the case with the UCSC...
Forum: Bioinformatics 02-24-2012, 06:01 AM
Replies: 5
Views: 2,287
Posted By brofallon
Reference that matches vcf?

I recently downloaded the reference (hg19) chromosome 10 fasta file from UCSC, and naively assumed that this reference would match the reference alleles in a vcf from 1000 genomes. For instance, if...
Forum: Bioinformatics 12-06-2011, 11:49 AM
Replies: 2
Views: 8,341
Posted By brofallon
Create vcf tribble index?

Hi, I'm using the GATK to call variants, and afterward I somehow managed to delete some of the .vcf.idx index files. Without those files the GATK can't read the vcf files. Is there a way to (re-)...
All times are GMT -8.