Has anyone used any SNP analysis pipelines other than the standard variant pipeline? Any suggestions on good ones to try?
-
Originally posted by cmm8cmm8: Has anyone used any SNP analysis pipelines other than the standard variant pipeline? Any suggestions on good ones to try?
The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.
Here you will find libraries for several platforms, such as Affy, Agilent, and Illumina.
I hope this helps.
-
Originally posted by manoj.b: We use most of the libraries from Bioconductor.
The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.
Here you will find libraries for several platforms, such as Affy, Agilent, and Illumina.
I hope this helps.
Q: What do you consider "standard"? Only the vendors' pipelines? How about MAQ?
I'm testing NextGENe currently, which is supposedly designed for SNP detection (well, really for mutation detection). Would love to hear what other people use.
-
For SOLiD data, we use BFAST (admittedly my own aligner) [https://secure.genome.ucla.edu/index.php/BFAST]. The output of that is converted to SAM format (for use with samtools) [http://samtools.sourceforge.net/].
We then call SNPs with samtools, which uses the MAQ consensus model, and tune its parameters (by training on known data) to reach the true-positive rate (TPR) and false-positive rate (FPR) we want for heterozygous calls.
Nils
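To make the "train on known data" idea concrete, here is a minimal sketch of a threshold sweep, not the actual procedure used above. It assumes you already have a quality score per heterozygous call and a set of sites with known genotypes (for example from an array). The names `call_quality`, `known_het`, `tpr_fpr`, and `pick_threshold` are hypothetical, and the toy numbers are made up for illustration.

```python
def tpr_fpr(call_quality, known_het, threshold):
    """Return (TPR, FPR) for het calls at a given quality threshold.

    call_quality: dict site -> quality of the het call (missing = not called)
    known_het:    dict site -> True if the site is truly het, False otherwise
    """
    tp = fp = fn = tn = 0
    for site, is_het in known_het.items():
        called = call_quality.get(site, 0) >= threshold
        if called and is_het:
            tp += 1
        elif called:
            fp += 1
        elif is_het:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


def pick_threshold(call_quality, known_het, thresholds, max_fpr):
    """Most sensitive threshold whose FPR is at or below max_fpr, else None."""
    best = None
    for t in thresholds:
        tpr, fpr = tpr_fpr(call_quality, known_het, t)
        if fpr <= max_fpr and (best is None or tpr > best[0]):
            best = (tpr, t)
    return None if best is None else best[1]


# Toy example (made-up sites and scores).
known_het = {"chr1:100": True, "chr1:200": True, "chr1:300": False,
             "chr1:400": False, "chr1:500": True}
call_quality = {"chr1:100": 50, "chr1:200": 20, "chr1:300": 25, "chr1:500": 40}

for t in (10, 30, 60):
    print(t, tpr_fpr(call_quality, known_het, t))
print("chosen:", pick_threshold(call_quality, known_het, (10, 30, 60), max_fpr=0.0))
```

The same loop works for any single parameter you can re-run the caller with; for several parameters at once you would sweep a grid and keep the TPR/FPR pair for each combination.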
-
Nilshomer,
I recognize that your data is SOLiD, but I was wondering about your method for consensus calling, in which you "train on known data" to find the best parameter settings.
I, too, am interested in doing such a thing. I have a 1M SNP Illumina array and exome next-gen data from the Illumina GA2. What type of data did you train on?
Which parameters did you find needed the most tweaking?
Did you also find that the number of variants called by MAQ (or samtools, in your case) was very high? I get >180,000 variants in the cns.filter.snp file when using the parameters from easyrun. This seems like far too many, but I'm having difficulty distinguishing the real variants from the false positives.
Looking forward to hearing your input...
-
We are novice bioinformaticists, so we use CLC bio's Genomics Workbench. The DIP (deletion-insertion polymorphism) algorithm works well. The SNP algorithm definitely detects known SNPs, and we are optimizing the settings for the best sensitivity and specificity. So far, if we maximize specificity by looking at the X and Y chromosomes, where SNP calls should be homozygous for male DNA samples, sensitivity drops and we miss too many known SNPs. Relaxing the criteria gives us better sensitivity, but we get too many false positives.
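The male X-chromosome check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not what CLC's software does internally: calls are assumed to be available as `(chrom, pos, is_het)` tuples, the chromosome is assumed to be named `chrX`, and `par_regions` is a hypothetical list of pseudoautosomal intervals you would fill in for your own reference build (heterozygous calls are legitimately possible there, so they are skipped).

```python
def in_any_region(pos, regions):
    """True if pos falls inside any (start, end) interval, ends inclusive."""
    return any(start <= pos <= end for start, end in regions)


def male_x_het_fraction(calls, par_regions=()):
    """Fraction of chrX calls outside par_regions that are heterozygous.

    calls: iterable of (chrom, pos, is_het) tuples from a male sample.
    Returns None if there are no usable chrX calls.
    """
    total = het = 0
    for chrom, pos, is_het in calls:
        if chrom != "chrX" or in_any_region(pos, par_regions):
            continue
        total += 1
        het += bool(is_het)
    return het / total if total else None


# Toy example with made-up coordinates; the PAR interval is a placeholder.
calls = [
    ("chrX", 500, True),    # inside the placeholder PAR, ignored
    ("chrX", 5000, False),
    ("chrX", 6000, True),   # het call where none is expected
    ("chrX", 7000, False),
    ("chrX", 8000, False),
    ("chr1", 100, True),    # autosomal, ignored
]
print(male_x_het_fraction(calls, par_regions=[(1, 1000)]))
```

Tracking this fraction next to the number of known SNPs recovered at each setting gives one number for specificity and one for sensitivity, which makes the trade-off easier to see.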
-
Hi,
Without using LifeTech's BioScope/LifeScope, I think the following pipeline can be applied to SOLiD data for SNP/indel detection:
1) *.csfasta + *.qual / *.XSQ -> SAM/BAM: BFAST, BWA, or NovoalignCS
2) SAM/BAM -> SNP/indel detection: SAMtools or GATK (more accurate)
3) Annotation: GATK or ANNOVAR
I think SAMtools and GATK do not use color-space information to detect SNPs/indels. Using it is one of the advantages of BioScope/LifeScope.
Last edited by HiroMishima; 10-31-2011, 04:44 PM.
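As a small illustration of why color-space information can help, here is a sketch that assumes the usual SOLiD two-base encoding, where each color is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3). The function names are hypothetical. Under that assumption, an isolated SNP changes two adjacent colors, while a single color miscall shifts every decoded base after it, so the two cases look different in color space but not necessarily after conversion to bases.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def to_colors(seq):
    """Encode a base sequence as colors: XOR of adjacent 2-bit base codes."""
    codes = [BASE_CODE[b] for b in seq]
    return [a ^ b for a, b in zip(codes, codes[1:])]


def decode(first_base, colors):
    """Decode colors back to bases, starting from a known first base."""
    code = BASE_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color
        bases.append(CODE_BASE[code])
    return "".join(bases)


def color_diff_positions(ref, read):
    """Indices at which the color encodings of two equal-length sequences differ."""
    pairs = zip(to_colors(ref), to_colors(read))
    return [i for i, (x, y) in enumerate(pairs) if x != y]


ref = "ACGTAC"
snp = "ACGAAC"                            # single base change at index 3
print(to_colors(ref))                     # colors of the reference
print(color_diff_positions(ref, snp))     # a SNP touches two adjacent colors

miscalled = to_colors(ref)
miscalled[2] = 0                          # one wrong color in the read
print(decode("A", miscalled))             # every base after the error changes
```

A caller that only sees base-space alignments cannot use the "two adjacent colors" signature directly, which is the limitation mentioned above.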