Forum: Bioinformatics
04-24-2018, 09:42 AM
|
Replies: 5
Views: 2,879
|
Forum: Bioinformatics
04-18-2018, 10:46 AM
|
Replies: 679
Views: 215,032
The coordinate system varies across platforms; the physical distance of 1 pixel is much larger on HiSeq 2500 than on HiSeq 3000/4000, for example.
NovaSeq has unique problems, though. The distance...
|
Forum: Bioinformatics
04-17-2018, 05:38 PM
|
Replies: 679
Views: 215,032
Not exactly. reformat.sh has a "mappedonly" flag, but that would only keep the mapped reads. However, you can use requiredbits and filterbits in 2 passes, like this:
reformat.sh in=data.sam...
|
Forum: Bioinformatics
04-17-2018, 05:31 PM
|
Replies: 679
Views: 215,032
|
Forum: Bioinformatics
04-11-2018, 04:45 PM
|
Replies: 132
Views: 75,754
@chloe - It's normally simplest and most effective to do QC first on the raw data, then anything else (such as merging) later.
@silask - the way you are doing it is currently the most effective...
|
Forum: Bioinformatics
04-04-2018, 10:10 AM
|
Replies: 5
Views: 2,879
|
Forum: Bioinformatics
03-21-2018, 04:24 PM
|
Replies: 679
Views: 215,032
The problem here is that minimap uses old-style cigar strings (M symbol instead of = and X) and also does not produce MD tags. I've added the ability to handle reads in that situation and it will be...
|
Forum: Bioinformatics
03-21-2018, 04:21 PM
|
Replies: 244
Views: 112,980
While BBMap was not originally designed for this purpose, I made a version that does a much better job of finding all mappings above some identity threshold: bbmapskimmer.sh. The usage is the same as...
|
Forum: Bioinformatics
03-21-2018, 04:18 PM
|
Replies: 244
Views: 112,980
BBMerge might be able to help in this case, if you have paired reads with a sufficient number of short inserts. You can run it like this:
bbmerge.sh in1=read1.fq in2=read2.fq outa=adapters.fa
...
|
Forum: Bioinformatics
03-21-2018, 04:13 PM
|
Replies: 679
Views: 215,032
|
Forum: Bioinformatics
03-20-2018, 10:07 AM
|
Replies: 39
Views: 33,185
Reformat won't do that, but you can use partition.sh:
partition.sh in=X.fa out=X%.fa ways=10
That will produce 10 output files with an equal number of sequences and no duplication.
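To illustrate the idea of splitting a FASTA file into parts with no duplication, here is a minimal Python sketch that deals records out round-robin. It is not how partition.sh is implemented; the function names `read_fasta` and `partition_round_robin` and the toy sequences are assumptions made for this example. When the record count is not divisible by the number of parts, the part sizes in this sketch differ by at most one.

```python
def read_fasta(lines):
    """Yield (name, sequence) tuples from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def partition_round_robin(records, ways):
    """Deal records into `ways` bins in turn, so each record lands in exactly one bin."""
    bins = [[] for _ in range(ways)]
    for i, record in enumerate(records):
        bins[i % ways].append(record)
    return bins


# Hypothetical toy input: five records, one of them wrapped over two lines.
fasta = """>s1
ACGT
>s2
GG
CC
>s3
TTTT
>s4
A
>s5
C
""".splitlines()

bins = partition_round_robin(read_fasta(fasta), ways=2)
sizes = [len(b) for b in bins]
```

With five records and `ways=2`, the first bin receives s1, s3 and s5 and the second receives s2 and s4.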
|
Forum: Bioinformatics
01-04-2018, 11:27 AM
|
Replies: 12
Views: 4,536
I concur; 17 is really too short for this purpose. When trying to estimate genome size, it's important for most of the kmers to be unique (aside from long perfect repeats); so, kmer lengths greater...
|
Forum: Bioinformatics
10-13-2017, 11:14 AM
|
Replies: 104
Views: 41,680
Hi Gopo,
I don't particularly recommend Tadpole for diploid (or higher) genomes, as it has absolutely no capability of dealing with heterozygous sites. However, it's really fast, so even with a...
|
Forum: Bioinformatics
10-11-2017, 06:52 PM
|
Replies: 132
Views: 75,754
As GenoMax says, trimming to Q30 is not beneficial before merging reads. BBMerge has some internal quality-trimming options, so it can try to merge, then quality-trim if it is unsuccessful, then try...
|
Forum: Bioinformatics
10-11-2017, 02:43 PM
|
Replies: 64
Views: 39,998
|
Forum: Bioinformatics
10-11-2017, 01:48 PM
|
Replies: 132
Views: 75,754
Hi Ashu,
"Ambiguous" means there are multiple possible overlaps. For example, if read 1 and read 2 both end with "ACACACACACACACACACACAC", there are lots of possible overlap frames, none of which...
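As a rough illustration of why a repetitive read end gives many candidate overlap frames, the following Python sketch counts exact suffix/prefix overlaps between two sequences. This is not BBMerge's algorithm: it assumes read 2 has already been oriented to the same strand as read 1, allows no mismatches, and uses a hypothetical `overlap_frames` helper and made-up reads.

```python
def overlap_frames(a, b, min_overlap=8):
    """Return every overlap length n where the last n bases of `a` equal the first n bases of `b`."""
    longest = min(len(a), len(b))
    return [n for n in range(min_overlap, longest + 1) if a[-n:] == b[:n]]


# Hypothetical reads sharing a 22-base dinucleotide repeat.
repeat = "AC" * 11
read1 = "GGTTCAGT" + repeat
read2_oriented = repeat + "TTGACCAG"
repeat_frames = overlap_frames(read1, read2_oriented)

# Hypothetical reads sharing one non-repetitive 8-base overlap.
unique_frames = overlap_frames("AAAACCCCGGGGTTTTACGTAGCT", "ACGTAGCTGGGGAAAA")
```

Here the repetitive pair matches at every even overlap length from 8 to 22 (eight equally good frames), while the non-repetitive pair matches at a single frame of length 8.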
|
Forum: Illumina/Solexa
10-11-2017, 01:29 PM
|
Replies: 26
Views: 11,694
I have not looked into that yet. Actually, I don't even know if we are spiking PhiX into our NovaSeq runs, but that rate is worth examining, after I find out whether there is actually any PhiX...
|
Forum: Bioinformatics
10-11-2017, 01:24 PM
|
Replies: 2
Views: 1,340
I downloaded NA12878 from NIST, and they also have validated sets of small variations, but I didn't really find them all that useful. If anyone has validated CNV sets for those it would be NIST. ...
|
Forum: Illumina/Solexa
10-10-2017, 02:07 AM
|
Replies: 26
Views: 11,694
It only works for applications that are not sensitive to crosstalk. Personally, I would never multiplex samples of the same genus on a NovaSeq unless all libraries had dual unique barcodes. The...
|
Forum: Illumina/Solexa
10-09-2017, 06:59 PM
|
Replies: 26
Views: 11,694
|
Forum: Illumina/Solexa
10-09-2017, 03:06 PM
|
Replies: 26
Views: 11,694
|
Forum: Bioinformatics
10-09-2017, 02:57 PM
|
Replies: 679
Views: 215,032
Hi Gopo,
Yes, I will add that (as an option). Is that common practice in other variant-callers? Note that callvariants.sh does currently have a "PF" (pass filter) field per sample, but I want to...
|
Forum: Bioinformatics
10-05-2017, 01:35 PM
|
Replies: 679
Views: 215,032
|
Forum: Illumina/Solexa
10-05-2017, 01:24 PM
|
Replies: 26
Views: 11,694
|
Forum: General
10-05-2017, 01:19 PM
|
Replies: 15
Views: 5,679
RAM is often the limiting factor in bioinformatics computing. I would not recommend buying a computer that you plan to use for bioinformatics with only 16 GB RAM unless it will be dedicated to some...
|