Forum: Bioinformatics
04-24-2018, 08:42 AM
|
Replies: 5
Views: 3,894
|
Forum: Bioinformatics
04-18-2018, 09:46 AM
|
Replies: 703
Views: 271,915
The coordinate system varies across platforms; the physical distance of 1 pixel is much larger on HiSeq 2500 than on HiSeq 3000/4000, for example.
NovaSeq has unique problems, though. The distance...
|
Forum: Bioinformatics
04-17-2018, 04:38 PM
|
Replies: 703
Views: 271,915
Not exactly. reformat.sh has a "mappedonly" flag, but that would only keep the mapped reads. However, you can use requiredbits and filterbits in 2 passes, like this:
reformat.sh in=data.sam...
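The command above is cut off, so the exact masks are unknown. As a rough sketch of the idea only, assuming requiredbits and filterbits behave like the usual required/excluded SAM flag masks (an assumption, not confirmed here), each pass keeps a record when all required bits are set and none of the filtered bits are set. The function name `passes` and the example masks are hypothetical.

```python
# SAM flag bits as defined in the SAM specification.
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def passes(flag, required_bits=0, filter_bits=0):
    """Return True if `flag` has every required bit and no filtered bit.

    Assumed semantics (hypothetical, modeled on common SAM tooling):
    required_bits must all be set, filter_bits must all be clear.
    """
    has_required = (flag & required_bits) == required_bits
    has_filtered = (flag & filter_bits) != 0
    return has_required and not has_filtered


# Illustrative flags only: 0 = both mapped, 4 = read unmapped,
# 8 = mate unmapped, 12 = both unmapped.
flags = [0, 4, 8, 12]

# Pass A: read unmapped, mate mapped.
pass_a = [f for f in flags if passes(f, READ_UNMAPPED, MATE_UNMAPPED)]
# Pass B: read mapped, mate unmapped.
pass_b = [f for f in flags if passes(f, MATE_UNMAPPED, READ_UNMAPPED)]
```

Two passes are needed in a case like this because a single required/filter pair cannot express "either A-without-B or B-without-A".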
|
Forum: Bioinformatics
04-17-2018, 04:31 PM
|
Replies: 703
Views: 271,915
|
Forum: Bioinformatics
04-11-2018, 03:45 PM
|
Replies: 132
Views: 88,840
@chloe - It's normally simplest and most effective to do QC first on the raw data, then anything else (such as merging) later.
@silask - the way you are doing it is currently the most effective...
|
Forum: Bioinformatics
04-04-2018, 09:10 AM
|
Replies: 5
Views: 3,894
|
Forum: Bioinformatics
03-21-2018, 03:24 PM
|
Replies: 703
Views: 271,915
The problem here is that minimap uses old-style cigar strings (M symbol instead of = and X) and also does not produce MD tags. I've added the ability to handle reads in that situation and it will be...
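To illustrate the difference between the two cigar styles, here is a minimal sketch that rewrites M operations as = and X by comparing the read to the reference. It is not the BBTools implementation; the function name `m_to_eqx` is hypothetical, `ref` is assumed to be the reference substring starting at the alignment position, and the comparison is case-sensitive.

```python
import re

# One CIGAR operation: a length followed by an operator character.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def m_to_eqx(cigar, read, ref):
    """Rewrite old-style M operations as = (match) and X (mismatch).

    `ref` is assumed to start at the alignment start (hypothetical input).
    """
    out = []  # list of [length, op], merged as we go

    def push(op, n=1):
        if out and out[-1][1] == op:
            out[-1][0] += n
        else:
            out.append([n, op])

    q = r = 0  # positions in read (query) and reference
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op == "M":
            for _ in range(n):
                push("=" if read[q] == ref[r] else "X")
                q += 1
                r += 1
        else:
            push(op, n)
            if op in "IS":      # consume read only
                q += n
            elif op in "DN":    # consume reference only
                r += n
            elif op in "=X":    # consume both
                q += n
                r += n
            # H and P consume neither
    return "".join(f"{n}{op}" for n, op in out)
```

Without either =/X symbols or an MD tag, a tool has to do this kind of comparison against the reference to find out where the mismatches are.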
|
Forum: Bioinformatics
03-21-2018, 03:21 PM
|
Replies: 244
Views: 139,963
While BBMap was not originally designed for this purpose, I made a version, bbmapskimmer.sh, that does a much better job of finding all mappings above some identity threshold. The usage is the same as...
|
Forum: Bioinformatics
03-21-2018, 03:18 PM
|
Replies: 244
Views: 139,963
BBMerge might be able to help in this case, if you have paired reads with a sufficient number of short inserts. You can run it like this:
bbmerge.sh in1=read1.fq in2=read2.fq outa=adapters.fa
...
|
Forum: Bioinformatics
03-21-2018, 03:13 PM
|
Replies: 703
Views: 271,915
|
Forum: Bioinformatics
03-20-2018, 09:07 AM
|
Replies: 41
Views: 44,045
Reformat won't do that, but you can use partition.sh:
partition.sh in=X.fa out=X%.fa ways=10
That will produce 10 output files with an equal number of sequences and no duplication.
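For anyone curious what that kind of split looks like, here is a small sketch that deals FASTA records round-robin into N bins, so each sequence lands in exactly one bin and bin sizes differ by at most one. This is only an illustration of the idea, not how partition.sh works internally; `read_fasta` and `partition_round_robin` are hypothetical helper names.

```python
def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def partition_round_robin(records, ways):
    """Deal records into `ways` bins in turn; no record is duplicated."""
    bins = [[] for _ in range(ways)]
    for i, record in enumerate(records):
        bins[i % ways].append(record)
    return bins


# Tiny in-memory example: 25 made-up sequences split 10 ways.
example_lines = []
for i in range(25):
    example_lines.append(f">seq{i}")
    example_lines.append("ACGT")
bins = partition_round_robin(read_fasta(example_lines), 10)
sizes = [len(b) for b in bins]
```

When the number of sequences is not a multiple of the number of bins, the first few bins simply end up one record larger.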
|
Forum: Bioinformatics
01-04-2018, 10:27 AM
|
Replies: 12
Views: 5,640
I concur; 17 is really too short for this purpose. When trying to estimate genome size, it's important for most of the kmers to be unique (aside from long perfect repeats); so, kmer lengths greater...
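A toy sketch of the uniqueness point: count the kmers in a sequence and report what fraction of the distinct kmers occur exactly once. The function name `unique_fraction` and the tiny example sequence are made up for illustration; real data would also need canonical (reverse-complement-aware) kmers and error handling, which this skips.

```python
from collections import Counter


def unique_fraction(seq, k):
    """Fraction of distinct kmers of length k that occur exactly once."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    if not counts:
        return 0.0
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(counts)


# "ACGTACGT" contains a 4-base repeat, so short kmers collide
# while kmers longer than the repeat are all unique.
short_k = unique_fraction("ACGTACGT", 2)
long_k = unique_fraction("ACGTACGT", 5)
```

Here the short kmers mostly repeat while the longer ones are all unique, which is the same effect, in miniature, that makes short kmers a poor choice on a real genome.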
|
Forum: Bioinformatics
10-13-2017, 10:14 AM
|
Replies: 104
Views: 51,311
Hi Gopo,
I don't particularly recommend Tadpole for diploid (or higher) genomes, as it has absolutely no capability of dealing with heterozygous sites. However, it's really fast, so even with a...
|
Forum: Bioinformatics
10-11-2017, 05:52 PM
|
Replies: 132
Views: 88,840
As GenoMax says, trimming to Q30 is not beneficial before merging reads. BBMerge has some internal quality-trimming options, so it can try to merge, then quality-trim if it is unsuccessful, then try...
|
Forum: Bioinformatics
10-11-2017, 01:43 PM
|
Replies: 64
Views: 51,923
|
Forum: Bioinformatics
10-11-2017, 12:48 PM
|
Replies: 132
Views: 88,840
Hi Ashu,
"Ambiguous" means there are multiple possible overlaps. For example, if read 1 and read 2 both end with "ACACACACACACACACACACAC", there are lots of possible overlap frames, none of which...
|
Forum: Illumina/Solexa
10-11-2017, 12:29 PM
|
Replies: 26
Views: 13,500
I have not looked into that yet. Actually, I don't even know if we are spiking PhiX into our NovaSeq runs, but that rate is worth examining, after I find out whether there is actually any PhiX...
|
Forum: Bioinformatics
10-11-2017, 12:24 PM
|
Replies: 1
Views: 2,202
I downloaded NA12878 from NIST, and they also have validated sets of small variations, but I didn't really find them all that useful. If anyone has validated CNV sets for those it would be NIST. ...
|
Forum: Illumina/Solexa
10-10-2017, 01:07 AM
|
Replies: 26
Views: 13,500
It only works for applications that are not sensitive to crosstalk. Personally, I would never multiplex samples of the same genus on a NovaSeq unless all libraries had dual unique barcodes. The...
|
Forum: Illumina/Solexa
10-09-2017, 05:59 PM
|
Replies: 26
Views: 13,500
|
Forum: Illumina/Solexa
10-09-2017, 02:06 PM
|
Replies: 26
Views: 13,500
|
Forum: Bioinformatics
10-09-2017, 01:57 PM
|
Replies: 703
Views: 271,915
Hi Gopo,
Yes, I will add that (as an option). Is that common practice in other variant-callers? Note that callvariants.sh does currently have a "PF" (pass filter) field per sample, but I want to...
|
Forum: Bioinformatics
10-05-2017, 12:35 PM
|
Replies: 703
Views: 271,915
|
Forum: Illumina/Solexa
10-05-2017, 12:24 PM
|
Replies: 26
Views: 13,500
|
Forum: General
10-05-2017, 12:19 PM
|
Replies: 15
Views: 6,624
RAM is often the limiting factor in bioinformatics computing. I would not recommend buying a computer that you plan to use for bioinformatics with only 16 GB RAM unless it will be dedicated to some...
|