SEQanswers

Showing results 1 to 25 of 156
Search: Posts Made By: arvid
Forum: Bioinformatics 06-13-2012, 11:00 PM
Replies: 4
Views: 2,950
Posted By arvid
There is one caveat with the original script line...

There is one caveat with the original script line I posted; if there is no corresponding entry in the dictionary file for a given "Name", it will be replaced by an empty string - which might not be...
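A minimal Python sketch of one way around that caveat; this is not the original script line. It assumes the dictionary file holds tab-separated old/new name pairs and that `Name=` is the last attribute on each GFF line, as the original post requires. The function names `load_dict` and `rename_gff_line` are made up for illustration.

```python
import re

def load_dict(lines):
    """Build an old-name -> new-name mapping from tab-separated lines (assumed format)."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        old, new = line.split("\t", 1)
        mapping[old] = new
    return mapping

# Assumption: "Name=<value>" is the last attribute on the line.
NAME_RE = re.compile(r"Name=([^;\t]*)$")

def rename_gff_line(line, mapping):
    """Replace the Name value if the dictionary has an entry; otherwise keep it."""
    line = line.rstrip("\n")
    match = NAME_RE.search(line)
    if match is None:
        return line  # no trailing Name attribute: leave the line untouched
    old = match.group(1)
    new = mapping.get(old, old)  # fall back to the old name, not an empty string
    return line[:match.start(1)] + new
```

Falling back to the old name is only one possible policy; writing unmatched names to a log would work just as well if you want to review them.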
Forum: Bioinformatics 06-13-2012, 10:54 PM
Replies: 4
Views: 2,950
Posted By arvid
Some versions of sed/awk won't interpret \t as a...

Some versions of sed/awk won't interpret \t as a tab character, that's why I prefer to use shell string expansion in scripts to make them work - $'\t' would be expanded to a tab character by the...
Forum: Bioinformatics 06-13-2012, 12:50 AM
Replies: 4
Views: 2,950
Posted By arvid
A simple and relatively fast script would be the...

A simple and relatively fast script would be the following, assuming a sorted dictionary file (dict.txt) containing the search/replace pairs AND that the "Name" GFF field is the last on each line to...
Forum: Bioinformatics 06-07-2012, 06:20 AM
Replies: 7
Views: 1,734
Posted By arvid
IMHO you do neither 1 nor 2, but instead display...

IMHO you do neither 1 nor 2, but instead display the BAM in GBrowse directly and/or use a bigWig track to display coverage when zoomed out. Importing GFF will make the database huge and probably lag...
Forum: Bioinformatics 06-07-2012, 01:20 AM
Replies: 36
Views: 26,837
Posted By arvid
I just noticed that pysam.Samfile.pileup() only...

I just noticed that pysam.Samfile.pileup() only gives me primary read alignments that overlap a given position, no secondary alignments. Is there a reason for this behaviour?

I do get all the...
Forum: Bioinformatics 06-06-2012, 11:24 PM
Replies: 42
Views: 11,531
Posted By arvid
While I do care about performance of the software...

While I do care about performance of the software I use, it won't help me at all if my alignment is finished in 5 minutes instead of an hour, if it isn't as accurate and sensitive.

So I don't...
Forum: Bioinformatics 06-06-2012, 10:48 PM
Replies: 7
Views: 1,734
Posted By arvid
Gene predictors and older short read aligners...

Gene predictors and older short read aligners (pre-SAM). Many tools made within the framework of GMOD use GFF(2/3) to exchange functional information about genomes. You'll find more information on...
Forum: Bioinformatics 06-06-2012, 03:12 AM
Replies: 30
Views: 8,545
Posted By arvid
You wouldn't need to spend that much more money...

You wouldn't need to spend that much more money to get a decent server. We just got a 64 core server (4xAMD 6276) with 512 GB ECC DDR3-1600 RAM (32x16 GB) for ~14 000 . With 256 GB RAM (32x8 GB), it...
Forum: Bioinformatics 06-06-2012, 03:01 AM
Replies: 30
Views: 8,545
Posted By arvid
Why not buy cloud computing time or run it on...

Why not buy cloud computing time or run it on XSEDE Blacklight or something similar if this is a one-off thing? Or ask someone in your neighbourhood with a beefy server to collaborate on that...
Forum: Bioinformatics 06-05-2012, 04:27 AM
Replies: 7
Views: 3,086
Posted By arvid
Doesn't seem like it, then they'd be skewed but...

Doesn't seem like it, then they'd be skewed but not qualitatively different like they are. Look at the quality manually for a few reads - they should all have a bad quality at pos 20 if FastQC is...
Forum: Bioinformatics 06-05-2012, 12:52 AM
Replies: 17
Views: 9,198
Posted By arvid
For Trinity, you'd want to combine that into one...

For Trinity, you'd want to combine that into one file, it should be able to recognize the pairs on its own (might have changed recently though, as a paired end mapping step was introduced which might...
Forum: Bioinformatics 06-05-2012, 12:20 AM
Replies: 17
Views: 9,198
Posted By arvid
I'd second the suggestion to try Trinity on that...

I'd second the suggestion to try Trinity on that dataset. You could reduce your dataset with diginorm, if necessary, though 81 million reads (pairs?) sounds reasonable to tackle with a ~64 GB server -...
Forum: Bioinformatics 06-04-2012, 03:53 AM
Replies: 14
Views: 1,929
Posted By arvid
Depending on the application, it is often...

Depending on the application, it is often interesting in RNA-Seq to be able to distinguish if a read comes from a sense (i.e. protein coding) or antisense (possibly regulatory) transcript....
Forum: Bioinformatics 06-04-2012, 02:39 AM
Replies: 1
Views: 1,593
Posted By arvid
If you are able to do a bit of Python coding on...

If you are able to do a bit of Python coding on your own you could use the bcbio BioPython module for GFF parsing/writing from the following page; there is a description on how to convert GenBank...
Forum: Bioinformatics 06-01-2012, 05:05 AM
Replies: 2
Views: 1,789
Posted By arvid
Well, actually they are extending each other....

Well, actually they are extending each other. What you are referring to aren't actually new databases; they are new builds of a database.
A new build consists of the older known entries plus new...
Forum: Bioinformatics 05-31-2012, 11:51 PM
Replies: 7
Views: 1,734
Posted By arvid
I'm not aware of a direct converter, but...

I'm not aware of a direct converter, but theoretically you could feed a BAM into BEDTools BamToBed and push the BED through GenomeTools bed_to_gff3.
I would probably write a converter script myself...
Forum: Bioinformatics 05-31-2012, 11:33 PM
Replies: 4
Views: 1,573
Posted By arvid
I'd agree with GenoMax, if you don't know for...

I'd agree with GenoMax, if you don't know for sure now that a major piece of software you will be running can use the GPU, use the money for more RAM, cores, SSDs or storage instead.
Forum: Bioinformatics 05-29-2012, 06:30 AM
Replies: 2
Views: 2,536
Posted By arvid
These won't do soft clipping, however; you'd have...

These won't do soft clipping, however; you'd have to parse their output and do some wild guesswork on the clippings. I would rather take the source and make a fork instead which outputs soft-clipped...
Forum: Bioinformatics 05-29-2012, 06:22 AM
Replies: 4
Views: 2,124
Posted By arvid
You can check the mapping in a BAM file with...

You can check the mapping in a BAM file with "samtools idxstats file.bam"; that will tell you the number of reads mapping to each chromosome.
If the mapping looks fine, try breaking up all the...
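If you want to look at those per-chromosome counts in a script, a small Python sketch could parse the idxstats output. It relies on the four tab-separated columns idxstats prints (reference name, sequence length, mapped reads, unmapped reads); the function name `parse_idxstats` is made up for illustration.

```python
def parse_idxstats(text):
    """Return {reference name: mapped read count} from `samtools idxstats` output."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Columns: reference name, sequence length, mapped, unmapped
        name, _length, mapped, _unmapped = line.split("\t")
        counts[name] = int(mapped)
    return counts
```

The text could come from something like `subprocess.run(["samtools", "idxstats", "file.bam"], capture_output=True, text=True).stdout`, assuming samtools is installed and the BAM file is indexed. The `*` row holds reads without a reference, so its mapped count is 0.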
Forum: Bioinformatics 05-29-2012, 04:33 AM
Replies: 1
Views: 1,635
Posted By arvid
I'm not sure what you mean by average, if you...

I'm not sure what you mean by average, if you talk about each position; do you mean each precise position or some kind of window?

Have a look at "bedtools coverage" and "bedtools genomecov"; I...
Forum: Bioinformatics 05-25-2012, 04:01 AM
Replies: 3
Views: 3,223
Posted By arvid
Yes, basically. I'd really replace the CIGAR with...

Yes, basically. I'd really replace the CIGAR with a * though, to make the SAM/BAM file standardized. You might want to check how to decide on which side of the read sequence in the BAM is 5'/3' (in...
Forum: Bioinformatics 05-25-2012, 03:37 AM
Replies: 3
Views: 3,223
Posted By arvid
Define "retain mapping information"; do you need...

Define "retain mapping information"; do you need the information provided by the CIGARs? If you don't, such trimming is straightforward with a simple script chopping off the bases and qualities in...
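A rough Python sketch of that kind of trimming on a single SAM text line, under some loud assumptions: it chops `n` characters off the right end of SEQ and QUAL as stored in the file, and it sets the CIGAR to `*` as suggested elsewhere in this thread. It does not check the strand flag, so whether the stored right end is the read's 3' end is left to the caller. The function name `trim_sam_line` is made up for illustration.

```python
def trim_sam_line(line, n):
    """Trim n bases/qualities from the stored right end of a SAM record."""
    if n <= 0:
        raise ValueError("n must be a positive number of bases")
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 mandatory SAM fields")
    seq, qual = fields[9], fields[10]
    if seq != "*":
        fields[9] = seq[:-n] or "*"   # SAM uses "*" for a missing sequence
    if qual != "*":
        fields[10] = qual[:-n] or "*"
    fields[5] = "*"  # CIGAR no longer matches the trimmed sequence
    # Optional tags (fields 12+) are kept as-is; tags such as NM or MD
    # may no longer be accurate after trimming.
    return "\t".join(fields)
```

For reverse-strand alignments SEQ is stored reverse-complemented, so a real script would probably look at flag 0x10 before deciding which end to cut.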
Forum: Bioinformatics 05-25-2012, 03:22 AM
Replies: 9
Views: 2,547
Posted By arvid
Right, but then why use a read simulator? To me...

Right, but then why use a read simulator? To me it sounds more feasible to dissect the problems into units that can be more easily solved separately:

1. check k-mer distributions in the genome of...
Forum: Bioinformatics 05-25-2012, 03:09 AM
Replies: 1
Views: 2,076
Posted By arvid
It doesn't make sense to make a GTF file for a...

It doesn't make sense to make a GTF file for a transcript fasta file, like yours. You use a GTF file to describe the transcript annotations for a genome fasta file.

I'm not working on human, so...
Forum: Bioinformatics 05-25-2012, 02:58 AM
Replies: 9
Views: 2,547
Posted By arvid
If you are simulating reads, why...

If you are simulating reads, why without errors? If you are developing an algorithm or heuristic that should deal with real reads, I don't see the point... because they will behave quite...
All times are GMT -8.