SEQanswers

Old 10-24-2012, 07:55 PM   #1
newkid
Junior Member
 
Location: California

Join Date: Oct 2012
Posts: 9
Tutor wanted for remote sessions

Posted this in Bioinformatics too:

Hi everyone,

I'm looking for a tutor who would be willing to sit down for a few one-hour sessions, once or twice a week. The sessions would explain and walk me through a few important concepts in genome assembly (de novo and reference-based). We would work with publicly available data sets in a Linux environment over gtalk/skype.

It would be ideal to have someone with considerable industry experience in bioinformatics and a broad understanding of computational biology.

I would be willing to pay a reasonable rate via PayPal or any other agreed medium! I'm on Pacific Standard Time and am available weekends or late evenings.

A few sample questions--if you can answer these off the top of your head you're in great shape:

1. What is a k-mer value?
2. How big are the files generated by an Illumina HiSeq machine at 30x coverage (for E. coli...), and what are the file formats?
3. How are data sets trimmed? (popular programs to do so)


Cheers!
Old 10-24-2012, 11:04 PM   #2
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102

Hi,

I am from CA as well, and though I am not from industry I have a fair bit of RNA-seq experience. I have done lots and lots of de novo and reference-based assemblies, and after going through plenty of pains and thrills I am very comfortable dealing with any kind of de novo and reference-based work.

Regarding your questions:
1. A k-mer is a seed/word of length k observed more than once in a sequence.

2. Assuming you sequence your library with single-end (SE) reads of 50 bases, you only need about 3 million reads to get 30X coverage of E. coli (genome of roughly 4.6 Mb, so 4.6 Mb x 30 / 50 bases is about 2.8 million reads). So that is about 1/50 of a HiSeq lane.

3. I assume you are asking about ways to trim your reads. If so, there are many ways to do it, and one popular tool is the FASTX-Toolkit.
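
For what it's worth, the read-count arithmetic in point 2 can be written as a small Python sketch. The function name `reads_needed` is made up for illustration, and the 4.6 Mb genome size is an assumed round figure for E. coli, not a measured value.

```python
import math

def reads_needed(genome_size_bp, coverage, read_length_bp):
    """Smallest read count with reads * read_length / genome_size >= coverage."""
    return math.ceil(genome_size_bp * coverage / read_length_bp)

# Assumption: E. coli genome of roughly 4.6 Mb, 50-base single-end reads.
genome_size = 4_600_000
print(reads_needed(genome_size, coverage=30, read_length_bp=50))  # 2760000
```

Paired-end runs would halve the number of fragments needed for the same read length, since each fragment contributes two reads.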

Thanks
Old 10-25-2012, 08:44 AM   #3
Jan_R
Junior Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 8

Here are my 2 cents:

1.
... or in the dataset. If you find the exact same sequence very often in your raw data while checking its quality (e.g. with FastQC), it is most probably an artifact such as sequencing primers or adaptors. But it can also be rRNA. BLAST such sequences to find out. If you do not want them, find ways to remove them.

2.
... Illumina files usually come in the FASTQ format: http://en.wikipedia.org/wiki/FASTQ_format
The size of the files would be some hundred MB.

3.
... I also highly recommend the FASTX toolkit. Combining these tools is perfect for getting your data into shape.
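
To make points 1 and 2 a bit more concrete, here is a rough Python sketch, not a replacement for FastQC. It assumes strict four-line FASTQ records with no blank lines, and the helper names (`read_fastq`, `most_common_sequences`, `approx_fastq_bytes`) and the 40-character header length are assumptions for illustration only.

```python
from collections import Counter
import io

def read_fastq(handle):
    """Yield (name, sequence, quality) from a strict 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual

def most_common_sequences(handle, top=5):
    """Return the most frequently repeated read sequences with their counts."""
    counts = Counter(seq for _, seq, _ in read_fastq(handle))
    return counts.most_common(top)

def approx_fastq_bytes(n_reads, read_len, header_len=40):
    """Rough uncompressed size: header, sequence, '+', quality, plus newlines."""
    return n_reads * (header_len + 2 * read_len + 5)

example = io.StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
    "@r3\nTTTTCCCC\n+\nIIIIIIII\n"
)
print(most_common_sequences(example))  # [('ACGTACGT', 2), ('TTTTCCCC', 1)]
print(approx_fastq_bytes(2_760_000, 50))  # 400200000
```

With those assumed numbers, about 2.76 million 50-base reads come out at roughly 400 MB uncompressed, in line with "some hundred MB". Sequences that dominate the counts are the ones worth BLASTing.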
Old 10-26-2012, 06:49 AM   #4
flyboyleo
Junior Member
 
Location: Sydney, Australia

Join Date: Jun 2012
Posts: 6

3. I thought 'Trim Galore' is better, since it integrates Cutadapt and FastQC to apply quality and adapter trimming to FASTQ files consistently in one step.
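
As a rough idea of what quality and adapter trimming mean, here is a deliberately simplified Python sketch. It is not the algorithm Cutadapt or Trim Galore actually use (those handle partial and error-tolerant adapter matches and use a smarter quality-trimming rule). The function names, the Phred+33 offset, the cutoff of 20 and the adapter string in the example are all assumptions for illustration.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Drop trailing bases whose Phred score (assumed Phred+33) is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]

def clip_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if present."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]

# Low-quality tail ('#' is Phred 2) is removed, high-quality bases ('I') stay.
print(trim_3prime("ACGTACGT", "IIIII###"))  # ('ACGTA', 'IIIII')

# Example adapter string only; check which adapter your library actually uses.
read = "ACGTACGT" + "AGATCGGAAGAGC"
print(clip_adapter(read, "I" * len(read), "AGATCGGAAGAGC"))  # ('ACGTACGT', 'IIIIIIII')
```

For real data the dedicated tools are the better choice, since exact-match clipping misses adapters that are truncated at the read end or contain sequencing errors.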
Old 10-26-2012, 06:12 PM   #5
Jan_R
Junior Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 8

Quote:
Originally Posted by flyboyleo View Post
3. I thought 'Trim Galore' is better for integrated Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files together
Good that you mention cutadapt. You can use both.
I used cutadapt in combination with the FASTX tools. The FASTX Barcode Splitter does not seem very versatile.
Old 10-28-2012, 06:06 PM   #6
newkid
Junior Member
 
Location: California

Join Date: Oct 2012
Posts: 9

Position is still open

PM me, if it's of any interest.
Old 10-30-2012, 07:11 AM   #7
SES
Senior Member
 
Location: Vancouver, BC

Join Date: Mar 2010
Posts: 275

Quote:
Originally Posted by upendra_35 View Post
1. K-mer is the seed/word size of length k observed more than once in a sequence.
Just to be thorough, I'll add that a k-mer does not have to be observed more than once. Any k-mer that is observed exactly once is considered unique and the ratio of these to all other k-mers in a data set is a metric frequently used in comparative genomics.
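
A small Python sketch of the idea, in case it helps: it counts every overlapping k-mer in a sequence and reports how many occur exactly once. The function names are made up for illustration, and the "unique fraction" here is defined as unique k-mers over distinct k-mers, which is only one of several ways such a ratio could be defined.

```python
from collections import Counter

def count_kmers(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def unique_fraction(counts):
    """Fraction of distinct k-mers that are observed exactly once."""
    if not counts:
        return 0.0
    unique = sum(1 for c in counts.values() if c == 1)
    return unique / len(counts)

counts = count_kmers("ACGTACGA", 3)
print(counts["ACG"])            # 2  (ACG occurs twice)
print(unique_fraction(counts))  # 0.8 (4 of the 5 distinct 3-mers occur once)
```

On real read sets, dedicated k-mer counters are the practical choice, since an in-memory Counter will not scale to millions of reads.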
All times are GMT -8.