SEQanswers

Old 11-15-2011, 06:35 AM   #1
ardmore
Member
 
Location: USA

Join Date: Jun 2011
Posts: 51
FASTA sequence: 0-based or 1-based index

When we give a position in a sequence, is it zero-based or one-based?
I ask because I want to extract a subsequence from a long one.
Code:
AGCTTT
012345
OR
Code:
AGCTTT
123456
Thanks.
Old 11-15-2011, 07:05 AM   #2
ffinkernagel
Senior Member
 
Location: Marburg, Germany

Join Date: Oct 2009
Posts: 110

That is undefined if you don't specify which system (or language) you're working with.
Old 11-15-2011, 07:17 AM   #3
ardmore
Member
 
Location: USA

Join Date: Jun 2011
Posts: 51

Okay. Suppose I have the sequence and a position, 63150935, and I want to extract a 1000 kb window around that position in C#.
Then
Code:
string trunk = sequence.Substring(63150935-500000,1000000);
or
Code:
string trunk = sequence.Substring(63150934-500000,1000000);
Which one is correct?

Last edited by ardmore; 11-15-2011 at 07:26 AM.
Old 11-15-2011, 07:39 AM   #4
ffinkernagel
Senior Member
 
Location: Marburg, Germany

Join Date: Oct 2009
Posts: 110

In C#, string indices are 0-based, so the first one would be appropriate if your position is also defined as 0-based.

If it is 1-based (for example, if it comes from Ensembl), you'll need the second one instead.
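
To make the off-by-one concrete, here is a minimal sketch of the same arithmetic, written in Python only for brevity; the conversion is identical for C#'s Substring. The function name window_around and its parameters are hypothetical, and it assumes the whole sequence is held in a single string. Unlike the Substring calls above, it takes a symmetric window of 2 * flank + 1 bases and clamps at the sequence ends.

```python
def window_around(sequence, position, flank, one_based=True):
    """Return (bases within `flank` of `position`, 0-based start of that slice)."""
    # Convert a 1-based coordinate (e.g. from Ensembl) to a 0-based string index.
    center = position - 1 if one_based else position
    if not 0 <= center < len(sequence):
        raise ValueError("position lies outside the sequence")
    # Clamp to the ends so a window near either edge does not overrun.
    start = max(0, center - flank)
    end = min(len(sequence), center + flank + 1)  # slice end is exclusive
    return sequence[start:end], start


seq = "AGCTTT"
print(window_around(seq, 3, 1, one_based=True))   # ('GCT', 1): position 3 is 'C'
print(window_around(seq, 3, 1, one_based=False))  # ('CTT', 2): index 3 is 'T'
```

The same number, 3, picks a different centre base depending on the convention, which is why the origin of the position matters more than the sequence itself.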
Old 11-15-2011, 07:44 AM   #5
ardmore
Member
 
Location: USA

Join Date: Jun 2011
Posts: 51

My question is not about C#. What I mean is that I am not sure whether positions in the sequence are defined as 0-based or not. The sequence comes from a FASTA file or is extracted from a genome.
Old 11-15-2011, 07:52 AM   #6
ffinkernagel
Senior Member
 
Location: Marburg, Germany

Join Date: Oct 2009
Posts: 110

The sequence is not your issue. A sequence itself is not '0 based', it's just a list of characters.
Where does your position 63150935 come from?
Old 11-15-2011, 08:21 AM   #7
ardmore
Member
 
Location: USA

Join Date: Jun 2011
Posts: 51

It is from BAM file output. Suppose we define a region such as chr22:10000-20000 and get the consensus sequence for it, but we are only interested in a small region around a specific position.
How do I extract that?
Old 11-15-2011, 08:37 AM   #8
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838

If it's from a BAM/SAM file, then look at the BAM/SAM specification:

http://samtools.sourceforge.net/SAM1.pdf

For example, the fourth field of SAM files is 1-based:
Quote:
POS: 1-based leftmost mapping POSition of the first matching base. The first base in a reference
sequence has coordinate 1. POS is set as 0 for an unmapped read without coordinate. If POS is
0, no assumptions can be made about RNAME and CIGAR.
whereas the internal BAM representation is 0-based:
Quote:
pos / 0-based leftmost coordinate (= POS − 1) / int32_t / [-1]
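
Applied to the chr22:10000-20000 case, the conversion might look like the sketch below (Python, for brevity). The helper names are hypothetical. It assumes the region string is 1-based and inclusive at both ends, as in samtools-style regions, and that the consensus sequence has exactly one character per reference position in the region (no insertions or deletions).

```python
def parse_region(region):
    """Split 'chrom:start-end' into (chrom, start, end); both ends 1-based, inclusive."""
    chrom, _, span = region.rpartition(":")
    start, end = (int(x.replace(",", "")) for x in span.split("-"))
    return chrom, start, end


def index_in_region(pos_1based, region):
    """0-based string index of a 1-based genomic position within the region's sequence."""
    _, start, end = parse_region(region)
    if not start <= pos_1based <= end:
        raise ValueError("position is outside the region")
    # The first character of the extracted sequence corresponds to `start`.
    return pos_1based - start


def sam_pos_to_bam_pos(pos):
    """SAM POS (1-based, 0 = unmapped) to BAM pos (0-based, -1 = unmapped)."""
    return pos - 1


print(index_in_region(10000, "chr22:10000-20000"))  # 0
print(index_in_region(15000, "chr22:10000-20000"))  # 5000
print(sam_pos_to_bam_pos(1))                        # 0
```

Note that when the sequence was extracted for a region rather than a whole chromosome, the offset to subtract is the region start, not 1.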
Old 11-15-2011, 09:23 AM   #9
ardmore
Member
 
Location: USA

Join Date: Jun 2011
Posts: 51

Thank you.