#1 |
Junior Member
Location: Maritime Canada Join Date: Apr 2012
Posts: 5
|
Hello Everyone!
If someone wouldn't mind helping me along here, I would really appreciate it. I have a bunch of sequences from different species. I've been able to identify a few of them using BLAST, but many are unknown (no strong BLAST matches). They were all amplified with the same primer pair but produced amplicons of different sizes. They don't align unless there are a million (or so it seems) gaps, and some of the sequences are so divergent they don't align at all!

I would eventually like to draw a tree to give insight into where an unidentified sequence belongs. Do all those gaps affect how the tree will be constructed? What do I do about the sequences that I can't align with the others? Is there a book that might help me out with this?

Any advice I can get would be great.
Thanks, Kirsten
|
#2 |
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
|
If you can't align the sequences because they are too different, you shouldn't make a tree out of them.
|
#3 |
Junior Member
Location: Vancouver, BC Join Date: May 2012
Posts: 6
|
To construct a tree you want the sequences to have homology, that is, a common evolutionary origin. A good introduction to bioinformatics and trees can be found at http://helix.biology.mcmaster.ca/courses.html. It's targeted at biology students, so it's more straightforward to understand than most bioinformatics texts.

As to your experiment: a primer pair doesn't only amplify the region you are interested in. It will also amplify any other sequence that happens to match both primers, which can occur by chance (remember the genome is not uniform; some sequences are more common than others). If you amplify a region in many species, in some you may be amplifying one locus and in others a completely different one.

Say the locus you are interested in is ABCDEFGHIJK, where AB and JK are your primer sites: AB cdefghi JK. Some species may instead give you AB q835%9 JK, a sequence completely unrelated in evolutionary terms, and therefore you shouldn't be building a tree to compare them.

Hope that helps.
|
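To make the AB ... JK picture concrete, here is a rough Python sketch. The primer strings, the toy amplicons and both function names are made up for illustration, and the k-mer comparison is only a crude, alignment-free hint; it is not a test for homology and does not replace a proper alignment.

```python
def inner_region(amplicon, fwd_site, rev_site):
    """Return the sequence between the two primer sites, or None if a site is missing.

    Both sites are given as they appear on the same strand as the amplicon.
    """
    start = amplicon.find(fwd_site)
    if start == -1:
        return None
    start += len(fwd_site)
    end = amplicon.find(rev_site, start)
    if end == -1:
        return None
    return amplicon[start:end]


def shared_kmer_fraction(a, b, k=3):
    """Fraction of the distinct k-mers in `a` that also occur in `b`."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    if not kmers_a:
        return 0.0
    return len(kmers_a & kmers_b) / len(kmers_a)


# Hypothetical primer sites and toy amplicons (not real data).
FWD, REV = "ACGTAC", "GGTTCA"
target = FWD + "TTGACCGATTAGC" + REV            # the locus of interest
related = FWD + "TTGACCGTTTAGC" + REV           # same locus, a few changes
off_target = FWD + "CCCCCAAAAAGGGGGCCCCC" + REV  # unrelated region, same primers

ref = inner_region(target, FWD, REV)
for name, seq in [("related", related), ("off_target", off_target)]:
    inner = inner_region(seq, FWD, REV)
    print(name, len(inner), round(shared_kmer_fraction(ref, inner), 2))
```

All three toy amplicons carry the same primer sites, yet only the first two share anything in between. A different amplicon length and near-zero similarity to everything else is a hint (no more than that) that a sequence may come from a different locus.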
#4 |
Member
Location: Raleigh, NC Join Date: Nov 2008
Posts: 51
|
What is the purpose of this work (other than the desire to draw a tree)?
|
#5 |
Member
Location: Spain Join Date: Mar 2010
Posts: 36
|
Try reverse complementing the sequences that don't align with the others (they may have been read from the opposite strand) and see whether they align then.
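If you want to do this without special software, a minimal Python sketch could look like the following. The function name and the example sequences are only illustrative, and it assumes plain DNA letters (A, C, G, T, N), not the full set of ambiguity codes.

```python
# Translation table mapping each base to its complement (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Toy example: flip only the records that refused to align.
sequences = {"seq1": "ATGCCGTA", "seq2": "AAGCTN"}
needs_flipping = {"seq2"}

flipped = {
    name: reverse_complement(seq) if name in needs_flipping else seq
    for name, seq in sequences.items()
}
print(flipped)
```

Biopython's `Seq` objects also provide a `reverse_complement()` method, which handles ambiguity codes as well.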
|
#6 |
Junior Member
Location: Maritime Canada Join Date: Apr 2012
Posts: 5
|
Thanks for the replies so far.
The purpose of my work is to identify species within a mixed sample of unknown composition. The problem is that there is no complete reference database for me to use to identify all of my sequences. I figured a tree was my best bet at assigning some type of taxonomic identity to my unknown sequences, but now I'm seeing that some people use operational taxonomic units (OTUs) for this type of work. I started looking into programs that deal with OTUs, but I am already extremely intimidated by the basic programming skills required to run such programs. I don't know where to begin! Please help!
|
#7 |
Junior Member
Location: Germany Join Date: Aug 2011
Posts: 6
|
You could try a metagenomics program like MEGAN (http://ab.inf.uni-tuebingen.de/software/megan/). In my opinion such programs are easy to get started with: you only have to BLAST your reads against a sufficiently comprehensive database and import the results into the program. But beware that BLASTing a large number of sequences can take a long time. In addition, some output formats need a lot of disk space, so choosing the right one at the start could save you a lot of time.
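Before importing anything, it can help to glance at the BLAST results yourself. The sketch below assumes BLAST+ was run with tabular output (`-outfmt 6`) and its default 12 columns; the function name and the example rows are invented for illustration, and whether that format suits your MEGAN version is something to check in its manual.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text):
    """Return {query id: best hit (as a dict)} using the highest bit score."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < len(COLUMNS) or row[0].startswith("#"):
            continue  # skip blank, short or comment lines
        hit = dict(zip(COLUMNS, row))
        query = hit["qseqid"]
        if query not in best or float(hit["bitscore"]) > float(best[query]["bitscore"]):
            best[query] = hit
    return best


# Made-up example rows; the subject ids are placeholders, not real accessions.
example = (
    "read1\tsubjA\t98.5\t200\t3\t0\t1\t200\t10\t209\t1e-95\t350\n"
    "read1\tsubjB\t91.0\t200\t18\t0\t1\t200\t5\t204\t1e-70\t270\n"
    "read2\tsubjC\t80.2\t150\t28\t2\t1\t150\t40\t188\t1e-20\t95.1\n"
)

for query, hit in best_hits(example).items():
    print(query, hit["sseqid"], hit["pident"], hit["bitscore"])
```

A read with no line in the file at all had no hit, and a best hit with low identity only tells you what the read is most similar to in the database, not what it is.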
|
#8 |
Member
Location: Raleigh, NC Join Date: Nov 2008
Posts: 51
|
Yes, MEGAN is a useful tool for this. When you say you have a bunch of sequences, do you mean 100s, 1000s, or 1000000s? Note that when using MEGAN one should generally interpret the output as "these sequences are most similar to sequences in taxon X", not "these sequences are from taxon X". This is particularly true the nearer to species level you go (MEGAN can make taxonomic assignments at multiple levels).
|