
**detq182** · 04-14-2012, 03:43 AM

Any suggestions?

Please help me with that.

**maubp** · 04-14-2012, 05:10 AM

It sounds like you are asking for help in generating a FASTA file with a useful description line (which you can then turn into a BLAST database). How are you making the FASTA file at the moment?

**detq182** · 04-14-2012, 05:50 AM

Using Perl

I'm using a Perl script, but I can't get the original query (cDNA); instead I'm getting the protein query (blastx). I don't want the protein sequence.

Code:

#!/usr/bin/perl
use Bio::SearchIO;
$report_obj = new Bio::SearchIO(-format => 'blast',
                                -file   => 'C:\blast-2.2.25+\Lib3_consensus_dbUp.xml');
while ( $result = $report_obj->next_result ) {
    while ( $hit = $result->next_hit ) {
        while ( $hsp = $hit->next_hsp ) {
            if ( $hsp->evalue < 0.0001 ) {
                print $result->query_name(), "\t", $hit->description(), "\n",
                      $hsp->seq_str('query'), "\n";
            }
        }
    }
}

How can I put the symbol ">" before the query name?

**detq182** · 04-14-2012, 04:30 PM

Has anyone tried to make a DB with the description + sequence?

**westerman** · 04-16-2012, 06:16 AM

Originally posted by detq182:

Has anyone tried to make a DB with the description + sequence?

Of course. Just not in the way you are doing it. It is the weekend. The question you are posing is simple, yet so specific to how you are approaching it that I do not think anyone wanted to take the time over the weekend to try solving your problem. Especially when you post something like:

How can I put the symbol ">" before the query name?

Ah. Did you even try a

Code:

print '>'

???? People generally help others who show some initiative in solving their own problems.

**detq182** · 04-17-2012, 05:13 PM

I'm sorry if I didn't show initiative in solving my problem. I'm in the middle of finals at college, and I only started working through "Unix and Perl Primer for Biologists" a few days ago. I'm new to this, with just two months of bioinformatics behind me. If the question is stupid, I'm really sorry; I'm just starting.

I hope that we are OK.

**phoss** · 04-18-2012, 06:32 AM

Hi detq182,
Why not delimit your FASTA header with a special character such as a colon or a vertical bar?
For example:
>header | supplemental-info

This way, you can embed many annotations next to your FASTA header.
If I'm not mistaken, EBI-GOA follows the above convention.
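
A small sketch of how a header delimited this way could be taken apart again in plain Perl. The `parse_header` name and the sample header are made up for illustration and are not taken from EBI-GOA or any library.

```perl
use strict;
use warnings;

# Hypothetical helper: split a "|"-delimited FASTA header into its
# identifier and any supplemental annotation fields.
sub parse_header {
    my ($line) = @_;
    $line =~ s/^>//;                  # drop the leading ">"
    my @fields = split /\|/, $line;   # one field per "|"-separated part
    for my $field (@fields) {         # trim whitespace around each field
        $field =~ s/^\s+//;
        $field =~ s/\s+$//;
    }
    return @fields;
}

# Made-up example header, same shape as the one above.
my ($id, @info) = parse_header('>header | supplemental-info');
print "id=$id info=@info\n";
```

The first field is treated as the identifier and everything after it as annotation, which is only one possible convention.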

**detq182** · 04-18-2012, 08:00 AM

Thanks.

I'm going to try that.

**SES** · 04-18-2012, 01:45 PM

This is a great start, but you will need to add a couple of steps if you are trying to add annotations to your original FASTA file of sequences. What I mean is that the HSP string for the query or the hit is not the entire sequence, just the part involved in the match. If you are only interested in the matching part, then just add

Code:

">".

to the beginning of your print string (following the word "print" of course). Spaces outside of the quotes don't matter, but spaces inside the quotes are important. One more thing is that you will want to delimit your header with something other than a tab, as was previously suggested. That is as easy as replacing the "\t" in the print string with "|".
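
To make those two changes concrete, here is a minimal sketch in plain Perl (no BioPerl is needed to run it). The `fasta_record` helper and the sample values are hypothetical, made up for illustration; in the script from the question you would pass `$result->query_name`, `$hit->description` and `$hsp->seq_str('query')` instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build one FASTA record with ">" in front of the
# name and "|" (instead of a tab) between the name and the description.
sub fasta_record {
    my ($name, $description, $seq) = @_;
    return ">" . $name . "|" . $description . "\n" . $seq . "\n";
}

# Made-up sample values standing in for the BLAST report fields.
print fasta_record('contig_1', 'hypothetical protein', 'MKTAYIAKQR');
```

Keep in mind that with the HSP string this still writes only the matched part of the query, not the full original sequence.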

Make my own blast DB
