SEQanswers

03-27-2015, 06:11 AM   #1
hlyates
Member
 
Location: Manhattan, Kansas

Join Date: Mar 2015
Posts: 29
Help getting fastq_filter.py working on command line?

I want to use Galaxy's fastq_filter tool on the command line.

I already know which inputs fastq_filter.py requires, but I'm not sure how to generate two of them.

Reading the Python script and the XML wrapper shows that Galaxy runs a command line something like this:
Code:
fastq_filter.py $input_file $fastq_filter_file $output_file $output_file.files_path '${input_file.extension[len( 'fastq' ):]}'
  • $input_file
  • $fastq_filter_file: I don't know how to make this
  • $output_file
  • $output_file.files_path: I don't know what this is or how to avoid it
  • ${input_file.extension[len( 'fastq' ):]}: Seems to be a check on the input file type? Not going to worry about this for now
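On the last argument: the expression is ordinary Python string slicing. It drops the leading 'fastq' from the datatype name and passes along whatever remains. A minimal sketch, assuming Galaxy datatype names such as fastqsanger (the helper name and the sample values are illustrative, not taken from the tool):

```python
# The slice keeps everything after the "fastq" prefix of the datatype name.
# The extension values below are assumed examples, not taken from the thread.
def fastq_variant(extension):
    """Return the part of a datatype name that follows the 'fastq' prefix."""
    return extension[len('fastq'):]


for ext in ('fastqsanger', 'fastqillumina', 'fastq'):
    print(repr(ext), '->', repr(fastq_variant(ext)))
```

So on the command line this argument would presumably be a short string such as sanger, or an empty string for a plain fastq input.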

The fastq_filter.py is interesting. It contains something like this:
Code:
def fastq_read_pass_filter( fastq_read ):
     def mean( score_list ):
         return float( sum( score_list ) ) / float( len( score_list ) )
     if len( fastq_read ) < $min_size:
         return False
     if $max_size > 0 and len( fastq_read ) > $max_size:
         return False
     num_deviates = $max_num_deviants
     qual_scores = fastq_read.get_decimal_quality_scores()
     for qual_score in qual_scores:
         if qual_score < $min_quality or ( $max_quality > 0 and qual_score > $max_quality ):
             if num_deviates == 0:
                 return False
             else:
                 num_deviates -= 1
 #if not $paired_end:
     qual_scores_split = [ qual_scores ]
 #else:
     qual_scores_split = [ qual_scores[ 0:int( len( qual_scores ) / 2 ) ], qual_scores[ int( len( qual_scores ) / 2 ): ] ]
 #end if
 #for $fastq_filter in $fastq_filters:
     for split_scores in qual_scores_split:
         left_column_offset = $fastq_filter[ 'offset_type' ][ 'left_column_offset' ]
         right_column_offset = $fastq_filter[ 'offset_type' ][ 'right_column_offset' ]
 #if $fastq_filter[ 'offset_type' ]['base_offset_type'] == 'offsets_percent':
         left_column_offset = int( round( float( left_column_offset ) / 100.0 * float( len( split_scores ) ) ) )
         right_column_offset = int( round( float( right_column_offset ) / 100.0 * float( len( split_scores ) ) ) )
 #end if
         if right_column_offset > 0:
             split_scores = split_scores[ left_column_offset:-right_column_offset]
         else:
             split_scores = split_scores[ left_column_offset:]
         if split_scores: ##if a read doesn't have enough columns, it passes by default
             if not ( ${fastq_filter[ 'score_operation' ]}( split_scores ) $fastq_filter[ 'score_comparison' ] $fastq_filter[ 'score' ]  ):
                 return False
 #end for
     return True
Is that Python? Is this how the XML turns user input into a filter script? Someone suggested I use the Galaxy API for this, but setting that up might be just as much work as getting this script to run. I'm not opposed to it, but I want to take the easy way out, because I think this is the last Galaxy tool I have to run in my analysis before I move on to other things.

Any help and assistance would be appreciated.

Last edited by hlyates; 03-27-2015 at 06:12 AM. Reason: Added tags
03-29-2015, 09:18 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542

The development repository is here:
https://github.com/galaxyproject/too...s/fastq_filter

Correction: the code you quoted comes from the <configfile> snippet in the XML; it is written in Cheetah, a Python-like templating language.
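Since the template expands to plain Python once the $variables and #directives are filled in, one way to produce a $fastq_filter_file by hand may be to write out the substituted function yourself. The sketch below is a hand expansion of the template quoted in post #1 under assumed settings (min_size=20, max_size=0, max_num_deviants=0, min_quality=0, max_quality=0, single-end reads, one filter requiring mean >= 25 with zero offsets). The StubRead class is a hypothetical stand-in for Galaxy's read object, included only so the function can be tried out; how fastq_filter.py actually loads the file is not shown in this thread.

```python
# Hand-expanded version of the Cheetah template, under ASSUMED settings:
# min_size=20, max_size=0 (no limit), max_num_deviants=0, min_quality=0,
# max_quality=0 (no limit), paired_end false, one filter: mean(...) >= 25.
def fastq_read_pass_filter(fastq_read):
    def mean(score_list):
        return float(sum(score_list)) / float(len(score_list))
    if len(fastq_read) < 20:                      # $min_size
        return False
    if 0 > 0 and len(fastq_read) > 0:             # $max_size (0 disables it)
        return False
    num_deviates = 0                              # $max_num_deviants
    qual_scores = fastq_read.get_decimal_quality_scores()
    for qual_score in qual_scores:
        # $min_quality and $max_quality, both 0 here
        if qual_score < 0 or (0 > 0 and qual_score > 0):
            if num_deviates == 0:
                return False
            else:
                num_deviates -= 1
    qual_scores_split = [qual_scores]             # "#if not $paired_end" branch
    for split_scores in qual_scores_split:        # one pass per $fastq_filter
        left_column_offset = 0
        right_column_offset = 0
        if right_column_offset > 0:
            split_scores = split_scores[left_column_offset:-right_column_offset]
        else:
            split_scores = split_scores[left_column_offset:]
        if split_scores:  # a read with too few columns passes by default
            if not (mean(split_scores) >= 25):
                return False
    return True


class StubRead:
    """Hypothetical stand-in for Galaxy's read object, for testing only."""
    def __init__(self, scores):
        self.scores = list(scores)

    def __len__(self):
        return len(self.scores)

    def get_decimal_quality_scores(self):
        return self.scores


print(fastq_read_pass_filter(StubRead([30] * 25)))
```

With these assumed settings, a 25-base read whose scores are all 30 passes, while a read shorter than 20 bases or one with a mean quality below 25 is rejected.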

Last edited by maubp; 03-29-2015 at 09:20 AM. Reason: correction

Tags
command line, command line tool, fastq, galaxy, scripts
