**ffinkernagel** · 09-28-2010, 07:21 AM

Sure, any scripting language (such as Python) will do this, especially since neither the HindIII nor the EcoRI recognition site contains any ambiguous bases.
The only problem might be that your files are in ABI format, so you either need to convert them to something the script can read (FASTA) or find an adaptor. There is one for Python: http://www.bioinformatics.org/groups/?group_id=497

Here's a simple script off the top of my head that will do it for a folder of FASTA files...

Code:

#!/usr/bin/python
import os

directory = 'path/to/data'
target_directory = 'path/to/output'
for filename in os.listdir(directory):
    # read the FASTA file in text mode (one record per file is assumed)
    with open(os.path.join(directory, filename), 'r') as op:
        fasta = op.read().splitlines()
    name = fasta[0][1:]  # cut off >
    sequence = "".join(fasta[1:]).upper()  # transform into one line of uppercase bases...
    hindIIIpos = sequence.find("AAGCTT")  # search for first HindIII site
    if hindIIIpos == -1:
        raise ValueError("%s did not contain a HindIII site" % filename)
    ecoRIpos = sequence.rfind("GAATTC")  # search for last EcoRI site
    if ecoRIpos == -1:
        raise ValueError("%s did not contain an EcoRI site" % filename)
    # compensate for actual cutting position (A^AGCTT and G^AATTC)
    cut = sequence[hindIIIpos + 1: ecoRIpos + 1]
    with open(os.path.join(target_directory, filename), 'w') as op:
        op.write('>%s\n%s\n' % (name, cut))
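
The slicing step in the script above can also be pulled out into a small function, which makes it easy to try on a short test sequence before running it on a whole folder. This is only a sketch: the name `trim_between_sites` and its parameters are made up for illustration, and the extra check for a HindIII site lying downstream of the last EcoRI site is an addition that the script itself does not make.

```python
def trim_between_sites(sequence, left_site="AAGCTT", right_site="GAATTC",
                       left_offset=1, right_offset=1):
    """Return the part of `sequence` between the first left_site cut
    and the last right_site cut.

    Hypothetical helper, not part of the original script. The offsets
    are the cut positions inside each recognition site: HindIII cuts
    A^AGCTT and EcoRI cuts G^AATTC, so both default to 1.
    """
    sequence = sequence.upper()
    left = sequence.find(left_site)        # first HindIII site
    if left == -1:
        raise ValueError("no %s site found" % left_site)
    right = sequence.rfind(right_site)     # last EcoRI site
    if right == -1:
        raise ValueError("no %s site found" % right_site)
    start = left + left_offset
    end = right + right_offset
    if start >= end:
        # Assumption: report this case instead of writing an empty record.
        raise ValueError("%s site lies downstream of the last %s site"
                         % (left_site, right_site))
    return sequence[start:end]


if __name__ == "__main__":
    # Short made-up test sequence with one site of each kind.
    print(trim_between_sites("ccAAGCTTacgtGAATTCgg"))
```

For the test sequence above this prints `AGCTTACGTG`, i.e. everything from just after the HindIII cut up to and including the G before the EcoRI cut.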

**raymoniz** · 09-28-2010, 07:38 AM

Thanks for the help. I should have also disclosed that I am not versed in any computer language - but am fortunate to have some friends who are. I will take this to them. Thank you again.

**sklages** · 09-29-2010, 05:37 AM

You might want to download 'lucy' from the AMOS file area at SourceForge (AMOS is a collection of tools for genome assembly):

http://sourceforge.net/projects/amos/files/

or, if you prefer a GUI, Lucy2 from the ISU Complex Computation Lab:

http://www.complex.iastate.edu/download/Lucy2/index.html

cheers,
Sven

**raymoniz** · 09-29-2010, 11:16 AM

Thanks very much Sven, I will give those a try.

**Richard Finney** · 09-29-2010, 12:12 PM

"I am not versed in any computer language"
Doing nextgen without knowing Perl/bash/c/java/sed/awk and/or python is like backpacking through South America without knowing any Spanish. You're liable to wind up lost and sick. You need to make the time to learn some basic text stream manipulation.

**sklages** · 09-29-2010, 12:28 PM

Well, the OP's question was a bit off-topic, as he still uses PGS data (previous generation sequencing ;-) ), some 1000 ABI traces ... lucy2 should do the job. If not, I totally agree: without basic knowledge of e.g. Perl it gets pretty hard ...

Sven


batch editing of ABI files
